[vlc-devel] The big one : Frame threading regressions

Thomas Guillem thomas at gllm.fr
Mon Sep 16 10:02:54 CEST 2019



On Sun, Sep 15, 2019, at 21:17, Francois Cartegnie wrote:
> Hi,
> 
> We have had lots of users complaining about regressions in VLC for a
> long time, and usually we can't pinpoint the regression and end up
> blaming hardware decoding.
> 
> We already know low frame rate is an issue in VLC. But it got worse.
> 
> The unpleasant truth is that my shiny new 6-core, 12-thread system is
> unable to display raw 30 fps hevc video in software, while my old 2-core
> one could. All pictures are *late* by a fixed amount.
> 
> Let's start by describing a few things.
> -----------------------------
> 
> The Core:
> - The picture display date is computed by adding the buffering delay,
> the pts delay and the pcr delay to the timestamp converted to system
> time.
> - If that date is earlier than the current system time, the picture is
> considered late and is not displayed.
> - The pcr delay is "extended" by detecting the delay on first timestamp
> conversion (which is decoder output).
> - We can only extend delays. (otherwise we need to implement temporary
> rate change)
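
To make the rules above concrete, here is a minimal Python sketch of the date computation and the late check. The names (`display_date`, `is_late`, `PcrDelay`) and the microsecond integer timestamps are assumptions for illustration, not the actual core API.

```python
def display_date(system_ts_us, buffering_delay_us, pts_delay_us, pcr_delay_us):
    """Display date = timestamp converted to system time + the three delays."""
    return system_ts_us + buffering_delay_us + pts_delay_us + pcr_delay_us


def is_late(date_us, now_us):
    """A picture is late when its display date is already in the past."""
    return date_us < now_us


class PcrDelay:
    """Hypothetical holder for the pcr delay: it can only be extended."""

    def __init__(self, initial_us=0):
        self.value_us = initial_us

    def extend(self, observed_us):
        # Never shrink: reducing the delay would need a temporary rate change.
        self.value_us = max(self.value_us, observed_us)
        return self.value_us
```

The `max()` in `extend` is the "we can only extend delays" rule: a smaller observation is simply ignored.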
> 
> The Packetizer delay:
> - There's an unavoidable delay when we need to wait for the next picture
> to know the limit of the current access unit.
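
As an illustration of why that delay is unavoidable, here is a small Python sketch of a packetizer that can only emit an access unit once it has seen the start of the next one. The `(starts_new_au, payload)` input shape is a simplifying assumption, not how any real packetizer represents its input.

```python
def packetize(units):
    """Group (starts_new_au, payload) units into access units.

    An access unit is only emitted once the first unit of the *next* one
    is seen, which is the one-picture delay described above.
    """
    current = []
    for starts_new_au, payload in units:
        if starts_new_au and current:
            # The next access unit begins: the previous one is now complete.
            yield current
            current = []
        current.append(payload)
    if current:
        # End of stream: flush the last access unit.
        yield current
```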
> 
> The Decoder delay:
> - We usually have a push/pull model. Today it is asynchronous.
> - When we pull, it is always triggered by the next incoming block.
> - The next incoming block might also be paced by the clock, or the
> stream itself
> 
> All those delays were compensated by the >= 300ms delay we set as
> pts-delay (which is also the buffering) and by the pcr delay extension
> done by the core.
> 
> 
> In my 30fps hevc case
> --------------------
> 
> The number of avcodec threads creates a global frame output delay which
> depends directly on the number of threads (and of course on the GOP
> references between the frames). With 10 threads, the 25fps video is now
> unable to play back, the total output delay being > 300ms.
> Why does the pcr delay extension fail? The first picture (an I-frame)
> is output faster (no refs), so the delay detected on it is too small.
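
A rough back-of-the-envelope sketch of that delay in Python, assuming the usual frame threading model where each thread beyond the first adds one frame of delay. This ignores the GOP reference structure and decode time, and `frame_threading_delay_ms` is a made-up helper name.

```python
def frame_threading_delay_ms(threads, fps):
    """Assumed model: each thread beyond the first adds one frame of delay."""
    return (threads - 1) * 1000 / fps


PTS_DELAY_MS = 300  # the default delay budget discussed in this thread

for threads in (2, 10, 12):
    delay = frame_threading_delay_ms(threads, 25)
    print(f"{threads} threads @ 25 fps: {delay:.0f} ms, "
          f"exceeds budget: {delay > PTS_DELAY_MS}")
```

Under this model, 10 threads at 25 fps means 9 frames of 40 ms each, i.e. 360 ms, which is already above the 300 ms budget.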
> 
> 
> Why was it working before ?
> -------------------------
> 
> In 2.x and 3.0 we gradually introduced changes:
> 
> * We changed the synchronous avcodec decoder push/pullwait model to an
> asynchronous push/pull.
> Potentially we increased the delay when the source is paced. This is
> also the case depending on fps and PTS<->PCR delay.
> If you have an audio stream, you also have a race with the first
> converted timestamp, which then is guaranteed to set up a lowest, too
> small, extended delay.
> 
> * We enabled threading in avcodec for H264: Frame Threading, which
> creates delay (this was documented !).
> Slice threading is also available and has less delay overhead but we did
> not enable it because this is not hardware decoding friendly.
> 
> 
> Low delay considerations (specific case)
> -----------------------
> The opposite case of the described problem is when you try to do low
> delay. (Your GOP is usually intra in that case).
> Your first picture will always output later than every other picture,
> because of the decoder startup time.
> 
> 
> So what ?
> ---------
> 

We should always use the maximum number of threads available, but lower it if the user wants low-delay.

> There are a few ways I can think of to fix the main issue:
> - Have a way to report decoder delay... but that would mean no playback
> until data decodes (what if there is only bogus data? multiple decs?).
> - Bump default pts-delay for now.
> - Adapt pts-delay based on number of threads.
> - Rewrite the core to be able to add delay without rebuffering. I don't
> see how that's doable: that's similar to implementing delay reduction.

That might be doable with the new output clock, but
 - The audio will have control over the delay. When the first audio block is rendered, the play date is in the future and corresponds to the output delay. Once this first audio block is played, this date can't change anymore. So if the video output wants to change it in the meantime, it won't have any effect.

So, if the video wants to impact the output clock delay, it must do so before the first audio block is played, but that's impossible to guarantee since audio and video have their own threads. That's a small design flaw; I think it can be easily fixed though.
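
To illustrate the ordering problem, here is a toy, single-threaded Python model. `OutputClock`, `request_delay` and `on_first_audio_played` are hypothetical names, not the new output clock's real interface; in the real player the two calls come from different threads, which is exactly why their order can't be guaranteed.

```python
class OutputClock:
    """Hypothetical output clock whose delay is frozen by the first audio block played."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.locked = False

    def request_delay(self, delay_ms):
        """Try to extend the delay; returns True if the request took effect."""
        if self.locked or delay_ms <= self.delay_ms:
            return False
        self.delay_ms = delay_ms
        return True

    def on_first_audio_played(self):
        # From here on the play date of the audio is fixed.
        self.locked = True


clock = OutputClock(300)
early = clock.request_delay(400)   # video asks before audio starts
clock.on_first_audio_played()
late = clock.request_delay(500)    # video asks after audio started
print(early, late, clock.delay_ms)
```

The second request is silently dropped, which is the "it won't have any effect" case.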

> - Kill frame threading based on a number of threads and fps.
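
For the "adapt pts-delay" and "kill frame threading" options, a possible shape in Python. This again assumes one frame of delay per thread beyond the first; the 300 ms default comes from this thread, while `margin_ms` and all function names are made up for the example.

```python
def frame_delay_ms(threads, fps):
    """Assumed model: each thread beyond the first adds one frame of delay."""
    return (threads - 1) * 1000 / fps


def adapted_pts_delay_ms(threads, fps, default_ms=300, margin_ms=100):
    """Option "adapt pts-delay": grow the delay so it covers the decoder."""
    return max(default_ms, frame_delay_ms(threads, fps) + margin_ms)


def max_frame_threads(fps, available, budget_ms=300):
    """Option "kill frame threading": largest thread count under the budget."""
    threads = available
    while threads > 1 and frame_delay_ms(threads, fps) >= budget_ms:
        threads -= 1
    return threads


print(adapted_pts_delay_ms(10, 25))
print(max_frame_threads(25, 12))
```

The first option keeps all threads busy at the cost of a longer startup delay; the second keeps the delay fixed and gives up some decoding parallelism.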
> 
> That's not fun at all.
> 
> François

