[vlc-devel] [PATCH] input/stream: do not unconditionally invalidate block on seek

Mon Oct 24 13:21:59 CEST 2016

Hi Rémi,

I honestly feel like you are misunderstanding what I am trying to
convay, if that is me being vague in my explanation or if we are just
talking different languages - I do not know.

Given that we have a transparent proxy going from block-based reading
to what most people associate with "stream", it is very unfortunate
that block-based accessors suffer a penalty because of that.

The reading on the "output" end should not have to worry about the
resource which it is reading data from, and should certainly /not/
care whether the input-side is a *block-* or *stream-based* stream.

In my opinion it does not make any sense for us to copy the same data
over and over, no matter if one issues a seek or does a proper read
(skip) into a `NULL`-buffer.

> Le maanantaina 24. lokakuuta 2016, 10.38.03 EEST Filip Roséen a écrit :
> > Hi Remi
> > 
> > On 2016-10-24 11:33, Rémi Denis-Courmont wrote:
> > > Le maanantaina 24. lokakuuta 2016, 2.13.11 EEST Filip Roséen a écrit :
> > > > If a seek request happens to refer to a position within the already
> > > > read block, simply update the block so that it refers to the desired
> > > > data.
> > > > 
> > > > There is really no need for us to discard the block, only to ask the
> > > > underlying stream to give us a new one (when we have the data we want
> > > > already).
> > > 
> > > I don´t see the point. I have spent a lot of time untangling the caching
> > > from the generic byte stream reader, and this is just starting to bring
> > > it back in and duplicate functionality.
> > 
> > I agree that we should keep the implementation simple and consise, and
> > not duplicate functionality; but having the kind of support proposed
> > is just a extremely minor subset of the functionality within other
> > parts of the code..
> 
> Any reasonable cache implementation will optimize a short forward skip or 
> read. That is the whole point of read-ahead caching. So this is completely 
> pointless complexity.

This is besides the point.

> > and I think it make sense not to re-read data
> > (just as we have the functionality in place for peeking).
> 
> Yes. That´s what VLC and operating systems have caches for.

VLC has a conditional cache, and when these cache modules not apply,
performance suck; that's it.

And on that note, the current cache mechanisms are applied too early
in the chain - given that *stream_filter* (applied after the
*accessor* is created) cannot take advantage of the cache-modules,
reading from a block-based *stream_filter* suffers from the "there is
no cache"-flaw, no matter what.

> > One could even merge the peeking and block "cache" in
> > `src/input/stream.c`, making the implementation even less complicated
> > than it is now.
> 
> Support for peek and block is already in-there. Nothing else needs to be 
> merged, and anything more will obviously make the code more and needlessly 
> complicated.

Support for peeking exists, the support of block in this regard is
certainly not as good as it could be (and given that you just said
that such complexity is not supposed to be there, you inherently say
that support for not copying the same data over and over is not there).

> > > If you need to skip forward, call vlc_stream_Read(..., NULL, ...) in the
> > > demuxer or wherever else needed. That will work even if the stream is not
> > > readable.
> > 
> > I guess we should have a major clean-up session, because we have
> > demuxers (such as `modules/demux/mp4` and `modules/demux/avi`), and
> > other entities, that does a lot of seeking back and forth, and for
> > block-based accessors this is killing performance (really bad).
> 
> This is FUD. Bringing cache_block back instead of prefetch will not improve 
> uninterleaved file playback. cache_block has no support for multiple tracks, 
> and never had. It is much worse than prefetch for that purpose.
> 
> Besides, reading uninterleaved AVI through a network has always sucked in VLC. 
> Even if you remove the prefetch and new HTTP plugins, cache_stream's support 
> for multiple tracks will only fix it partially (and it will break what 
> prefetch fixed such as high-bandwidth non-seeking streams). We know that from 
> old user feedback.
> 
> There is only one definite way to fix playback of uninterleaved tracks: set up 
> one byte stream per track (at least per active track). Using multiple tracks 
> with a single stream is mathematically bound to eventually fail:
> 
> Take:
> - T the number of active tracks or read offsets,
> - N the available network bandwidth,
> - S the stream bitrate, and
> - L the seek latency.
> 
> You need to accumulate buffers of T * S  * L and then some, only to compensate 
> for seeking. You can build up that buffer at best at a rate of: N - S.
> 
> So you need to buffer for a duration of at least: T * S * L / (N - S)
> And allocate corresponding buffer space of: T * S² * L / (N - S)
> 
> Note that this goes out-of-hand if N nears S, or if S is large. And that 
> assumes you can even estimate those values.
> 
> As for a pure random access formatted file, you have to play from a random 
> access media. It is unfixable at software level.
> 

I am well aware of the above, but I honestly feel like bringing up
such explanation of how things work is drawing attention away from
the real issues:

 - We are copying data over and over if you happen to seek within what
   is the same block for a *block-based* stream.

 - Given that the *reader-side* cannot know the size of the block
   coming from the *input-side* (unless it is reading blocks):

   - How can it know that a *read-skip* is better than seek (given
     that the block size can be arbitrary)?

   - Should we force implementations not to issue seek-requests at
     all, and just rely on skipping?

   - Why not make the skipping and seeking be based on the same
     implementation? It is semantically the same thing.

This is not a matter of adding complexity, and as I already said
rewriting the relevant part of `src/input/stream.c` so that the
functionality is shared would even *simplify* the implementation (not
the opposite).

I do not know what else to add, I am honestly quite baffled about the
negativity received; 

 - what do you propose to fix the problem, or;
 - are you saying that there is no problem to be fixed?

Best Regards,\
Filip

> > Sure, they can be fixed to not discard already read data, but I find
> > that having simple management of blocks within `src/input/stream.c` is
> > not too complex.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.videolan.org/pipermail/vlc-devel/attachments/20161024/37d2287a/attachment.html>