<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

  <meta http-equiv="Content-Style-Type" content="text/css" />

  <meta name="generator" content="pandoc" />

  <title></title>

  <style type="text/css">code{white-space: pre;}</style>

</head>

<body>

<p>Hi Rémi,</p>

<p>On 2016-10-24 16:02, remi@remlab.net wrote:</p>

<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;color:#500050">

<pre><code> Hello,

 I don't follow. The "penalty" of block-based access to stream-based

 demux is a memory copy. It is intrinsic.

 </code></pre>

<p>I am not talking about the <em>memory copy</em> taking place in order to copy data from the generated block into the destination buffer referred to by an invocation of <code>vlc_stream_Read</code> (and related functions), that is (as you said) something intrinsic to the mechanisms in play.</p>

</blockquote>

<p>I am talking about the <em>memory copy/write</em> that inherently must take place in order for a block-based content <em>“generator”</em> to create a <code>block_t</code> that is then living inside <code>src/input/stream.c</code>.</p>

<pre><code>[ block access ] -> [ block cache ] -> [core] -> [ stream reader ]

        ^- A                ^- B                       ^- C

/**

 * A + B is somewhat redundant to the example, we could have left

 * out B and have the same behavior, but it is added to better

 * reflect how the previous discussion.

 *

 * One should however keep in mind that there might be any number

 * of stream_filters between "[core]" and "[stream reader]", and

 * if these are block-based things get worse.

 **/</code></pre>

<p>In the above <code>B</code> is simply a cache that will cache all the blocks that are generated by <code>A</code>, the <code>[core]</code> is the implementation within <code>src/input/stream.c</code>, and <code>C</code> is something reading using <code>vlc_stream_Read</code>.</p>

<p>Imagine we are att offset zero, and <code>B</code> generates blocks <code>1 MB</code> in size, a block having <code>p_block->i_buffer = 1024*1024</code> would end up in <code>[core</code>].</p>

<p>If <code>C</code> wants to read <code>256</code> bytes at offset zero, and then <code>256 bytes</code> at offset <code>524 288</code> it would (probably) do something as the below (even though it could use skipping, but it has no way of knowing that <code>B</code> has <code>N</code> as “chunk size”):</p>

<pre><code> vlc_stream_Read( src, buf_1, 256 ); /* (1) */

 vlc_stream_Seek( src, 524288 );     /* (2) */

 vlc_stream_Read( src, buf_2, 256 ); /* (3) */</code></pre>

<p>Given that we currently, unconditionally, invalidate read blocks in <code>[core]</code> during <code>vlc_stream_Seek</code>, at <code>(3)</code> we would have to ask <code>B</code> to create a new block which the <code>[core]</code> transparently will use to feed <code>C</code>.</p>

<p>In other words, <code>B</code> will “regenerate” contents that was already available at <code>(1)</code>, and for <em>accessors/stream_filters</em> that can only fetch chunks of <code>N</code> bytes of data upstream, it would have to fetch <code>N</code> bytes, even though reading stream is only interested in a smaller subset.</p>

<p>Now imagine that <code>C</code> is seeking back and forth within a single block;</p>

<ul>

<li>why not have the block live inside <code>[core]</code> until it is definitely exhausted, instead of asking <code>B</code> (or <code>A</code>) to generate new contents only to discard it?</li>

</ul>

<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;color:#500050">

<pre><code> Furthermore, the micro-optimization in this patch does not remove or

 add any memory copy as such.</code></pre>

</blockquote>

<p>The <em>“micro-optimization”</em> gets rid of asking the source, <em>block-based</em>, stream to give data that was already available prior to <code>vlc_stream_Seek</code>.</p>

<p>I could provide a <em>testcase</em> and benchmark if this is wanted/requested.</p>

<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;color:#500050">

<pre><code> Le 24 oct. 2016 2:21 PM, Filip Roséen <filip@atch.se> a écrit :

   Hi Rémi,

   I honestly feel like you are misunderstanding what I am trying to

   convay, if that is me being vague in my explanation or if we are

   just talking different languages - I do not know.

   Given that we have a transparent proxy going from block-based

   reading to what most people associate with “stream”, it is very

   unfortunate that block-based accessors suffer a penalty because of

   that.

   The reading on the “output” end should not have to worry about the

   resource which it is reading data from, and should certainly /not/

   care whether the input-side is a _block-_ or _stream-based_ stream.

   In my opinion it does not make any sense for us to copy the same

   data over and over, no matter if one issues a seek or does a proper

   read (skip) into a NULL-buffer.

          Le maanantaina 24. lokakuuta 2016, 10.38.03 EEST Filip Roséen a écrit :

            Hi Remi

            On 2016-10-24 11:33, Rémi Denis-Courmont wrote:

              Le maanantaina 24. lokakuuta 2016, 2.13.11 EEST Filip Roséen a écrit :

                If a seek request happens to refer to a position within the already

                read block, simply update the block so that it refers to the desired

                data.

                There is really no need for us to discard the block, only to ask the

                underlying stream to give us a new one (when we have the data we want

                already).

              I don´t see the point. I have spent a lot of time untangling the caching

              from the generic byte stream reader, and this is just starting to bring

              it back in and duplicate functionality.

            I agree that we should keep the implementation simple and consise, and

            not duplicate functionality; but having the kind of support proposed

            is just a extremely minor subset of the functionality within other

            parts of the code..

          Any reasonable cache implementation will optimize a short forward skip or 

          read. That is the whole point of read-ahead caching. So this is completely 

          pointless complexity.

   This is besides the point.

            and I think it make sense not to re-read data

            (just as we have the functionality in place for peeking).

          Yes. That´s what VLC and operating systems have caches for.

   VLC has a conditional cache, and when these cache modules not apply,

   performance suck; that’s it.

   And on that note, the current cache mechanisms are applied too early

   in the chain - given that _stream_filter_ (applied after the

   _accessor_ is created) cannot take advantage of the cache-modules,

   reading from a block-based _stream_filter_ suffers from the “there

   is no cache”-flaw, no matter what.

            One could even merge the peeking and block "cache" in

            `src/input/stream.c`, making the implementation even less complicated

            than it is now.

          Support for peek and block is already in-there. Nothing else needs to be 

          merged, and anything more will obviously make the code more and needlessly 

          complicated.

   Support for peeking exists, the support of block in this regard is

   certainly not as good as it could be (and given that you just said

   that such complexity is not supposed to be there, you inherently say

   that support for not copying the same data over and over is not

   there).

              If you need to skip forward, call vlc_stream_Read(..., NULL, ...) in the

              demuxer or wherever else needed. That will work even if the stream is not

              readable.

            I guess we should have a major clean-up session, because we have

            demuxers (such as `modules/demux/mp4` and `modules/demux/avi`), and

            other entities, that does a lot of seeking back and forth, and for

            block-based accessors this is killing performance (really bad).

          This is FUD. Bringing cache_block back instead of prefetch will not improve 

          uninterleaved file playback. cache_block has no support for multiple tracks, 

          and never had. It is much worse than prefetch for that purpose.

          Besides, reading uninterleaved AVI through a network has always sucked in VLC. 

          Even if you remove the prefetch and new HTTP plugins, cache_stream's support 

          for multiple tracks will only fix it partially (and it will break what 

          prefetch fixed such as high-bandwidth non-seeking streams). We know that from 

          old user feedback.

          There is only one definite way to fix playback of uninterleaved tracks: set up 

          one byte stream per track (at least per active track). Using multiple tracks 

          with a single stream is mathematically bound to eventually fail:

          Take:

          - T the number of active tracks or read offsets,

          - N the available network bandwidth,

          - S the stream bitrate, and

          - L the seek latency.

          You need to accumulate buffers of T * S  * L and then some, only to compensate 

          for seeking. You can build up that buffer at best at a rate of: N - S.

          So you need to buffer for a duration of at least: T * S * L / (N - S)

          And allocate corresponding buffer space of: T * S² * L / (N - S)

          Note that this goes out-of-hand if N nears S, or if S is large. And that 

          assumes you can even estimate those values.

          As for a pure random access formatted file, you have to play from a random 

          access media. It is unfixable at software level.

   I am well aware of the above, but I honestly feel like bringing up

   such explanation of how things work is drawing attention away from

   the real issues:

   -   We are copying data over and over if you happen to seek within

       what is the same block for a _block-based_ stream.

   -   Given that the _reader-side_ cannot know the size of the block

       coming from the _input-side_ (unless it is reading blocks):

   -   How can it know that a _read-skip_ is better than seek (given

       that the block size can be arbitrary)?

   -   Should we force implementations not to issue seek-requests at

       all, and just rely on skipping?

   -   Why not make the skipping and seeking be based on the same

       implementation? It is semantically the same thing.

   This is not a matter of adding complexity, and as I already said

   rewriting the relevant part of src/input/stream.c so that the

   functionality is shared would even _simplify_ the implementation

   (not the opposite).

   I do not know what else to add, I am honestly quite baffled about

   the negativity received;

   -   what do you propose to fix the problem, or;

   -   are you saying that there is no problem to be fixed?

   Best Regards,

   Filip

            Sure, they can be fixed to not discard already read data, but I find

            that having simple management of blocks within `src/input/stream.c` is

            not too complex.</code></pre>

</blockquote>

<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;color:#500050">

<pre><code> _______________________________________________

 vlc-devel mailing list

 To unsubscribe or modify your subscription options:

 https://mailman.videolan.org/listinfo/vlc-devel</code></pre>

</blockquote>

</body>

</html>