[x264-devel] [PATCH] Optionally increment IDR_PIC_ID for IDR frames from a value specified in CLI

Chao Chen chaoc at netflix.com
Wed Jun 10 23:51:20 CEST 2020


Hi, Henrik
Thanks for reviewing my patch and helping improve it.

I looked into your patch and thought about your idea carefully. However, I
think that using only 0 and 1 for the rest of the encoded stream won't
address our use cases. In some of them we need to encode a series of IDR
frames, and clients may splice the encoded streams at those IDRs. For
example, suppose we encode a stream like
I(4)-P-P-I(0)-I(1)-P-P-I(0)-P-P, where the number in parentheses is the
IDR_PIC_ID. If a client splices the 4th IDR frame and the frames that
follow it directly after the 2nd IDR frame, the result is a stream
like I(4)-P-P-I(0)-I(0)-P-P, which has an IDR_PIC_ID collision.
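
To make the splicing case above concrete, here is a minimal sketch in Python.
It is only an illustration, not code from x264 or from the patch: the helper
names (`parse`, `splice`, `collisions`) are hypothetical, and it assumes a
collision means two directly adjacent IDR frames carrying the same IDR_PIC_ID,
as in the example.

```python
def parse(stream):
    """Parse a string such as 'I(4)-P-P-I(0)' into (type, idr_pic_id) pairs."""
    frames = []
    for tok in stream.split("-"):
        if tok.startswith("I("):
            frames.append(("I", int(tok[2:-1])))
        else:
            frames.append((tok, None))
    return frames


def splice(frames, keep_through_idr, resume_at_idr):
    """Keep frames up to and including the given IDR (1-based count of IDRs),
    then append everything from another IDR (also 1-based) to the end."""
    idr = [i for i, f in enumerate(frames) if f[0] == "I"]
    return frames[: idr[keep_through_idr - 1] + 1] + frames[idr[resume_at_idr - 1]:]


def collisions(frames):
    """Indices i where frames i and i+1 are both IDR with equal IDR_PIC_ID."""
    return [
        i
        for i in range(len(frames) - 1)
        if frames[i][0] == "I"
        and frames[i + 1][0] == "I"
        and frames[i][1] == frames[i + 1][1]
    ]


# Initial value 4, then alternating 0/1 (the scheme discussed above).
alternating = parse("I(4)-P-P-I(0)-I(1)-P-P-I(0)-P-P")
# Initial value 4, then incrementing on every IDR frame.
incrementing = parse("I(4)-P-P-I(5)-I(6)-P-P-I(7)-P-P")

print(collisions(splice(alternating, 2, 4)))   # [3]
print(collisions(splice(incrementing, 2, 4)))  # []
```

With the alternating scheme the spliced stream is I(4)-P-P-I(0)-I(0)-P-P and
the check reports the adjacent pair; with incrementing IDs the same splice
gives I(4)-P-P-I(5)-I(7)-P-P and no pair is reported.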

I agree with you that allowing IDR_PIC_ID to use the full value range could
increase the size of the encoded bitstream. But I think the impact is
probably negligible, as the number of IDR frames in a stream is typically
very small. In our A/B test, we did not see any impact on streaming QoE
metrics such as PSNR, VMAF, or streaming latency. Also, if `--init-idr-id`
is not set, x264 stays backward compatible and alternates IDR_PIC_ID
between 0 and 1. Therefore, this change will not affect any existing users.
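
As a rough illustration of the size argument, the sketch below counts the bits
spent on IDR_PIC_ID, assuming it is written as an unsigned Exp-Golomb code
(ue(v)) once per slice header of each IDR frame, and assuming one slice per
frame. The function names are hypothetical and the stream parameters simply
mirror the example command line (100 frames, keyint 10, initial value 123),
with every keyframe treated as an IDR frame.

```python
def ue_bits(value):
    """Length in bits of the unsigned Exp-Golomb code ue(v) for value >= 0."""
    return 2 * (value + 1).bit_length() - 1


def idr_id_bits(ids, slices_per_frame=1):
    """Total bits spent on IDR_PIC_ID for the given sequence of IDs."""
    return sum(ue_bits(v) for v in ids) * slices_per_frame


num_idr = 100 // 10  # assumed: one IDR frame every 10 frames

alternating_ids = [i % 2 for i in range(num_idr)]     # 0, 1, 0, 1, ...
incrementing_ids = [123 + i for i in range(num_idr)]  # 123, 124, ..., 132

alternating_bits = idr_id_bits(alternating_ids)
incrementing_bits = idr_id_bits(incrementing_ids)
print(alternating_bits, incrementing_bits)

# Worst case for a single slice header: the largest ID versus ID 0.
print(ue_bits(65535) - ue_bits(0))
```

Under these assumptions the overhead is bounded by a few bytes per IDR slice,
which is small next to the size of an IDR frame itself.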

Please let me know if that makes sense.

best,
Chao Chen




On Tue, Jun 9, 2020 at 5:06 PM Henrik Gramner <henrik at gramner.com> wrote:

> On Tue, Jun 2, 2020 at 1:07 AM Chao Chen <chaoc at netflix.com> wrote:
> >
> > H.264 streams generated by the current x264 encoder flip the IDR_PIC_ID of
> > IDR frames between 0 and 1. In this way, the encoded bitstream is
> > compatible with the H.264 specification, which requires consecutive
> > IDR frames to have different IDR_PIC_IDs.
> >
> > However, in cloud transcoding and fragmented video streaming
> > applications, different fragments of a video are independently encoded
> > and then assembled at clients (e.g. a DASH client). In this case, if we
> > encode using the current x264 implementation, the streams assembled at
> > clients may have consecutive IDR frames with the same IDR_PIC_ID, which
> > violates the specification and could break the decoding process at
> > clients.
> >
> > Specifically, suppose that x264 encoded two video fragments, each with 5
> > frames. Both fragments are encoded as I(0)-P-I(1)-P-I(0), where the
> > IDR_PIC_ID is given by the number in parentheses. If a client assembles
> > the two fragments together, the final stream is
> > I(0)-P-I(1)-P-I(0)-I(0)-P-I(1)-P-I(0). Here, the 3rd and 4th IDR frames
> > (the 5th and 6th frames overall) have the same IDR_PIC_ID, which violates
> > the specification. We have seen this issue break the H.264 decoder of
> > the Edge browser.
> >
> > To address this limitation, this commit provides a different way to
> > control IDR_PIC_ID in encoded streams. Note that IDR_PIC_ID is a 16-bit
> > value ranging from 0 to 65535. With this commit, users can specify an
> > initial IDR_PIC_ID for the first IDR frame, and the encoder will increment
> > its value for every encoded IDR frame. This gives us the flexibility to
> > avoid IDR_PIC_ID collisions in videos assembled at clients. In the above
> > example, we can encode the two fragments as I(1)-P-I(2)-P-I(3) and
> > I(4)-P-I(5)-P-I(6) so that the concatenated stream is compatible with the
> > specification.
> >
> > We have tested the stream on thousands of devices, including iOS, smart
> > TVs, Android, and browsers. None of them ran into decoding errors.
> >
> > Example usage:
> > ./x264 --output testout.264 --init-idr-id 123 --keyint 10 --frames 100 testsrc.y4m
> >
> > If `--init-idr-id` is not specified, x264 falls back to the default
> > behavior, i.e., it flips IDR_PIC_ID between 0 and 1.
>
> Hi
>
> Sorry, this somehow ended up in my spam folder.
>
> Thanks for the patch, I agree that this feature would be useful for
> that use case.
>
> The reason for the current behavior of alternating between 0 and 1
> instead of looping through the entire range of valid values is that
> the value requires O(log(n)) bits to encode in the bitstream, so using
> smaller values is (marginally) more efficient. Looping through all
> values also has the disadvantage of still having a low probability of
> collisions if the fragments are long enough. As such, I think it might
> be advantageous to use the specified initial value for the first IDR
> frame, and then go back to alternating between 0 and 1 afterwards.
>
> Attaching a patch that incorporates this change, plus some other minor
> things. Would this work for you guys?
>
> Henrik
> _______________________________________________
> x264-devel mailing list
> x264-devel at videolan.org
> https://mailman.videolan.org/listinfo/x264-devel
>