[x264-devel] [PATCH] Optionally increment IDR_PIC_ID for IDR frames from a value specified in CLI

Henrik Gramner henrik at gramner.com
Wed Jun 10 02:05:40 CEST 2020


On Tue, Jun 2, 2020 at 1:07 AM Chao Chen <chaoc at netflix.com> wrote:
>
> H.264 streams generated by the current x264 encoder flip the IDR_PIC_ID of IDR frames between 0 and 1.
> This keeps the encoded bitstream compliant with the H.264 specification, which requires consecutive
> IDR frames to have different IDR_PIC_IDs.
>
> However, in cloud transcoding and fragmented video streaming applications, different fragments of a video
> are encoded independently and then assembled at the client (e.g. a DASH client). In this case, if we encode
> with the current x264 implementation, the assembled stream may contain consecutive IDR frames with the same
> IDR_PIC_ID, which violates the specification and could break decoding at the client.
>
> Specifically, suppose that x264 encodes two video fragments of 5 frames each. Both fragments are encoded
> as I(0)-P-I(1)-P-I(0), where the IDR_PIC_ID is the number in parentheses. If a client assembles the
> two fragments together, the final stream is I(0)-P-I(1)-P-I(0)-I(0)-P-I(1)-P-I(0). Here, the
> 3rd and 4th IDR frames (the 5th and 6th frames of the stream) have the same IDR_PIC_ID, which violates the
> specification. We have seen this issue break the H.264 decoder of the Edge browser.
>
> To address this limitation, this commit provides a different way to control IDR_PIC_ID in encoded streams. Note
> that IDR_PIC_ID is a 16-bit value in the range 0-65535. With this commit, users can specify an initial IDR_PIC_ID
> for the first IDR frame, and the encoder increments the value for every subsequent IDR frame. This gives us
> the flexibility to avoid IDR_PIC_ID collisions in videos assembled at clients. In the above example, we can
> encode the two fragments as I(1)-P-I(2)-P-I(3) and I(4)-P-I(5)-P-I(6) so that the concatenated stream is
> compliant with the specification.
>
> We have tested the stream on thousands of devices, including iOS devices, smart TVs, Android devices and
> browsers. None of them ran into decoding errors.
>
> Example usage:
> ./x264 --output testout.264 --init-idr-id 123 --keyint 10 --frames 100 testsrc.y4m
>
> If `--init-idr-id` is not specified, x264 falls back to the default behavior, i.e., flipping IDR_PIC_ID between 0 and 1.

Hi

Sorry, this somehow ended up in my spam folder.

Thanks for the patch, I agree that this feature would be useful for
that use case.

The reason for the current behavior of alternating between 0 and 1
instead of looping through the entire range of valid values is that
the value requires O(log(n)) bits to encode in the bitstream, so using
smaller values is (marginally) more efficient. Looping through all
values also has the disadvantage of still having a low probability of
collisions if the fragments are long enough. As such, I think it might
be advantageous to use the specified initial value for the first IDR
frame, and then go back to alternating between 0 and 1 afterwards.
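
To put rough numbers on the bit cost, here is a minimal sketch. It
assumes idr_pic_id is written as an unsigned Exp-Golomb code, ue(v), in
the slice header. The helper names ue_bits() and next_idr_pic_id() are
hypothetical and are not taken from x264 or from the attached patch; the
second one is only one possible reading of "initial value first, then
alternate between 0 and 1".

```c
#include <stdint.h>

/* Number of bits used by an unsigned Exp-Golomb code ue(v):
 * 2 * floor(log2(v + 1)) + 1. */
static int ue_bits(uint32_t v)
{
    int log2 = 0;
    for (uint32_t x = v + 1; x > 1; x >>= 1)
        log2++;
    return 2 * log2 + 1;
}

/* Hypothetical helper (an assumption, not the patch's code): given the
 * idr_pic_id of the previous IDR frame, pick the next one so that it
 * always differs from the previous value and stays within {0, 1}.
 * The first IDR frame would use the user-specified initial value. */
static uint32_t next_idr_pic_id(uint32_t prev)
{
    return prev == 0 ? 1 : 0;
}
```

Under these assumptions, ue_bits(0) is 1 and ue_bits(1) is 3, while an
ID near the top of the range costs ue_bits(65535) = 33 bits per slice
header. With an initial value of 123, a fragment's IDR frames would
carry the IDs 123, 0, 1, 0, ...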

Attaching a patch that incorporates this change, plus some other minor
things. Would this work for you guys?

Henrik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: init_idr_pic_id.patch
Type: application/octet-stream
Size: 4921 bytes
Desc: not available
URL: <http://mailman.videolan.org/pipermail/x264-devel/attachments/20200610/97bd94a6/attachment.obj>