<div dir="ltr">Hi Henrik,<div>Thanks for reviewing my patch and helping improve it.</div><div><br></div><div>I looked into your patch and thought about your idea carefully. However, I think that using only 0 and 1 for the rest of the encoded stream won't address our use cases. In some of them we need to encode a series of IDR frames, and clients may splice the encoded streams at those IDRs. For example, suppose we encode a stream like I(4)-P-P-I(0)-I(1)-P-P-I(0)-P-P, where the number in parentheses is the IDR_PIC_ID. When a client splices the 4th IDR frame and the frames that follow it directly after the 2nd IDR frame, the result is a stream like I(4)-P-P-I(0)-I(0)-P-P, which creates an IDR_PIC_ID collision.</div><div><br></div><div>I agree with you that allowing IDR_PIC_ID to use the full value range could increase the size of the encoded bitstream. But I think the impact is probably negligible, as the number of IDR frames in a stream is typically very small. In our A/B test, we did not see any impact on streaming QoE metrics such as PSNR, VMAF, or streaming latency. Also, if `init-idr-id` is not set, x264 stays backward compatible and alternates IDR_PIC_ID between 0 and 1. Therefore, this change will not affect any existing users.</div><div><br></div><div>Please let me know if that makes sense.</div><div><br></div><div>Best,</div><div>Chao Chen</div><div><br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 9, 2020 at 5:06 PM Henrik Gramner <<a href="mailto:henrik@gramner.com">henrik@gramner.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Jun 2, 2020 at 1:07 AM Chao Chen <<a href="mailto:chaoc@netflix.com" target="_blank">chaoc@netflix.com</a>> wrote:<br>
><br>
> H.264 streams generated by the current x264 encoder alternate the IDR_PIC_ID of IDR frames between 0 and 1.<br>
> In this way, the encoded bitstream is compatible with the H.264 specification which requires consecutive<br>
> IDR frames to have different IDR_PIC_IDs.<br>
><br>
> However, in cloud transcoding and fragmented video streaming applications, different fragments of a video<br>
> are independently encoded and then assembled at clients (e.g. a DASH client). In this case, if we encode using<br>
> the current x264 implementation, the streams assembled at clients may have consecutive IDR frames with the same IDR_PIC_ID,<br>
> which violates the specification and could break the decoding process at clients.<br>
><br>
> Specifically, suppose that x264 encoded two video fragments, each with 5 frames. Both fragments are encoded<br>
> as I(0)-P-I(1)-P-I(0), where the IDR_PIC_ID is given by the number in parentheses. If a client assembles the<br>
> two fragments together, the final stream is I(0)-P-I(1)-P-I(0)-I(0)-P-I(1)-P-I(0). Here, the<br>
> 3rd and 4th IDR frames are adjacent and have the same IDR_PIC_ID, which violates the specification. We have seen this issue<br>
> break the H.264 decoder of the Edge browser.<br>
><br>
> To address this limitation, this commit provides a different way to control IDR_PIC_ID in encoded streams. Note<br>
> that IDR_PIC_ID is a 16-bit value ranging from 0 to 65535. With this commit, users can specify an initial IDR_PIC_ID<br>
> for the first IDR frame, and the encoder will increment its value for every encoded IDR frame. This provides us with<br>
> the flexibility to avoid IDR_PIC_ID collision in videos assembled at clients. In the above example, we can<br>
> encode two fragments as I(1)-P-I(2)-P-I(3) and I(4)-P-I(5)-P-I(6) so that the concatenated stream is compatible<br>
> with the specification.<br>
><br>
> We have tested the stream on thousands of devices, including iOS devices, smart TVs, Android devices and browsers. None of them<br>
> ran into decoding errors.<br>
><br>
> Example usage:<br>
> ./x264 --output testout.264 --init-idr-id 123 --keyint 10 --frames 100 testsrc.y4m<br>
><br>
> If `--init-idr-id` is not specified, x264 falls back to the default behavior, i.e., alternating IDR_PIC_ID between 0 and 1.<br>
<br>
Hi<br>
<br>
Sorry, this somehow ended up in my spam folder.<br>
<br>
Thanks for the patch, I agree that this feature would be useful for<br>
that use case.<br>
<br>
The reason for the current behavior of alternating between 0 and 1<br>
instead of looping through the entire range of valid values is that<br>
the value requires O(log(n)) bits to encode in the bitstream, so using<br>
smaller values is (marginally) more efficient. Looping through all<br>
values also has the disadvantage of still having a low probability of<br>
collisions if the fragments are long enough. As such, I think it might<br>
be advantageous to use the specified initial value for the first IDR<br>
frame, and then go back to alternating between 0 and 1 afterwards.<br>
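<br>
As a rough illustration of the bit cost described above: idr_pic_id is<br>
coded in the slice header as an unsigned Exp-Golomb value, ue(v), so a<br>
value v takes 2*floor(log2(v+1))+1 bits. The sketch below only does that<br>
arithmetic; the helper name ue_bits is hypothetical and not part of x264.<br>
<pre>
```python
def ue_bits(value):
    """Bit length of `value` when coded as unsigned Exp-Golomb, ue(v)."""
    if value < 0:
        raise ValueError("ue(v) codes non-negative integers only")
    # ue(v) writes (value + 1) in binary, preceded by one zero bit for
    # every bit after the first: 2 * floor(log2(value + 1)) + 1 in total.
    return 2 * (value + 1).bit_length() - 1


# Compare the values used today (0 and 1) with larger IDR_PIC_IDs.
for idr_pic_id in (0, 1, 123, 65535):
    print(idr_pic_id, ue_bits(idr_pic_id))
```
</pre>
The cost is paid once per slice of each IDR picture, so the total overhead<br>
depends on how many IDR frames and slices a stream contains.<br>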
<br>
Attaching a patch that incorporates this change, plus some other minor<br>
things. Would this work for you guys?<br>
<br>
Henrik<br>
_______________________________________________<br>
x264-devel mailing list<br>
<a href="mailto:x264-devel@videolan.org" target="_blank">x264-devel@videolan.org</a><br>
<a href="https://mailman.videolan.org/listinfo/x264-devel" rel="noreferrer" target="_blank">https://mailman.videolan.org/listinfo/x264-devel</a><br>
</blockquote></div>
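<div>The splicing cases discussed in this thread can be checked with a small sketch. It assumes streams are written in the thread's own notation, such as I(4)-P-P-I(0), and treats a collision as two adjacent IDR frames carrying the same IDR_PIC_ID. The function names are hypothetical and not part of x264.</div>
<pre>
```python
def parse(stream):
    """Return one entry per frame: the IDR_PIC_ID for IDR frames, else None."""
    frames = []
    for token in stream.split("-"):
        if token.startswith("I("):
            frames.append(int(token[2:-1]))  # strip "I(" and ")"
        else:
            frames.append(None)              # non-IDR frame, e.g. "P"
    return frames


def has_collision(stream):
    """True if two adjacent frames are both IDR and share an IDR_PIC_ID."""
    frames = parse(stream)
    return any(a is not None and a == b for a, b in zip(frames, frames[1:]))


# Two fragments encoded with the default 0/1 alternation, then concatenated.
print(has_collision("I(0)-P-I(1)-P-I(0)" + "-" + "I(0)-P-I(1)-P-I(0)"))
# The same fragments encoded with incrementing IDs starting at 1 and 4.
print(has_collision("I(1)-P-I(2)-P-I(3)" + "-" + "I(4)-P-I(5)-P-I(6)"))
# Splicing the 4th IDR of I(4)-P-P-I(0)-I(1)-P-P-I(0)-P-P after the 2nd IDR.
print(has_collision("I(4)-P-P-I(0)" + "-" + "I(0)-P-P"))
```
</pre>
<div>The first and third cases report a collision and the second does not, matching the examples given in the messages above.</div>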