<div dir="ltr">Hi Henrik,<div>Thanks for reviewing my patch and helping to improve it.</div><div><br></div><div>I looked into your patch and thought about your idea carefully. However, I think that using only 0 and 1 for the rest of the encoded stream won't address our use cases. In some of them we need to encode a series of IDR frames, and clients may splice the encoded streams at those IDRs. For example, suppose we encode a stream as I(4)-P-P-I(0)-I(1)-P-P-I(0)-P-P, where the number in parentheses is the IDR_PIC_ID. If a client splices the 4th IDR frame and the frames that follow it in directly after the 2nd IDR frame, the result is I(4)-P-P-I(0)-I(0)-P-P, which has an IDR_PIC_ID collision.</div><div><br></div><div>I agree that allowing IDR_PIC_ID to use the full value range could increase the size of the encoded bitstream. But I think the impact is probably negligible, as the number of IDR frames in a stream is typically very small. In our A/B test, we did not see any impact on streaming QoE metrics such as PSNR, VMAF, or streaming latency. Also, if `init-idr-id` is not set, x264 stays backward compatible and produces IDR_PIC_IDs alternating between 0 and 1. Therefore, this change will not affect any existing users.</div><div><br></div><div>Please let me know if that makes sense.</div><div><br></div><div>Best,</div><div>Chao Chen</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 9, 2020 at 5:06 PM Henrik Gramner <<a href="mailto:henrik@gramner.com">henrik@gramner.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Jun 2, 2020 at 1:07 AM Chao Chen <<a href="mailto:chaoc@netflix.com" target="_blank">chaoc@netflix.com</a>> wrote:<br>

><br>

> H.264 streams generated by the current x264 encoder alternate the IDR_PIC_ID of IDR frames between 0 and 1.<br>

> In this way, the encoded bitstream is compatible with the H.264 specification which requires consecutive<br>

> IDR frames to have different IDR_PIC_IDs.<br>

><br>

> However, in cloud transcoding and fragmented video streaming applications, different fragments of a video<br>

> are independently encoded and then assembled at clients (e.g. DASH client). In this case, if we encode using<br>

> the current x264 implementation, the streams assembled at clients may have consecutive IDR frames with the same IDR_PIC_ID,<br>

> which violates the specification and could break the decoding process at clients.<br>

><br>

> Specifically, suppose that x264 encodes two video fragments, each with 5 frames. Both fragments are encoded<br>

> as I(0)-P-I(1)-P-I(0), where the IDR_PIC_ID is given by the number in parentheses. If a client assembles the<br>

> two fragments together, the final stream is I(0)-P-I(1)-P-I(0)-I(0)-P-I(1)-P-I(0). Here, the<br>

> 3rd and 4th IDR frames (frames 5 and 6) have the same IDR_PIC_ID, which violates the specification. We have seen this issue<br>

> break the H.264 decoder of the Edge browser.<br>
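The splice described above can be checked mechanically. The following is a minimal sketch, not part of the patch: it assumes frames are written in the I(n)-P notation used in this thread, and that the rule applies to back-to-back IDR frames. The helper names `idr_id` and `find_collisions` are hypothetical.<br>

```python
import re


def idr_id(frame):
    """Return the IDR_PIC_ID of a frame written as I(n), or None for other frames."""
    match = re.fullmatch(r"I\((\d+)\)", frame)
    return int(match.group(1)) if match else None


def find_collisions(stream):
    """Return 1-based positions of back-to-back IDR frames sharing an IDR_PIC_ID."""
    ids = [idr_id(frame) for frame in stream.split("-")]
    return [
        (i + 1, i + 2)
        for i, (a, b) in enumerate(zip(ids, ids[1:]))
        if a is not None and a == b
    ]


fragment = "I(0)-P-I(1)-P-I(0)"
print(find_collisions(fragment))                   # []
print(find_collisions(fragment + "-" + fragment))  # [(5, 6)]
```

Each fragment is valid on its own; the collision only appears at the splice point, between frames 5 and 6 of the assembled stream.<br>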

><br>

> To address this limitation, this commit provides a different way to control IDR_PIC_ID in encoded streams. Note<br>

> that IDR_PIC_ID is a 16-bit value ranging from 0 to 65535. With this commit, users can specify an initial IDR_PIC_ID<br>

> for the first IDR frame, and the encoder will increment its value for every encoded IDR frame. It provides us with<br>

> the flexibility to avoid IDR_PIC_ID collision in videos assembled at clients. In the above example, we can<br>

> encode two fragments as I(1)-P-I(2)-P-I(3) and I(4)-P-I(5)-P-I(6) so that the concatenated stream is compatible<br>

> with the specification.<br>
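As an illustration of the incrementing scheme, here is a small sketch that lists the IDR_PIC_IDs a fragment would receive for a given initial value. It is not the patch's code: the function name `idr_ids` is hypothetical, and wrapping back to 0 after 65535 is an assumption made so the values stay in the valid range.<br>

```python
ID_RANGE = 65536  # IDR_PIC_ID values run from 0 to 65535


def idr_ids(init_idr_id, num_idr_frames):
    """IDR_PIC_IDs for successive IDR frames, starting at init_idr_id.

    Wrapping at 65536 is an assumption of this sketch, not confirmed patch behavior.
    """
    return [(init_idr_id + n) % ID_RANGE for n in range(num_idr_frames)]


print(idr_ids(1, 3))  # [1, 2, 3]
print(idr_ids(4, 3))  # [4, 5, 6]
```

Starting the second fragment at 4 keeps its first IDR frame distinct from the last IDR frame of the first fragment, which is the property the concatenated stream needs.<br>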

><br>

> We have tested the stream on thousands of devices, including iOS devices, smart TVs, Android devices, and browsers. None of them<br>

> ran into decoding errors.<br>

><br>

> Example usage:<br>

> ./x264 --output testout.264 --init-idr-id 123 --keyint 10 --frames 100 testsrc.y4m<br>

><br>

> If `--init-idr-id` is not specified, x264 falls back to the default behavior, i.e., alternating IDR_PIC_ID between 0 and 1.<br>

<br>

Hi<br>

<br>

Sorry, this somehow ended up in my spam folder.<br>

<br>

Thanks for the patch, I agree that this feature would be useful for<br>

that use case.<br>

<br>

The reason for the current behavior of alternating between 0 and 1<br>

instead of looping through the entire range of valid values is that<br>

the value requires O(log(n)) bits to encode in the bitstream, so using<br>

smaller values is (marginally) more efficient. Looping through all<br>

values also has the disadvantage of still having a low probability of<br>

collisions if the fragments are long enough. As such, I think it might<br>

be advantageous to use the specified initial value for the first IDR<br>

frame, and then go back to alternating between 0 and 1 afterwards.<br>
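To put a number on the O(log(n)) cost mentioned above: in the H.264 slice header, idr_pic_id is coded as an unsigned Exp-Golomb value, ue(v). The sketch below computes the codeword length for a few values. The function name `ue_bits` is hypothetical, and the cost is per slice header, so a multi-slice IDR picture pays it once per slice.<br>

```python
def ue_bits(value):
    """Bit length of the unsigned Exp-Golomb (ue(v)) codeword for a non-negative value."""
    # The codeword is (value + 1) in binary, preceded by one leading zero
    # for every bit after the first.
    return 2 * (value + 1).bit_length() - 1


for idr_pic_id in (0, 1, 123, 65535):
    print(idr_pic_id, ue_bits(idr_pic_id))
# 0 1
# 1 3
# 123 13
# 65535 33
```

Alternating between 0 and 1 therefore costs 1 or 3 bits per slice header, while the largest valid value costs 33 bits.<br>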

<br>

Attaching a patch that incorporates this change, plus some other minor<br>

things. Would this work for you guys?<br>

<br>

Henrik<br>


</blockquote></div>