[x264-devel] [PATCH] Added support for CABAC zero bytes insertion
Loren Merritt
lorenm at u.washington.edu
Wed Apr 17 23:22:10 CEST 2019
On Fri, 12 Apr 2019, Jay N. Shingala wrote:
> Dear x264 developers,
>
> This query is regarding the patch submitted (quite a while ago) on CABAC zero word insertion, which is a requirement for bitstream conformance.
>
> For reference, here is an excerpt of section 7.4.2.10 of the AVC/H.264 specification describing the need for zero word insertion when the ratio of CABAC bin count to bit count exceeds the constrained limit.
>
> "cabac_zero_word is a byte-aligned sequence of two bytes equal to 0x0000.
>
> Let NumBytesInVclNALunits be the sum of the values of NumBytesInNALunit for all VCL NAL units of a coded picture
>
> Let BinCountsInNALunits be the number of times that the parsing process function DecodeBin( ), specified in
> clause 9.3.3.2, is invoked to decode the contents of all VCL NAL units of a coded picture. When
> entropy_coding_mode_flag is equal to 1, it is a requirement of bitstream conformance that BinCountsInNALunits shall
> not exceed ( 32 ÷ 3 ) * NumBytesInVclNALunits + ( RawMbBits * PicSizeInMbs ) ÷ 32.
>
> NOTE - The constraint on the maximum number of bins resulting from decoding the contents of the slice layer NAL units can be
> met by inserting a number of cabac_zero_word syntax elements to increase the value of NumBytesInVclNALunits. Each
> cabac_zero_word is represented in a NAL unit by the three-byte sequence 0x000003 (as a result of the constraints on NAL unit
> contents that result in requiring inclusion of an emulation_prevention_three_byte for each cabac_zero_word)."
>
> This patch will be useful for strict bitstream conformance in x264.
> It is important to note that the overall performance impact was negligible, as the latency of the "bin_cnt" increment in cabac_encode_decision() and cabac_encode_bypass() is well hidden.
>
> Please provide comments on the conformance requirement and the suitability of this patch for x264.
I can sorta explain why it exists in the standard, and why I always
ignored that clause.
A Level is a bundle of "if you want to decode worst-case examples of
streams with this Level, your decoder had better have resources XYZ". One
of the resources needed is the throughput of the cabac decoder.
For some reason I never saw explained, the standard doesn't include cabac
throughput directly as a Level parameter; instead it uses bitrate and
resolution as a proxy. Which is usually fine; bitrate is a pretty good
proxy for the number of cabac decisions.
But for some unusual streams that get an unusually large amount of
compression benefit from cabac, that assumed relation can fail. (I don't
know off-hand how rare this is.) And if such a condition is combined with
a large absolute bitrate, the decoder can be left processing slightly more
cabac decisions than its Level promised it would be capable of.
The standard's perverse solution for this edge case is: tell the encoder
to stop compressing so well, so that the padded bitrate goes back to being
a good proxy for the number of cabac decisions. And it doesn't tell you to
just virtually pad it for the purpose of checking Level compliance; it
tells you to actually make the compression worse.
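
Mechanically, the amount of padding the quoted formula calls for is easy
to compute. As a rough sketch in C (illustrative only: the function and
variable names are made up, and this is not the submitted patch), you can
scale the constraint by 96 so it stays in integer arithmetic; each
appended cabac_zero_word then adds 3 bytes to NumBytesInVclNALunits once
its emulation_prevention_three_byte is counted:

#include <stdint.h>

/* How many cabac_zero_word elements must be appended to a coded picture so
 * that BinCountsInNALunits <= (32/3)*NumBytesInVclNALunits
 *                             + (RawMbBits*PicSizeInMbs)/32 holds.
 * Scaled by 96: 96*bins <= 1024*bytes + 3*RawMbBits*PicSizeInMbs. */
static uint64_t cabac_zero_words_needed( uint64_t bin_count, uint64_t vcl_bytes,
                                         uint64_t raw_mb_bits, uint64_t pic_size_in_mbs )
{
    uint64_t lhs = 96 * bin_count;
    uint64_t rhs = 1024 * vcl_bytes + 3 * raw_mb_bits * pic_size_in_mbs;
    if( lhs <= rhs )
        return 0;
    /* Each cabac_zero_word occupies 3 NAL-unit bytes (0x00 0x00 0x03), so it
     * raises the scaled right-hand side by 3*1024 = 3072. */
    return ( lhs - rhs + 3071 ) / 3072;
}

Each appended zero word buys 32 more allowed bins, so the number needed is
just a ceiling division over the shortfall.
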
All of that could have been an off-by-default option for pedantic
standard-correctness that someone could enable if they ever found a
use-case where it matters. But implementing it requires an extra
instruction in the cabac inner loop, one that you don't get to skip just
because you turn off the padding. That is only a tiny speed cost, but it
is a cost I decided not to pay for a feature whose "benefit" is
"occasionally make the compression worse".
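
As for what that extra instruction looks like, here is a sketch (made-up
struct and function names, not x264's actual cabac code and not the
submitted patch):

#include <stdint.h>

typedef struct {
    unsigned low, range;   /* arithmetic-coder state (details elided) */
    uint64_t bin_count;    /* running count of encoded bins */
} cabac_sketch_t;

/* Counting bins means one extra increment on every encoded bin, in both the
 * context-coded and bypass paths, whether or not any padding is ever
 * written out. */
static inline void cabac_sketch_encode_decision( cabac_sketch_t *cb,
                                                 int ctx_idx, int bin )
{
    cb->bin_count++;       /* the extra instruction that cannot be skipped */
    /* ...context lookup, range subdivision, renormalization elided... */
    (void)ctx_idx; (void)bin;
}
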
--Loren Merritt