[x264-devel] Behaviour of Annex B encoding: bug or not?

Thu Jan 14 18:23:25 CET 2010

Ah, now I understand what you meant. Yes, that's basically what we're 
currently doing.

Thanks!

- Philip

On Tue, 12 Jan 2010, Sergey A. Sablin wrote:

> I meant that for this particular application (a specific RTP-only decoder,
> with no remuxing needed at any point during transmission or after receiving,
> so that no other standard-compliant decoder is involved in the decoding
> process), emulation prevention is simply a redundant feature. So an encoder
> targeting this specific application can skip the emulation prevention
> process.
>
> This behavior is outside the scope of the specification and hence at your
> own risk. The specification exists precisely to ensure interoperability
> between different implementations and in different situations. So in general
> this decoder is broken and not suitable for wide usage, as it doesn't follow
> the spec, but you can tune your pipeline (i.e. the encoder) to work around
> this without losing any information.
>
> Sergey.
>
>
> Philip Spencer wrote:
>>  On Wed, 6 Jan 2010, Sergey A. Sablin wrote:
>> 
>> >  Another point is that RTP-only decoders don't need emulation
>> >  prevention at all, and if no remuxing to TS or byte stream
>> >  format is needed, then the emulation prevention process can be safely
>> >  skipped.
>> > 
>> >  Sergey.
>>
>>  Can you clarify what you mean by that? That's exactly the situation we
>>  have (our hardware has an RTP-only decoder, and it doesn't do emulation
>>  prevention so it doesn't remove the emulation prevention bytes), but it
>>  certainly seems that's a problem, because then it chokes on packets
>>  generated by software like x264 which does properly do emulation
>>  prevention! So I'm not sure how the emulation prevention process can ever
>>  be "safely skipped" ...
>>
>>  - Philip
>>
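For readers following the thread, here is a minimal sketch of the step a
decoder performs when it does remove the emulation prevention bytes. The
function name nal_unescape and its signature are hypothetical; this is
not code from x264 or from any particular decoder, only an illustration
of dropping the 03 that follows two consecutive 00 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from x264 or any decoder): copy a NAL unit
 * from src to dst, dropping each emulation prevention byte, i.e. the
 * 0x03 that follows two consecutive 0x00 bytes.
 * dst must hold at least len bytes. Returns the number of bytes written. */
static size_t nal_unescape(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t out = 0;
    int zeros = 0;                 /* consecutive 0x00 bytes seen so far */

    for (size_t i = 0; i < len; i++) {
        if (zeros >= 2 && src[i] == 0x03) {
            zeros = 0;             /* skip the escape byte itself */
            continue;
        }
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
        dst[out++] = src[i];
    }
    return out;
}
```

A decoder that omits this step sees one extra byte wherever the encoder
escaped a run of zeros, which would be consistent with the garbled
parameters described in this thread.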
>>  --------------------------------------------+-------------------------------
>>  Philip Spencer  pspencer at fields.utoronto.ca | Director of Computing Services
>>  Room 336        (416)-348-9710  ext3036     | The Fields Institute for
>>  222 College St, Toronto ON M5T 3J1 Canada   | Research in Mathematical Sciences
>> 
>> > 
>> > 
>> >  Alex Giladi wrote:
>> > >   Gil,
>> > >   Quoting section 3 of the RFC:  "This payload specification can only be
>> > >   used to carry the "naked" H.264 NAL unit stream over RTP, and not the
>> > >   bitstream format discussed in Annex B of H.264". Escaping is defined
>> > >   in Annex B (section B.1).
>> > >   Haven't played with this (I work with plain old MPEG-2 TS), but this
>> > >   is my interpretation of the specs.
>> > >   Also, there is no "de-facto standard": there is only compliance or
>> > >   lack of compliance.
>> > >   Alex.
>> > > 
>> > >   On Tue, Jan 5, 2010 at 5:44 PM, Gil Pedersen <gil at cmi.aau.dk> wrote:
>> > > 
>> > > >   On 05/01/2010, at 20.49, Alex Giladi wrote:
>> > > >
>> > > > >   RTP payload (at RFC 3984) uses RBSP. Maybe it's worth adding "rtp" as
>> > > > >   an output option?
>> > > > >   --ag
>> > > >
>> > > >   Is this true? I see no mention of RBSP in the RFC. The RFC 3984 payload
>> > > >   is based on NAL units, which according to the H.264 spec consist of a
>> > > >   1-byte header followed by RBSP _and_ the necessary emulation bytes.
>> > > >   Further, as mentioned, the H.264 reference encoder seems to output RTP
>> > > >   packets with the emulation bytes.
>> > > >
>> > > >   As far as I can tell any decoder that expects raw RBSP + 1 byte header
>> > > >   is broken and x264 should never allow this to be generated unless it can
>> > > >   be proven that it's the de-facto standard.
>> > > >
>> > > >   /Gil
>> > > >
>> > > > >   On Tue, Jan 5, 2010 at 2:04 PM, Jason Garrett-Glaser
>> > > > >   <darkshikari at gmail.com> wrote:
>> > > > >
>> > > > > >   On Tue, Jan 5, 2010 at 1:41 PM, Philip Spencer
>> > > > > >   <pspencer at fields.utoronto.ca> wrote:
>> > > > > >
>> > > > > > >   First a disclaimer: I know very little about the H.264 spec, and am just
>> > > > > > >   trying to resolve an interoperability issue we are having, so please bear
>> > > > > > >   with me if I misstate or misunderstand something.
>> > > > > > >
>> > > > > > >   In the routine x264_nal_encode (in common/common.c), the flag b_annexb
>> > > > > > >   controls whether or not to add a NAL start code (00 00 00 01) to the
>> > > > > > >   beginning of the packet, but does not control whether or not to do the
>> > > > > > >   escaping of sequences of the form 00 00 00/1/2/3 by inserting a 03 byte into
>> > > > > > >   the third position -- that escaping is ALWAYS done, even if b_annexb is not
>> > > > > > >   set.
>> > > > > > >
>> > > > > > >   Is this a bug, or is it meant to be that way? (I don't have access to the
>> > > > > > >   text of Annex B).
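The escaping described in the quoted paragraph above can be sketched
roughly as follows. This is an illustration only, not the actual body of
x264_nal_encode; the name nal_escape, its signature and the buffer-size
bound in the comment are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the actual x264_nal_encode: copy bytes from
 * src to dst, inserting 0x03 before any byte <= 0x03 that follows two
 * consecutive 0x00 bytes. dst needs room for len + len / 2 bytes in the
 * worst case (input consisting only of zeros).
 * Returns the number of bytes written. */
static size_t nal_escape(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t out = 0;
    int zeros = 0;                 /* consecutive 0x00 bytes written */

    for (size_t i = 0; i < len; i++) {
        if (zeros == 2 && src[i] <= 0x03) {
            dst[out++] = 0x03;     /* emulation prevention byte */
            zeros = 0;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return out;
}
```

With this sketch, the six input bytes 00 00 01 00 00 00 become the eight
bytes 00 00 03 01 00 00 03 00, so any data containing runs of zero bytes
changes length on the wire.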
>> > > > > >
>> > > > > >   We didn't make it an option because we didn't know of any devices that
>> > > > > >   expected non-escaped bytestreams.  All containers we knew that didn't
>> > > > > >   want Annex-B startcodes still expected escaped NAL units.
>> > > > > >
>> > > > > > >   It certainly breaks interoperability with several devices. In particular,
>> > > > > > >   x264 cannot be used with the Ekiga softphone application to communicate with
>> > > > > > >   the "LifeSize Room" brand of videoconferencing equipment: that device
>> > > > > > >   expects non-Annex-B packets over RTP, and cannot handle the extra 03 byte
>> > > > > > >   that is inserted. The result is that the session parameters (like stream
>> > > > > > >   resolution and other such settings, which often contain multiple zero bytes
>> > > > > > >   in a row) get completely garbled because of the Annex-B bytestream encoding.
>> > > > > > >
>> > > > > > >   Also, if one attempts to sniff the network traffic with software such as
>> > > > > > >   WireShark, it too chokes on the unexpected 03 bytes in RTP packets.
>> > > > > > >
>> > > > > > >   It would seem to me that this is a bug: if Annex B bytestream encoding is
>> > > > > > >   not desired, such as for an RTP packet, then the extra escape bytes should
>> > > > > > >   not be inserted.
>> > > > > > >
>> > > > > > >   On our system, I have applied the patch below to common.c and then
>> > > > > > >   H.264 connectivity to the LifeSize Room videoconferencing equipment works
>> > > > > > >   just fine.
>> > > > > > >
>> > > > > > >   On the other hand, from a quick glance at the source code of the reference
>> > > > > > >   encoder/decoder, it seems that it behaves the same way as x264: always
>> > > > > > >   inserts the escape byte. Is this a bug in the reference encoder/decoder too,
>> > > > > > >   or does the text of Annex B specify that ALL H.264 streams should have the
>> > > > > > >   extra bytes inserted, even when bytestream encoding is not being used?
>> > > > > > >
>> > > > > > >   In the latter case, then obviously the LifeSize brand videoconference units
>> > > > > > >   are buggy, but since I know they interoperate well over H.264 with a wide
>> > > > > > >   range of units from other manufacturers there must be a lot of buggy devices
>> > > > > > >   out there -- would it be worth adding an extra flag to x264 that says "do
>> > > > > > >   bytestream encoding only in Annex B mode, for compatibility with
>> > > > > > >   devices that cannot handle it in RTP mode"?
>> > > > > >
>> > > > > >   The best solution here is probably to add another parameter to control
>> > > > > >   the escaping.  b_escaped_nals or similar.  In the case of
>> > > > > >   b_escaped_nals == 0, a straight memcpy would be used.
>> > > > > >
>> > > > > >   Dark Shikari
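One way the separate parameter proposed above might look is sketched
below. Only the names b_annexb and b_escaped_nals come from this thread;
the function nal_write, its signature and the buffer bound are
hypothetical, and this is not the x264 implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not x264 code. Writes one NAL unit to dst:
 *  - b_annexb:       prepend the 00 00 00 01 start code
 *  - b_escaped_nals: insert emulation prevention bytes; when 0 the
 *                    NAL unit is copied verbatim with memcpy
 * dst needs room for 4 + len + len / 2 bytes. Returns bytes written. */
static size_t nal_write(uint8_t *dst, const uint8_t *src, size_t len,
                        int b_annexb, int b_escaped_nals)
{
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    size_t out = 0;

    if (b_annexb) {
        memcpy(dst, start_code, sizeof start_code);
        out = sizeof start_code;
    }

    if (!b_escaped_nals) {
        memcpy(dst + out, src, len);   /* the "straight memcpy" path */
        return out + len;
    }

    int zeros = 0;                     /* consecutive 0x00 bytes written */
    for (size_t i = 0; i < len; i++) {
        if (zeros == 2 && src[i] <= 0x03) {
            dst[out++] = 0x03;         /* emulation prevention byte */
            zeros = 0;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return out;
}
```

Keeping the two flags independent mirrors the distinction drawn in the
thread between start codes and escaping. Note that b_annexb = 1 together
with b_escaped_nals = 0 could let payload bytes mimic a start code, so a
real implementation would probably want to reject that combination.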
>> > > > > >   _______________________________________________
>> > > > > >   x264-devel mailing list
>> > > > > >   x264-devel at videolan.org
>> > > > > >   http://mailman.videolan.org/listinfo/x264-devel
>

--------------------------------------------+-------------------------------
Philip Spencer  pspencer at fields.utoronto.ca | Director of Computing Services
Room 336        (416)-348-9710  ext3036     | The Fields Institute for
222 College St, Toronto ON M5T 3J1 Canada   | Research in Mathematical Sciences