[x264-devel] Re: 264 transport

Fri Feb 2 00:08:07 CET 2007

On Thu, 2007-02-01 at 13:33 -0800, Bill May wrote:
> Alex Izvorski wrote:
> > I think the recommended way of packing NALs into PES packets is to
> > concatenate all slice NALs for one picture, in order, together with any
> > immediately preceding SPS/PPS/AUD/SEI NALs, and put those in a single
> > PES packet with data_alignment_indicator=1 and PES_packet_length=0 and
> > with PTS/DTS corresponding to that picture.  This will work for MPEG2
> > transport streams but not for program streams because then there would
> > be no good way to find the beginning of next packet; in that case, split
> > the data between as many 64k-length or smaller PES packets as needed,
> > all with the same PTS/DTS, but of which only the first one has
> > data_alignment_indicator=1.  This is simple, low overhead, and
> > relatively standards-compliant.
> 
> 
> Doesn't the specification require that an access unit delimiter NAL be
> the first NAL for a picture? (ISO/IEC 13818-1:2000 final draft, amendment 3,
> page 18).  This AUD NAL determines the next picture start.

Bill - you are correct, amendment 3 (section 2.14.1, it's on page 17 of
my copy) does make access unit delimiters mandatory for H.264 carried in
an MPEG2 transport stream.  I hadn't noticed that, as H.264 itself treats
them as optional.  Thanks for pointing it out.

However, that does not answer the question of how to pack NALs in PES
packets in a program stream.  The dilemma is that if the PES packets
have no defined length, you have no way of knowing where they end.  In a
program stream they are merely concatenated; they are not carried in
transport stream packets, which by themselves would let you determine
both the start and the length of each PES packet via
payload_unit_start_indicator=1 and adaptation field stuffing bytes.
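
For reference, the bounded-length fallback from the quoted text (split one
picture across several PES packets of at most 64k, same timestamp on each,
data_alignment_indicator=1 only on the first) could be sketched roughly as
below.  This is only an illustration: the function names are made up, it
writes a PTS but no DTS to keep the header short, it assumes stream_id 0xE0,
and it ignores the pack headers a real program stream would also need.

```python
MAX_PES_LENGTH = 0xFFFF      # largest value of the 16-bit PES_packet_length
HEADER_AFTER_LENGTH = 3 + 5  # two flag bytes + header length byte + 5-byte PTS


def encode_pts(pts, prefix=0b0010):
    """Pack a 33-bit PTS into the 5-byte field (with marker bits)."""
    pts &= (1 << 33) - 1
    return bytes([
        (prefix << 4) | (((pts >> 30) & 0x07) << 1) | 1,
        (pts >> 22) & 0xFF,
        (((pts >> 15) & 0x7F) << 1) | 1,
        (pts >> 7) & 0xFF,
        ((pts & 0x7F) << 1) | 1,
    ])


def pes_packets(access_unit, pts, stream_id=0xE0):
    """Split one picture's NALs into bounded PES packets (hypothetical helper)."""
    max_payload = MAX_PES_LENGTH - HEADER_AFTER_LENGTH
    packets = []
    for offset in range(0, len(access_unit), max_payload):
        chunk = access_unit[offset:offset + max_payload]
        # '10' marker bits, plus data_alignment_indicator on the first packet only
        flags1 = 0x80 | (0x04 if offset == 0 else 0x00)
        flags2 = 0x80  # PTS_DTS_flags = '10' (PTS only, an assumption here)
        header = bytes([flags1, flags2, 5]) + encode_pts(pts)
        length = len(header) + len(chunk)  # bytes following the length field
        packets.append(
            b"\x00\x00\x01" + bytes([stream_id])
            + length.to_bytes(2, "big") + header + chunk
        )
    return packets
```

Because every packet carries an explicit PES_packet_length, a program stream
reader can skip from one packet to the next without looking at the payload.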

Access unit delimiters don't really help here: they would help us find
out which slice NALs together form a picture (without decoding the slice
headers and counting macroblocks), but we are trying to find where the
PES packet ends, before we can even read any NALs!
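
To make concrete what the delimiters do buy us once the NALs have been
extracted, here is a rough sketch of grouping an Annex B byte stream into
pictures by AUD (nal_unit_type 9).  The helper names are hypothetical, and
it assumes the elementary stream bytes are already in hand, which is
exactly the part that the unbounded PES packets make difficult.

```python
START_CODE = b"\x00\x00\x01"
NAL_TYPE_AUD = 9  # access unit delimiter


def split_nals(stream):
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    nals = []
    pos = stream.find(START_CODE)
    while pos != -1:
        start = pos + 3
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # Drop trailing zero bytes; they belong to the next start code
        # (a NAL unit itself never ends in 0x00).
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        pos = nxt
    return nals


def group_by_aud(nals):
    """Start a new picture at every access unit delimiter."""
    units = []
    for nal in nals:
        if (nal[0] & 0x1F) == NAL_TYPE_AUD or not units:
            units.append([])
        units[-1].append(nal)
    return units
```

No slice header parsing is needed here; the AUD alone marks the picture
boundary.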

Let's try and run through the logic of that: suppose we have a program
stream containing unbounded PES packets with NALs packed in them.  We
are reading somewhere from the middle of that stream.  A PES packet
starts with a packet_start_code_prefix = 0x000001, followed by stream_id
and PES_packet_length; a NAL starts with start_code_prefix_one_3bytes =
0x000001 (and we know it cannot contain any such sequence in the middle,
thanks to the emulation prevention bytes inserted when the RBSP is
encapsulated in a NAL unit), followed by forbidden_zero_bit, nal_ref_idc
and nal_unit_type.  Therefore any occurrence of 0x000001 (which is not
already in a PES header!) is either the start of a NAL or the start of a
new PES packet, but how do you tell which one?  Surprisingly, you can
distinguish the two because all valid stream_id values start with a 1
bit, whereas a NAL will always have a 0 bit (forbidden_zero_bit)
immediately
after the prefix.  That is somewhat tenuous, and I am not sure if any
decoders actually do that, or whether the writers of the standard had it
in mind, but yes, it does seem to be possible.  Does that mean that
unbounded PES packets should always be used to carry H.264?
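
Spelled out as code, the test on the bit after the prefix would look
something like the sketch below.  The function names are invented for
illustration, I have not checked this against any real demuxer, and the
caller has to begin scanning after the current PES header, as noted above.
In a program stream the "1 bit" case also covers pack and system header
start codes (0xBA, 0xBB), so the sketch just labels it "systems".

```python
START_CODE = b"\x00\x00\x01"


def classify_start_code(data, pos):
    """Classify the 0x000001 prefix at data[pos] by the byte that follows it."""
    if data[pos:pos + 3] != START_CODE or pos + 3 >= len(data):
        raise ValueError("no complete start code at this position")
    following = data[pos + 3]
    if following & 0x80:
        # Top bit set: a stream_id or other systems-layer start code.
        return ("systems", following)
    # Top bit clear: forbidden_zero_bit, so this is a NAL; report its type.
    return ("nal", following & 0x1F)


def find_next_systems_start(data, start):
    """Return the offset of the next non-NAL start code, or -1 if none."""
    pos = data.find(START_CODE, start)
    while pos != -1 and pos + 3 < len(data):
        if data[pos + 3] & 0x80:
            return pos
        pos = data.find(START_CODE, pos + 3)
    return -1
```

The offset returned by find_next_systems_start() is where an unbounded PES
packet's payload would end.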

Regards,
--Alex Izvorski

-- 
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html