[RFC] New architecture for a video server
massiot at via.ecp.fr
Fri Aug 2 00:54:54 CEST 2002
The needs of streaming technologies are evolving very fast, and we've
seen many requests that we're unable to fulfill with the current VLC
and VLS architectures. The range of wanted functionalities includes
MPEG-2 to MPEG-4 transcoding, MPEG-4 streaming, VOD, RTP and RTSP. The
purpose of this mail is to propose a move towards greater
modularity, allowing us to support more input and output
VLS streaming architecture is that of a streamer, not of an encoder.
That is, it only deals with the MPEG system layer, without touching
the PES's. This architecture, though quite adequate for broadcasting
MPEG-2 TS, proves inadequate for the recent protocols we want to
support.
Indeed, many of the features requested (transcoding, MPEG-4 FlexMux)
compel us to go down to the Elementary Stream level, and to do
per-ES processing. In addition, working on the ES's enables us to fix
broken streams on-the-fly (such as the common MPEG-1 sequence header
problem), and even to directly stream ES files. Consequently, I want
to push for a new server solution, based on a different architecture,
closer to an encoder's.
The basic idea is to demultiplex, process each ES separately,
remultiplex, and send.
+------+    +-------+    +---------------+    +-----+    +--------+
| read | -> | demux | -> | ES processing | -> | mux | -> | output |
+------+    +-------+    +---------------+    +-----+    +--------+
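
To make the flow concrete, here is a minimal Python sketch of the demux / per-ES processing / remux chain. It is only an illustration: the function names, the (es_id, payload) tuples and the round-robin interleaving are assumptions of mine, not existing VLC or VLS code.

```python
from collections import defaultdict


def demux(packets):
    """Group interleaved (es_id, payload) pairs into one byte string per ES."""
    streams = defaultdict(bytearray)
    for es_id, payload in packets:
        streams[es_id] += payload
    return {es_id: bytes(data) for es_id, data in streams.items()}


def process_es(es_id, data):
    """Per-ES step. A real server would packetize, repair or transcode here;
    this sketch returns the data unchanged."""
    return data


def mux(streams, chunk=4):
    """Interleave the processed streams again, round-robin, in fixed-size chunks
    (hypothetical policy, chosen only to keep the example short)."""
    out = []
    offsets = {es_id: 0 for es_id in streams}
    while any(offsets[es_id] < len(streams[es_id]) for es_id in streams):
        for es_id in sorted(streams):
            start = offsets[es_id]
            if start < len(streams[es_id]):
                out.append((es_id, streams[es_id][start:start + chunk]))
                offsets[es_id] = start + chunk
    return out


def run_pipeline(packets):
    """read -> demux -> ES processing -> mux; the caller plays the output role."""
    streams = demux(packets)
    processed = {es_id: process_es(es_id, data) for es_id, data in streams.items()}
    return mux(processed)
```

The point of the sketch is that everything between demux() and mux() sees one ES at a time, which is where fixing, packetizing or transcoding would plug in.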
Since there is a lot of work to do, we should reuse code from
existing projects, as much as possible. Libvlc's input module
understands many more formats than VLS's, so I propose that we plug
the new server after the input thread of VLC, and take advantage of
the recent --codec switch (a multiplexer then being a special codec
type). In addition, having it inside libvlc allows us to use it from
other applications, in particular I'm thinking of an Apache module.
The multiplexing and output part can probably be based on VLS's
This approach has one drawback : when reading a TS file, the TS
stream that is sent only vaguely resembles the packets given. That
is, everything is demultiplexed and remultiplexed, so all existing
SIs are lost. For such streams, VLS or VLMS still have an advantage.
Similarly, VLC is currently unable to decode several programs within
the same input. That is, we couldn't stream more than one program from
a satellite input. Again, VLS will still be useful for that kind of
situation, until VLC's input is extended to support decoding
several programs.
The following chapters describe how the new server architecture
would be integrated into libvlc. Note that this document doesn't give
any hook for "scheduling" streams, that is, starting and stopping
streams at some precise time. IMHO, it is the job of an external
program to start and stop servers, and is thus out of the scope of
this document.
The general idea is to hijack the decoders of VLC with '--codec
packetizer,none'. This will prevent the spawning of video & audio
output, and of video & audio decoders, and launch our own threads,
which will take the ES and packetize them in ES packets.
For each Elementary Stream :
+------------------+     +------------+     +------------------+
| [type] bitstream | --> | ES parsing | --> | ES Packetization |
+------------------+     +------------+     | & CR calculation |
                                            +------------------+
     input thread      |             packetizer thread
The packetizer thread is specific to the type of ES we're dealing
with. We will need a packetizer "decoder" (read : plug-in) for :
- MPEG-1/2 video
- MPEG-1/2 audio
- MPEG-4 video & audio (FlexMux)
We can also have special packetizer plug-ins which would act as
transcoders (MPEG-2 to MPEG-4 for instance), or even encoders (raw
YUV -> MPEG).
1.2 Multiplexor and output
The user will also pass '--sout udp/ts:@220.127.116.11:1234' to
configure a stream output instance, which will run in one of the
packetizer threads (à la aout3). The instance will load an access
plug-in (for physical access to the file descriptor), and a TS
multiplexor (in the future we may choose to support other streaming
This last plug-in will be responsible for merging the "packetized"
elementary streams (it's not PES yet ; the PES header hasn't been
written) coming from the packetizer "decoders", and add PCR and SI.
                 PCR SI (libdvbpsi)
                  |  |
+----------+     +-----+
| Video ES | --> |     |
+----------+     |     |     +-------------------+     +----------+
                 | TS  | --> | Raw packetization | --> | Physical |
+----------+     | MUX |     |  (including RTP)  |     |  layer   |
| Audio ES | --> |     |     +-------------------+     +----------+
+----------+     +-----+
                    ^
          optional stuffing (CBR)

<...------->     <----->     <------------------------------------>
 packetizer       tsmux                  access plug-in

 running in one of the packetizer threads             optional
We will have the following access plug-ins :
- udp (raw packets over UDP)
- file (can be used for PS to TS converter)
A drawback of this architecture : we have a lot of threads. 1 thread
for the input, 1 thread for the interface, 1 thread per ES, and
optionally one thread for the stream output. However, on current
machines VLS and VLMS only take up a small fraction of the CPU power,
so we can afford a little more CPU time.
2. The packetizer plug-in
This plug-in takes Elementary Stream data (bitstream) from the input,
and builds ES packets with it. It scans the bitstream for logical
structures, to construct convenient packet boundaries.
For instance, the MPEG video packetizer plug-in will scan for
PICTURE_HEADER startcodes, and put one picture per packet. The packet
will be tagged with a CR date. This date defines the latest instant
at which the last "raw" packet (read : TS) coming from this PES must
be sent onto the network. That way we ensure that the decoder has enough
time to decode the frame. Please note that this isn't the PTS (though
it is calculated from the PTS), since in MPEG the presentation order
isn't the decoding order (remember ?).
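
As an illustration of the scanning step, here is a minimal Python sketch that cuts an MPEG-1/2 video bitstream at picture start codes (00 00 01 00). The function name is hypothetical, and the sketch simplifies: headers found before the first picture stay with that picture, while a sequence or GOP header appearing later ends up at the tail of the previous packet, which a real packetizer would want to handle.

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"  # MPEG-1/2 video picture_start_code


def split_pictures(bitstream):
    """Return one chunk per picture, cutting at each picture start code.

    Simplification: any data before the first picture (sequence header,
    GOP header) is kept with the first chunk.
    """
    starts = []
    pos = bitstream.find(PICTURE_START_CODE)
    while pos != -1:
        starts.append(pos)
        pos = bitstream.find(PICTURE_START_CODE, pos + len(PICTURE_START_CODE))
    if not starts:
        return []  # no complete picture seen yet
    starts[0] = 0  # keep leading headers with the first picture
    bounds = starts + [len(bitstream)]
    return [bitstream[a:b] for a, b in zip(bounds, bounds[1:])]
```

Each returned chunk would then be tagged with its CR date before being pushed to the multiplexer's FIFO.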
It is also in this step that the sequence header will be repeated, if
3. The global TS multiplexer
This plug-in doesn't need to run in its own thread, and will
periodically be run in a packetizer thread (when the FIFOs are full
enough). It takes ES packets from the incoming FIFOs, constructs the
PES headers, splits each PES into TS packets, and assigns them an
emission date, which will be less than or equal to the date tagged in
the ES packet by the packetizer.
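
The splitting step can be sketched as follows in Python. This is an illustration under assumptions, not the planned implementation: it takes an already built PES as bytes, the function names are hypothetical, and the way emission dates are spread (a fixed interval, ending exactly on the deadline) is just one policy that satisfies the "less than or equal" rule.

```python
TS_PAYLOAD = 184  # 188-byte TS packet minus the 4-byte header


def ts_packets(pes, pid, first_cc=0):
    """Split one PES into 188-byte TS packets for the given PID."""
    packets = []
    cc = first_cc
    for offset in range(0, len(pes), TS_PAYLOAD):
        chunk = pes[offset:offset + TS_PAYLOAD]
        pusi = 0x40 if offset == 0 else 0x00  # payload_unit_start_indicator
        stuffing = TS_PAYLOAD - len(chunk)
        # adaptation_field_control: 01 = payload only, 11 = adaptation + payload
        afc = 0x30 if stuffing else 0x10
        header = bytes([
            0x47,                          # sync byte
            pusi | ((pid >> 8) & 0x1F),    # PUSI + upper 5 bits of the PID
            pid & 0xFF,                    # lower 8 bits of the PID
            afc | (cc & 0x0F),             # continuity counter
        ])
        if stuffing == 0:
            adaptation = b""
        elif stuffing == 1:
            adaptation = b"\x00"  # adaptation_field_length = 0, nothing else
        else:
            # length byte, empty flags byte, then 0xFF stuffing bytes
            adaptation = bytes([stuffing - 1, 0x00]) + b"\xff" * (stuffing - 2)
        packets.append(header + adaptation + chunk)
        cc = (cc + 1) & 0x0F
    return packets


def emission_dates(count, deadline, interval):
    """Hypothetical policy: space packets by `interval`, the last one on `deadline`."""
    return [deadline - (count - 1 - i) * interval for i in range(count)]
```

For a 200-byte PES this yields two packets, the second one padded through its adaptation field so that it is still exactly 188 bytes long.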
It will periodically insert PAT/PMT packets (coming from libdvbpsi,
constituted with info gathered by the packetizer from the input), and
PCR packets (derived from the system clock and the emission dates).
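
For reference, the PCR field itself is six bytes in MPEG-2 TS: a 33-bit base counting at 90 kHz, six reserved bits set to 1, and a 9-bit extension counting the remaining 27 MHz ticks (0 to 299). The Python sketch below encodes and decodes that field; the helper names are hypothetical, and the conversion assumes dates expressed in microseconds.

```python
def date_to_pcr_ticks(date_us):
    """Convert a date in microseconds (assumption) to 27 MHz ticks."""
    return date_us * 27


def encode_pcr(ticks_27mhz):
    """Pack a 27 MHz tick count into the 6-byte PCR field."""
    base = (ticks_27mhz // 300) % (1 << 33)  # 90 kHz part, 33 bits
    ext = ticks_27mhz % 300                  # 27 MHz remainder, 9 bits
    value = (base << 15) | (0x3F << 9) | ext # 6 reserved bits set to 1
    return value.to_bytes(6, "big")


def decode_pcr(field):
    """Recover the 27 MHz tick count from a 6-byte PCR field."""
    value = int.from_bytes(field, "big")
    return (value >> 15) * 300 + (value & 0x1FF)
```

The round trip only holds while the base fits in 33 bits, that is for a bit more than 26 hours of stream before the counter wraps.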
Finally, if the user requested the stream to be constant bitrate
(CBR), the TS multiplexor will add TS packets for stuffing. The
resulting TS packets are placed in a FIFO, in the right order, with
4. The access plug-in
This plug-in runs in the same thread as the TS multiplexor, and takes
packets from the latter FIFO to write to the physical medium. In case
of a file output, the packets are just written one by one, and the
emission dates are not taken into account.
In case of a raw UDP output, the access plug-in creates buffers of
1316 bytes (seven 188-byte TS packets) and fills them with TS
packets. Each buffer is then queued
until the emission date of its first TS packet. At that time, the
stream output thread (running at a real-time priority) will pick it
up and write it to the network device. This thread is optional and is
very similar to the audio output thread.
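
The buffering rule can be sketched in a few lines of Python. 1316 bytes is 7 x 188, the largest whole number of TS packets that fits in a 1500-byte Ethernet MTU once the IP and UDP headers are counted. The function name and the (date, packet) tuples are assumptions for the example.

```python
TS_SIZE = 188
TS_PER_DATAGRAM = 7  # 7 * 188 = 1316 bytes


def build_datagrams(dated_packets):
    """Group (emission_date, ts_packet) pairs, given in sending order,
    into buffers of up to 1316 bytes. Each buffer inherits the emission
    date of its first TS packet. The last buffer may be shorter."""
    datagrams = []
    for i in range(0, len(dated_packets), TS_PER_DATAGRAM):
        group = dated_packets[i:i + TS_PER_DATAGRAM]
        send_date = group[0][0]
        payload = b"".join(packet for _, packet in group)
        datagrams.append((send_date, payload))
    return datagrams
```

The stream output thread would then only have to sleep until each send_date and write the payload to the socket.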
RTP output will add to this scheme an extra header before the 1316
bytes, with the emission date of the first TS packet. It will be
handled by the access plug-in layer.
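
As a rough idea of what that extra header looks like, here is a Python sketch of the fixed 12-byte RTP header for MPEG-2 TS payloads (static payload type 33, 90 kHz timestamp clock). The function names are hypothetical, and the conversion assumes emission dates in microseconds; no CSRC list, padding or header extension is produced.

```python
import struct

RTP_VERSION_BYTE = 0x80  # version 2, no padding, no extension, no CSRC
RTP_PT_MP2T = 33         # static payload type for MPEG-2 transport streams


def to_rtp_timestamp(date_us):
    """Convert a date in microseconds (assumption) to a 32-bit 90 kHz timestamp."""
    return (date_us * 9 // 100) & 0xFFFFFFFF


def rtp_header(sequence, timestamp_90khz, ssrc):
    """Build the 12-byte RTP header placed before the 1316 bytes of TS data."""
    return struct.pack(
        "!BBHII",
        RTP_VERSION_BYTE,
        RTP_PT_MP2T,                   # marker bit left at 0
        sequence & 0xFFFF,
        timestamp_90khz & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
```

A datagram is then rtp_header(...) followed by the same 1316-byte buffer used for raw UDP, with the sequence number incremented by one per datagram.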
Comments are welcome. From the mails we get on the mailing list, I
feel that such features are pretty urgent. Therefore, I will start
working on this as soon as the aout3 is in the CVS (which is only a
few days away).