[vlc-devel] [RFC 0/1] Let decoders decide over DPB size.

Fri Aug 23 18:28:13 CEST 2013

	Hello,

On Friday 23 August 2013 17:38:45, Julian Scheel wrote:
> while working on the direct rendering code in the omxil module I ran into
> the following problem: for direct rendering, the video output core
> requires a given number of frames in the picture pool, which is calculated
> as:
> 
> private_picture = 4;
> decoder_picture = 1 + sys->dpb_size;
> kept_picture = 1;
> reserved_picture = DISPLAY_PICTURE_COUNT + private_picture + kept_picture;
> total_size = reserved_picture + decoder_picture;
> 
> This is taken from src/video_output/vout_wrapper.c and modified slightly for
> readability. Now this seems to make a lot of sense when the running decoder
> actually uses the picture pool to store its DPB. In fact,
> this seems to be the case for one decoder only, libmpeg2, as of
> now.

Eh? I think any decoder that supports direct rendering uses the picture pool 
for DPB. In fact, I think it is unavoidable.

> All other decoders handle the required DPB internally, hidden
> from VLC.

Wouldn't that imply that the decoders must copy reference pictures into VLC 
buffers?

> So in the end the pictures for dpb will be allocated in the pool
> and a second time in the decoder itself.

That should only happen if the decoder is too stupid to decode into custom 
(VLC video output) buffers.

> While this is probably not much of
> a problem on high end systems it is a problem on embedded systems.

> Taking into account that the dpb_size for H264 video is set to 18 in
> src/input/decoder.c you will have to provide a picture pool with at least 24
> frames to allow direct rendering.

That is how it is supposed to work. The decoder requires 18 pictures (in that 
case) and VLC requires a few extra for buffering in filters and output.
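
For illustration, the arithmetic quoted above can be sketched as a small C 
function. This is only a sketch, not the actual code in 
src/video_output/vout_wrapper.c: the name pool_size() is hypothetical, and 
the value of DISPLAY_PICTURE_COUNT is not given in this thread, so it is 
passed in as a parameter.

```c
/* Hypothetical helper mirroring the formula quoted in this thread.
 * display_pictures stands in for DISPLAY_PICTURE_COUNT, whose value
 * is not stated here (assumption: the caller supplies it). */
static unsigned pool_size(unsigned display_pictures, unsigned dpb_size)
{
    const unsigned private_picture = 4;             /* from the quoted formula */
    const unsigned kept_picture    = 1;             /* from the quoted formula */
    const unsigned decoder_picture = 1 + dpb_size;  /* DPB plus one picture */
    const unsigned reserved_picture =
        display_pictures + private_picture + kept_picture;

    return reserved_picture + decoder_picture;
}
```

With dpb_size = 18, pool_size(0, 18) gives 24, so the "at least 24 frames" 
figure is the count before any display pictures are added.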

> Doing this on an embedded system with little memory available
> is likely to fail for high resolution videos.

I disagree. If you cannot allocate as many and as large buffers as the decoder 
requires, then fundamentally, you cannot decode the video. Period.
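
To put a rough number on it, here is a sketch of the memory such a pool 
needs. It assumes 8-bit 4:2:0 pictures (12 bits per pixel) with no padding 
or alignment, which real decoders and GPUs usually add, so treat the result 
as a lower bound. The name pool_bytes() is hypothetical.

```c
#include <stdint.h>

/* Hypothetical lower-bound estimate of the pool footprint in bytes.
 * Assumption: planar 8-bit 4:2:0, i.e. one full-size luma plane plus
 * two quarter-size chroma planes, without stride or height padding. */
static uint64_t pool_bytes(uint32_t width, uint32_t height, uint32_t count)
{
    const uint64_t luma   = (uint64_t)width * height;
    const uint64_t chroma = luma / 2;   /* two planes of luma/4 each */

    return (luma + chroma) * count;
}
```

Under those assumptions, pool_bytes(1920, 1080, 24) is 74649600 bytes, that 
is about 71 MiB for 24 pictures at 1080p.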

If you want to allocate the memory from the GPU, you can do that in the video 
output. Since the VDPAU patchset, you can also do that from the decoder, on 
condition that the memory is not mapped on the CPU.

> While working with the omx modules I ran into this problem on Tegra 2 as
> well as Raspberry Pi platforms, because neither had enough memory to
> store 24 or more full 1080p frames in the GPU memory. But as they do not
> require the DPB to be stored in the picture pool, and deal with it
> internally, it is in fact possible to remove the dpb_size from the picture
> pool and run with a much smaller picture pool without any issues.

I believe that is a problem within the OMX decoder. It would seem to perform 
indirect rendering. This is slow, and indeed wasteful of memory space, 
especially on low-end systems.

> So to address this issue I propose the attached patch, which shifts the
> responsibility for announcing the required dpb_size to the decoder modules. I
> have not yet tested all decoders with this patch applied, but all I tested
> (libmpeg2, avcodec for mpeg2 and h264, omxil) seemed fine.
> 
> Does anyone else see issues with this approach?

I don't mind specifying the DPB count in the decoders, as it would be 
architecturally cleaner than doing so in the core. But I dislike the notion 
that it would be used to promote indirect rendering over direct rendering.

Regardless, your patch seems incomplete / under-implemented.

-- 
Rémi Denis-Courmont
http://www.remlab.net/