[vlc-devel] Software decoding in Hardware buffers
jc at kynesim.co.uk
Thu Aug 8 16:10:40 CEST 2019
>On 2019-08-08 14:55, John Cox wrote:
>>> I'm looking at the display pool in the MMAL (Raspberry Pi) code and it
>>> seems that we currently decode in "hardware" buffers all the time.
>>> Either the opaque decoder output, or when the decoder outputs I420.
>>> In push we don't use this pool anymore. The decoder will have its own
>>> pool and the display just deals with what it receives. In most cases
>>> that means copy from CPU memory to GPU memory. This doesn't work with a
>>> SoC like on the Raspberry Pi where the memory is the same and can be
>>> used directly from both sides.
>>> The idea was that current decoders continue to use decoder_NewPicture()
>>> as they used to. The pictures will come from the decoder video context,
>>> if there's one (hardware decoding) or from picture_NewFromFormat() if
>>> there's none. That means for MMAL we would need to copy this CPU
>>> allocated memory to the "port allocated" memory (the mechanism to get
>>> buffers from the display). Given the limited resources that's something
>>> we should avoid.
>>> I think we should have a third way to provide pictures: from the decoder
>>> device. In case of software decoding there is no video context, but
>>> there is a decoder device (A MMAL one in this case).
>>> So I suggest the decoders (and filters) get their output picture from:
>>> - the video context if there is one
>>> - the decoder device if there is one and it has an allocator
>>> - picture_NewFromFormat() otherwise
>>> Any opinion ?
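The three-step lookup proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: every type and function name here (`sketch_decoder`, `sketch_get_picture`, the `alloc` callbacks) is a hypothetical stand-in, not VLC's actual API, and `sketch_picture_from_format()` merely stands in for picture_NewFromFormat().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not real VLC types. */
typedef enum { PIC_FROM_VCTX, PIC_FROM_DEVICE, PIC_FROM_FORMAT } sketch_source;

typedef struct { sketch_source source; } sketch_picture;

/* Video context: only present when hardware decoding is in use. */
typedef struct sketch_vctx {
    sketch_picture *(*alloc)(struct sketch_vctx *);
} sketch_vctx;

/* Decoder device: may exist without a video context (software decoding).
 * Its allocator is optional, so alloc may be NULL. */
typedef struct sketch_device {
    sketch_picture *(*alloc)(struct sketch_device *);
} sketch_device;

typedef struct {
    sketch_vctx   *vctx; /* NULL when there is no video context */
    sketch_device *dev;  /* NULL when there is no decoder device */
} sketch_decoder;

static sketch_picture *sketch_new(sketch_source s)
{
    sketch_picture *p = malloc(sizeof(*p));
    if (p != NULL)
        p->source = s;
    return p;
}

static sketch_picture *sketch_vctx_alloc(sketch_vctx *v)
{
    (void)v;
    return sketch_new(PIC_FROM_VCTX);
}

static sketch_picture *sketch_device_alloc(sketch_device *d)
{
    (void)d;
    return sketch_new(PIC_FROM_DEVICE);
}

/* Stand-in for the plain CPU-memory path (picture_NewFromFormat()). */
static sketch_picture *sketch_picture_from_format(void)
{
    return sketch_new(PIC_FROM_FORMAT);
}

/* Try each source in the proposed order and fall through to the next. */
static sketch_picture *sketch_get_picture(sketch_decoder *dec)
{
    if (dec->vctx != NULL)
        return dec->vctx->alloc(dec->vctx);
    if (dec->dev != NULL && dec->dev->alloc != NULL)
        return dec->dev->alloc(dec->dev);
    return sketch_picture_from_format();
}
```

With a decoder that has a device with an allocator but no video context (the MMAL software-decoding case), `sketch_get_picture()` takes the second branch, so the software decoder would write straight into device-provided memory.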
>> I'm sure I should have an opinion (though I'm not quite sure what it is)
>> as I've got a substantial rewrite of the entire mmal/Pi modules here,
>> which I intended to upstream when it was a bit closer to finished (it is
>> currently in use as the default shipped with Pis but there is still work
>> to be done before I want it set in anything like stone).
>Oh ! Then ours is never used unless people build it from scratch
>themselves ? If it's a complete rewrite I guess we can ditch the old code ?
Yes, you can ditch the old code. My code replaces it. It is based, in
part, on the old code but it is substantially new (by now).
>In any case since that's the one currently used by most people it would
>be good to merge even if not perfect. The one we have in 4.0 won't even
>compile as it is given all the changes in 4.0.
>> There are a number of buffer types that I now pass around - all
>> currently declared as h/w though some have a plausible existence in CPU
>> memory (though not all). All end up as having their actual allocation
>> done at source (decoder or filter) though the picture_t is allocated by
>> the "display"
>That's how it's supposed to work in 3.0. In 4.0 things will be radically
>different. MMAL is the last part I'm looking at for this redesign. It's
>a good example of SoC use of VLC, compared to the other, more
>desktop-with-GPU oriented display/decoders.
>Since you seem to know a bit more about MMAL than me, is it possible to
>allocate memory in heap and then wrap it inside a MMAL_BUFFER_HEADER_T ?
>It doesn't look like it from what I see in mmal_buffer.h. If not we
>have the problem described in my original post (to avoid a copy we need
>software decoders to be able to use these hardware buffers directly).
As with many things MMAL the answer is both yes & no. The easy way of
writing the code is to let MMAL deal with the allocs - but if you like
pain you can do it all yourself (I appear to like pain a lot).
You can have ARM-side allocated buffers, but if you do then they will be
copied into contiguous "GPU-side" buffers by MMAL before use by the
hardware so you don't save anything. Also note that I420 buffers
need to be in a single contiguous chunk rather than 3 separate
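To illustrate the single-chunk I420 layout, here is a minimal sketch that places the Y, U and V planes at offsets inside one allocation. The `i420_frame` struct and its helpers are hypothetical names made up for this example. It assumes no stride or height padding (real hardware usually wants aligned pitches) and uses plain malloc(), so it shows the layout only, not memory that the GPU can actually share.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor for one I420 frame held in a single allocation. */
typedef struct {
    uint8_t *base;      /* start of the one contiguous chunk */
    size_t   size;      /* total bytes in the chunk */
    size_t   offset[3]; /* byte offset of the Y, U, V planes from base */
    size_t   pitch[3];  /* bytes per line of each plane */
    size_t   lines[3];  /* number of lines in each plane */
} i420_frame;

/* Compute plane geometry for an unpadded I420 frame; returns total size.
 * Chroma planes are subsampled by 2 in both directions (rounded up). */
static size_t i420_layout(i420_frame *f, size_t width, size_t height)
{
    assert(width > 0 && height > 0);

    size_t cw = (width + 1) / 2;
    size_t ch = (height + 1) / 2;

    f->pitch[0] = width;
    f->lines[0] = height;
    f->pitch[1] = f->pitch[2] = cw;
    f->lines[1] = f->lines[2] = ch;

    f->offset[0] = 0;                          /* Y first */
    f->offset[1] = width * height;             /* U directly after Y */
    f->offset[2] = f->offset[1] + cw * ch;     /* V directly after U */
    f->size      = f->offset[2] + cw * ch;
    return f->size;
}

/* One allocation for all three planes; returns 0 on success, -1 on failure. */
static int i420_alloc(i420_frame *f, size_t width, size_t height)
{
    size_t size = i420_layout(f, width, height);
    f->base = malloc(size);
    return f->base != NULL ? 0 : -1;
}

/* Pointer to plane 0 (Y), 1 (U) or 2 (V) inside the single chunk. */
static uint8_t *i420_plane(const i420_frame *f, int plane)
{
    assert(plane >= 0 && plane < 3);
    return f->base + f->offset[plane];
}
```

For a 640x480 frame this gives a 307200-byte Y plane followed by two 76800-byte chroma planes, 460800 bytes in total, all reachable from the one `base` pointer.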
It is possible to allocate buffers that can be used by the ARM & the GPU
without copy and I'm working on making that work with VLC's avcodec right
now (needed for Pi4 H265 decode), though at this precise point in time I
have issues with getting an ARM-cachable lump of memory, which somewhat
impacts decode speed :-(
For my info (I haven't looked at 4.x yet) how do you deal with:
decoder -> filter -> filter -> display in the new world? In particular
what happens if the filter wants a "h/w" buffer as input (MMAL has
deinterlace and rescale filters that want this)?