[vlc-devel] [PATCH] [RFC] decoder: add more decoder device types
Steve Lhomme
robux4 at ycbcr.xyz
Mon Jun 24 16:10:14 CEST 2019
On 2019-06-24 11:37, Alexandre Janniaux wrote:
> Hi,
>
> For Vulkan, here are the related resources for memory import:
> + https://github.com/KhronosGroup/Vulkan-Docs/blob/master/appendices/VK_KHR_external_memory.txt
> + https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap10.html
>
> The second link contains the whole API needed to import or export
> handles. It depends on the driver but it can have support for D3D11
> texture, D3D12 resources or heap, DXGI resources, opaque, etc.
> The whole list is there:
> + https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap38.html#VkExternalMemoryHandleTypeFlagBits
>
> You can also give security descriptors, but I don't know if we can
> really trust them for everything like we could trust linux sealing
> mechanism.
>
>
> About the PBO, they just provide a different rendering context from
> the normal pipeline, thus eliminating some automatic dependencies
> and allowing uploads to be parallelized. You can have more than one if you
> want to upload more than one texture but I don't think it's useful
> except when having multiple renderer displays.
>
> As a side note, our current use of multiple renderer displays, except
> multiple video tracks which is quite rare, is the splitter, for which
> you currently need to upload the same texture four times. If I'm not
> wrong, having the decoder device like you did there would allow
> uploading the picture in a filter before the vout display prepare
> for all rendering APIs?
The display modules can already force their i_chroma to an opaque one.
The core will add a filter and will use the context found in the display
pool picture. In push it will work a bit differently but it will be
possible as well.
The splitter is seen as a single display module from the core so it
could do that. But that will be copied into 1 texture, not several. Push
will not change that.
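
To illustrate the splitter case: since the core sees it as one display,
the conversion to the opaque chroma happens once and the sub-displays
share the result. A minimal sketch in C, where every name (gpu_texture,
upload_once, splitter_prepare) is hypothetical and not VLC API:

```c
#include <assert.h>

/* Hypothetical stand-in for a GPU texture; not a VLC type. */
struct gpu_texture {
    int upload_count; /* how many CPU->GPU copies were performed */
    int refs;         /* how many outputs currently use the texture */
};

/* Pretend to copy a CPU picture into the texture (one upload). */
static void upload_once(struct gpu_texture *tex)
{
    tex->upload_count++;
}

/* The splitter uploads once, then hands the same texture to each output. */
static void splitter_prepare(struct gpu_texture *tex, int outputs)
{
    assert(outputs > 0);
    upload_once(tex);
    for (int i = 0; i < outputs; i++)
        tex->refs++; /* each sub-display only takes a reference */
}
```

Calling splitter_prepare() on a zeroed texture with 4 outputs leaves one
upload and four references, instead of four uploads.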
> There are edge cases with non-homogeneous vouts in the splitter but it would
> be awesome for this and filter composition later!
>
> I like the chroma approach to have transformations like
> CODEC_RGBA -> CODEC_GL_RGBA but I don't know if it fits really well
> as we'll probably need to have more than 4 letters to correctly
> express the components without messing everything up. Vaapi doesn't
I already use VLC_CODEC_D3D11_OPAQUE_RGBA and
VLC_CODEC_D3D11_OPAQUE_BGRA for things like that.
> have the pixel format when transferring so it would be another
> solution, but it's cheating because it only outputs one format.
> We quickly discussed this during workshop and I think we refused
> it or that it was only a {cpu/gpu} choice, but I still have the
> feeling that adding a new public field to split memory layout and
> memory location (chroma and API type) would be better, even if it
> opens the door to bugs because of mischecks or partial checks.
Yes, at some point it would be nice to have something more generic.
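
A rough sketch of what such a split could look like, with the memory
layout (chroma) and the memory location (API type) as two independent
fields. All names here (mem_location, split_format, SKETCH_FOURCC,
upload_to, format_equal) are made up for illustration and do not exist
in VLC:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: where the pixels live (the "API type"). Not VLC API. */
enum mem_location {
    MEM_CPU,
    MEM_OPENGL,
    MEM_D3D11,
    MEM_VAAPI,
};

/* Hypothetical: a format described by two independent fields. */
struct split_format {
    uint32_t layout;            /* fourcc describing the components, e.g. RGBA */
    enum mem_location location; /* which API owns the memory */
};

/* Local helper: pack four characters into a 32-bit fourcc. */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* An upload keeps the layout and only changes the location. */
static struct split_format upload_to(struct split_format src,
                                     enum mem_location dst)
{
    assert(dst <= MEM_VAAPI); /* reject values outside the enum */
    struct split_format out = src;
    out.location = dst;
    return out;
}

/* Both fields must match; comparing only one of them is the kind of
 * partial check mentioned above. */
static bool format_equal(struct split_format a, struct split_format b)
{
    return a.layout == b.layout && a.location == b.location;
}
```

With this, CODEC_RGBA -> CODEC_GL_RGBA would become an upload_to() from
MEM_CPU to MEM_OPENGL with the same RGBA layout, and no extra fourcc
would be needed per API.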
> --
> Alexandre Janniaux
> VideoLabs
>
> On Mon, Jun 24, 2019 at 10:37:31AM +0200, Steve Lhomme wrote:
>> On 2019-06-24 10:32, Thomas Guillem wrote:
>>>
>>>
>>> On Mon, Jun 24, 2019, at 10:26, Steve Lhomme wrote:
>>>> On 2019-06-24 10:00, Thomas Guillem wrote:
>>>>>
>>>>> On Mon, Jun 24, 2019, at 07:32, Steve Lhomme wrote:
>>>>>> These are types needed to move the decoder pools that exist in various display
>>>>>> modules closer to the decoder.
>>>>>>
>>>>>> That means we need to make the decision early on which flavor of OpenGL will be
>>>>>> used before the OpenGL display module is created. Depending on the platform
>>>>>> used it could be VAAPI, VDPAU, DXVA2, OpenGL PBO (persistent and
>>>>>> non-persistent).
>>>>>>
>>>>>> And to do that we need to create the "decoder device" using the es_format_t
>>>>>> that will be decoded (so we can pick a hardware decoder or a software one),
>>>>>> which contains the codec, profile and dimensions. When any of these change we
>>>>>> will need to create a new decoder device (or check the one we have is
>>>>>> compatible).
>>>>>> ---
>>>>>> include/vlc_codec.h | 9 +++++++++
>>>>>> 1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/include/vlc_codec.h b/include/vlc_codec.h
>>>>>> index 71f3e8439f..ec5d7e32af 100644
>>>>>> --- a/include/vlc_codec.h
>>>>>> +++ b/include/vlc_codec.h
>>>>>> @@ -491,6 +491,9 @@ enum vlc_decoder_device_type
>>>>>> VLC_DECODER_DEVICE_DXVA2,
>>>>>> VLC_DECODER_DEVICE_D3D11VA,
>>>>>> VLC_DECODER_DEVICE_AWINDOW,
>>>>>> + VLC_DECODER_DEVICE_VULKAN,
>>>>>> + VLC_DECODER_DEVICE_PBO,
>>>>>
>>>>> For software rendering, VK and GL can be merged. Both will need memory allocated with vlc_memfd(). That way, we could pass the fd to the VK/GL API.
>>>>
>>>> Are you sure this can work when using Vulkan on Windows?
>>>
>>> I guess but I never tested it:
>>>
>>> VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR
>>> VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR
>>>
>>>>> Decoders don't need to know about PBO. Indeed, PBO buffers (in non-persistent mode) are just used as a buffer to upload a picture without locking the whole GL context with glTexSubImage2D().
>>>>
>>>> OK, so only 1 is necessary. This mode should already be updated to use
>>>> the "outside" CPU pool and upload the picture during Prepare.
>>>
>>> I don't see why one is necessary. Just pass basic CPU pictures to OpenGL and Vulkan for now.
>>> Then, GL could upload it using a PBO intermediate buffer. And Vulkan could pass the FD directly (and fall back to memcpy if there are no FDs).
>>>
>>>>
>>>>> And for persistent mode, I think we can drop it. Yes, from VLC's point of view, there is one less memcpy, but I guess it is still done on the GL/driver side since persistent buffers are mapped on the CPU. Mapping buffers in the GPU is worse by the way, since it is very slow to access them (for reference frames for example).
>>>>
>>>> IMO in persistent mode the driver uses a backend copy. The visible one is
>>>> always the CPU one, but when it's needed in the GPU for rendering, it's
>>>> transferred there. At least on integrated GPUs it's the same memory so
>>>> it's not an issue, the driver doesn't need to do any tricks and we get
>>>> good perf.
>>>>
>>>> I don't feel comfortable dropping a zero copy mode unless we know it's
>>>> actually slower (no driver tricks).
>>>
>>> That is what we decided with Niklas, Rémi and Alexandre during the last vout workshop.
>>
>> OK. These changes need to be done before the code is moved for push then.
>> _______________________________________________
>> vlc-devel mailing list
>> To unsubscribe or modify your subscription options:
>> https://mailman.videolan.org/listinfo/vlc-devel