[vlc-devel] Software decoding in Hardware buffers

Mon Aug 12 13:32:13 CEST 2019

Hi,

Your proposal does break push completely. It literally requires all software decoders and filters to pull buffers from the device. The whole point of push is intrinsically that the upstream decides how it allocates its buffers: you cannot require decoders or filters to allocate in any one specific way.
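
As a rough illustration of that point, here is a minimal sketch in C. All names (sketch_picture, sketch_display, sketch_decode, and so on) are hypothetical and are not VLC's actual API; the heap allocation simply stands in for "whatever the upstream chooses". The decoder allocates on its own terms and pushes the result, and the display copies into its own memory when it cannot use the buffer directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; none of these are VLC's real API. */
typedef struct {
    unsigned char *data;
    size_t size;
} sketch_picture;

typedef struct {
    unsigned char *device_buf; /* stands in for display-owned memory */
    size_t device_size;
    unsigned copies;           /* counts copies made by the display */
} sketch_display;

/* Upstream: the decoder allocates however it likes (plain heap here). */
static sketch_picture *sketch_decode(size_t size)
{
    sketch_picture *pic = malloc(sizeof(*pic));
    if (pic == NULL)
        return NULL;
    pic->data = malloc(size);
    if (pic->data == NULL) {
        free(pic);
        return NULL;
    }
    memset(pic->data, 0x80, size); /* pretend this is decoded video */
    pic->size = size;
    return pic;
}

/* Downstream: the display accepts whatever it is given and copies
 * into its own memory when it cannot use the buffer directly. */
static int sketch_display_push(sketch_display *d, const sketch_picture *pic)
{
    if (pic->size > d->device_size)
        return -1;
    memcpy(d->device_buf, pic->data, pic->size);
    d->copies++;
    return 0;
}

/* Runs one frame through the pipeline; returns the copy count, or -1. */
int sketch_run_one_frame(void)
{
    unsigned char device_mem[64];
    sketch_display d = { device_mem, sizeof(device_mem), 0 };
    sketch_picture *pic = sketch_decode(16);
    if (pic == NULL)
        return -1;
    int ret = sketch_display_push(&d, pic);
    assert(ret != 0 || device_mem[0] == 0x80);
    free(pic->data);
    free(pic);
    return ret == 0 ? (int)d.copies : -1;
}
```

Note that nothing in sketch_decode knows about the display: the one copy is the price of that independence, which is the trade-off being debated in this thread.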

On 12 August 2019 12:09:36 GMT+03:00, Steve Lhomme <robux4 at ycbcr.xyz> wrote:
>On 2019-08-12 10:15, Alexandre Janniaux wrote:
>> Hi,
>> 
>>> No, it is not my opinion. It is what was agreed collectively. Unlike
>>> your opinion, which engages only you.
>>>
>>> I am very fed up with people misconstruing earlier decisions as my
>>> opinion. You can not have it both ways.
>> 
>> Sorry, even if you are right, I object to being included in this kind
>> of message when what we decided hasn't been summarized anywhere. You
>> can't point out design issues in a non-existent design, so it doesn't
>> seem right to describe someone else's work as "misconstruing" the
>> decisions we took.
>> 
>> I took notes for myself and later tried to summarize them, after the
>> first questions about what we decided were raised on this mailing list,
>> but I don't have enough background to finish them without being biased
>> by my own ideas on the implementation behind the push model. Even the
>> design and naming have evolved since the second vout workshop. Maybe it
>> would help if I published them as a draft, with a scan of the notes, so
>> that they can be completed somewhere publicly?
>
>IMO we should have a formal summary of what was decided during
>workshops. It should go into detail so there's no confusion. Subjects
>that are left to be decided should also be mentioned. It would also give
>people who were not in the workshop an idea of where things are headed
>and what is going to change.
>
>This is not the first time we have disagreed on direction after a
>workshop. Implementing things always leads to some corner cases that we
>may not have seen during the meeting.
>
>> I don't see any issue with trying to include more cases, even if I
>> would prefer having the first layers in place before starting to
>> experiment with the push architecture.
>>
>> Thank you for the constructive arguments on push for the SoC case,
>> though. IMHO they are good arguments to show that it should work on SoC
>> and that delaying full support for some decoders, at the cost of an
>> extra copy, is acceptable.
>> 
>> On Mon, Aug 12, 2019 at 08:57:25AM +0200, Steve Lhomme wrote:
>>> On 2019-08-11 13:45, Rémi Denis-Courmont wrote:
>>>> Hi,
>>>>
>>>> Indeed we did agree that some old and crappy APIs may require memory
>>>> copying if they have no sane way to allocate and reference picture
>>>> buffers, including but not limited to old OpenGL versions.
>> 
>> OpenGL doesn't really handle native resource management: everything
>> related to that, apart from pixel buffers and CPU upload, is done by
>> the layer below (EGL, GLX, ANGLE). It might not be the best example
>> when it comes to pictures. However, I agree that additional copies in
>> edge cases should not prevent us from shifting to a push model.
>
>Nothing is preventing the push model we designed. It's only a matter of
>not losing performance because of it.
>
>>>
>>> Yes we said that for OpenGL there were cases where some copies could
>>> be needed. But there was already a copy from CPU to GPU in that case,
>>> be it in our code or in the driver (we're talking about software
>>> decoding). So it doesn't really matter who does it.
>>>
>>> In the case of MMAL (or a SoC architecture in general) the case wasn't
>>> raised AFAIK. Here we introduce an extra copy that didn't exist before.
>>>
>>> In the case of OpenGL, it is only on rare/old OpenGL implementations
>>> that we handle the copy ourselves. In the case of MMAL that's the
>>> default implementation for everyone.
>>>
>> 
>> Maybe the optimization can be tackled after the design has been set up
>> for the whole architecture. I haven't checked but I believe it could be
>> replaced (now or later) by the more or less standard GBM + v4l2 layer,
>> like on Linux, instead of relying on the MMAL API, which would make the
>> Raspberry Pi push-friendly.
>
>From what I saw the MMAL vout is really a standalone one and not
>related to that. I suppose MMAL decoding should also work in OpenGL but
>that doesn't seem implemented. Maybe it is in John Cox's branch. It may
>also be worth investigating, since if there's PBO then we're back to
>the regular case where we don't need a copy. And that's likely the way
>we want to go forward. If an inferior display module (current MMAL) is
>used then we can live with some drawbacks.
>
>> GBM is the graphics allocation layer used everywhere but NVidia (which
>> has its own EGLStream which works more or less the same, for a change).
>> You get an object which can be turned into a file descriptor, and be
>> imported in either a graphics API like Vulkan or EGL, or a windowing
>> API like X11 or Wayland. It's a very simple API so it's quite
>> future-proof even if it will eventually be replaced again.
>> 
>> I don't know what's available for BSD-like systems or other exotic
>> systems, but I guess the first challenge would be GPU support, even
>> before supporting the push model, for most SBCs available on the
>> market, and as they are not officially supported for now, we might
>> avoid considering them in the design as well.
>> 
>> Greats,
>> --
>> Alexandre Janniaux
>> VideoLabs
>> 
>>>
>>>> On 9 August 2019 16:11:55 GMT+03:00, Steve Lhomme <robux4 at ycbcr.xyz> wrote:
>>>>> Did we agree that MMAL will get extra copies due to our design
>>>>> decisions ?
>>>>>
>>>>> On 2019-08-09 13:55, Rémi Denis-Courmont wrote:
>>>>>> No, it is not my opinion. It is what was agreed collectively. Unlike
>>>>>> your opinion, which engages only you.
>>>>>>
>>>>>> I am very fed up with people misconstruing earlier decisions as my
>>>>>> opinion. You can not have it both ways.
>>>>>>
>>>>>> On 9 August 2019 12:57:53 GMT+03:00, Steve Lhomme <robux4 at ycbcr.xyz> wrote:
>>>>>>> On 2019-08-09 10:16, Rémi Denis-Courmont wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> The MMAL plugins are unmaintained. By definition, if the
>>>>>>>> implementation that actually sees users and updates is another
>>>>>>>> one, then ours is unmaintained.
>>>>>>>>
>>>>>>>> And the point is that I don't want to change the core for a
>>>>>>>> misdesigned plugin. I have not seen any technically valid
>>>>>>>> justification for adding yet another way to allocate pictures, nor
>>>>>>>> how this would work.
>>>>>>>>
>>>>>>>> You cannot expect software decoders and filters to allocate
>>>>>>>> pictures from a decoder device or video context. That's complete
>>>>>>>> denial of everything that was agreed upon, and reintroduces a whole
>>>>>>>> lot of problems that push was supposed to fix.
>>>>>>>
>>>>>>> That's your opinion and I don't share it. We made some design
>>>>>>> choices but until they were implemented we had no idea if all use
>>>>>>> cases were covered. And it turns out not all use cases are covered.
>>>>>>> With MMAL (be it the old module or the new module) the design we
>>>>>>> have is not good enough. We are forcing copies where they didn't
>>>>>>> exist before.
>>>>>>>
>>>>>>> And what I propose is still push design. It's still the decoder
>>>>>>> that creates a video context and pushes it forward. It just may not
>>>>>>> be aware it's using it at all.
>>>>>>>
>>>>>>> And as I found while experimenting with this idea, for D3D11 it
>>>>>>> would make perfect sense to allow the decoder device to be the
>>>>>>> creator of the video context. They are highly related and one
>>>>>>> doesn't really exist without the other.
>>>>>>>
>>>>>>>> On 9 August 2019 08:50:43 GMT+03:00, Steve Lhomme <robux4 at ycbcr.xyz> wrote:
>>>>>>>>> On 2019-08-08 18:27, Rémi Denis-Courmont wrote:
>>>>>>>>>> On Thursday, 8 August 2019, 15:29:30 EEST, Steve Lhomme wrote:
>>>>>>>>>>> Any opinion ?
>>>>>>>>>>
>>>>>>>>>> I don't see why we should mess up the architecture for a
>>>>>>>>>> hardware-specific, implementation-specific, unmaintained module.
>>>>>>>>>
>>>>>>>>> It's not unmaintained, I was planning to revive it to make sure
>>>>>>>>> that the default player on Raspberry Pi remains VLC when we
>>>>>>>>> release 4.0. It seems there's a different implementation so I'll
>>>>>>>>> adapt that one.
>>>>>>>>>
>>>>>>>>> One reason for that is to make sure our new push architecture is
>>>>>>>>> sound and can adapt to many use cases. Supporting SoC
>>>>>>>>> architectures should still be possible with the new architecture.
>>>>>>>>> Allocating all buffers once in the display was making this easy
>>>>>>>>> and efficient (in terms of copy, not memory usage). We should aim
>>>>>>>>> for the same level of efficiency.
>>>>>>>>>
>>>>>>>>> Also let me remind you the VLC motto: "VLC plays everything and
>>>>>>>>> runs everywhere".
>>>>>>>>>
>>>>>>>>>> Even when the GPU uses the same RAM as the CPU, it typically uses
>>>>>>>>>> a different pixel format, tile format and/or memory coherency
>>>>>>>>>> protocol, or it might simply not have a suitable IOMMU. As such,
>>>>>>>>>> VLC cannot render directly in it.
>>>>>>>>>>
>>>>>>>>>> And if it could, then by definition, it implies that the decoder
>>>>>>>>>> and filters can allocate and *reference* picture buffers as they
>>>>>>>>>> see fit, regardless of the hardware. Which means the software on
>>>>>>>>>> the CPU side is doing the allocation. If so, then there are no
>>>>>>>>>> good technical reasons why push cannot work - misdesigning the
>>>>>>>>>> display plugin is not a good reason.
>>>>>>>>>
>>>>>>>>> I haven't proposed any design change to the display plugin, other
>>>>>>>>> than already discussed. What I proposed is a way to allocate CPU
>>>>>>>>> pictures from the GPU. My current solution involves creating a
>>>>>>>>> video context optionally when the decoder doesn't provide one.
>>>>>>>>>
>>>>>>>>> It could even be used on desktop. For example on Intel platforms
>>>>>>>>> it's possible to do it without much performance penalty. I used to
>>>>>>>>> do it in D3D11 until I realized it sucked for separate GPU memory.
>>>>>>>>> But I had no way to know exactly the impact of the switch because
>>>>>>>>> the code was quite different. Now it might be possible to tell. I
>>>>>>>>> have a feeling on Intel it may actually be better to decode in
>>>>>>>>> "GPU" buffers directly. The driver can take shortcuts that we
>>>>>>>>> can't. It may do the copy more efficiently if it needs one (or
>>>>>>>>> maybe it doesn't need one). It can do the copy asynchronously (as
>>>>>>>>> every command sent to an ID3D11DeviceContext) as long as it's
>>>>>>>>> ready when it needs to be displayed.
>>>>>>>>> _______________________________________________
>>>>>>>>> vlc-devel mailing list
>>>>>>>>> To unsubscribe or modify your subscription options:
>>>>>>>>> https://mailman.videolan.org/listinfo/vlc-devel
>>>>>>>>
>>>>>>
>>>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.