[vlc-devel] [PATCH 1/3] window: add has_shared_egl info flag

Rémi Denis-Courmont remi at remlab.net
Fri Oct 4 21:54:05 CEST 2019


Le torstaina 3. lokakuuta 2019, 22.22.49 EEST Alexandre Janniaux a écrit :
> On Thu, Oct 03, 2019 at 08:39:27PM +0300, Rémi Denis-Courmont wrote:
> > Le torstaina 3. lokakuuta 2019, 15.35.32 EEST Alexandre Janniaux a écrit :
> > > > It's not entirely clear how this works if a legacy GL WSI is used. On
> > > > X11
> > > > it might work regardless of GLX and EGL because different Display* are
> > > > allocated, but is that guaranteed? What about other windowing systems?
> > > 
> > > Indeed, the original issue stems from the fact that you must share the
> > > same `wl_display` instance between the interface and the video. Maybe
> > > foreign surface support would solve the issue in a later release, when
> > > available.
> > 
> > I'm not convinced that we can assume that EGL matches displays by pointer
> > values. That looks like an unguaranteed implementation detail.
> 
> If you're not convinced, I invite you to read the documentation of
> eglGetPlatformDisplay.
> 
> >  eglGetPlatformDisplay obtains an EGL display connection for the specified
> > platform and native_display.
> 
> If you fail on matching native_display with a pointer type, I invite you to
> read the related platform documentations. The point giving the same return
> 
> object is:
> >  Multiple calls made to eglGetPlatformDisplay with the same parameters
> >  will return the same EGLDisplay handle.

Sure, that much is obvious. My question is about the reciprocal problem, and I 
fail to see a answer.

> It doesn't clearly tell "multiple call made to eglGetPlaformDisplay with
> different parameters gives different values" but as explained after, it
> would be very unconventional and borderline to have the same EGLDisplay for
> different display connection.

On Wayland, it would not make sense. It would in fact break the security 
promises of the protocol. But in general, or in particular for X11 ... ?

But frankly, that's not the main problem. We can assume for the time being 
that EGLDisplay are matched by display pointers and EGL attributes if you 
will. Even then, I still don't see how this patch would work.

> You can also check eglGetDisplay
> documentation which cryptically says:

> > As for eglGetPlatformDisplay, EGL considers the returned EGLDisplay as
> > belonging to the same platform as display_id. However, the set of
> > platforms to which display_id is permitted to belong, as well as the
> > actual type of display_id, are implementation-specific.
> 
> The display connection represents the resource allocator. That is what a
> display connection is for a client. The fact that X11 is transparent and
> wayland doesn't define behaviour consistency between clients, and that an
> EGLDisplay binds on a native window display connection litterally means
> that if your arguments are different, you cannot easily ensure whether they
> points to the same resource or not without relying painful hacks, and
> nobody in the open source world is writing painful hacks just for the sake
> of breaking an API for a production software.

With all due respect, that is actually naive. In fact, GLX and EGL backends on 
X11 are full of "painful hacks", and routinely bypass the display protocol and 
display display server to "talk" to the GPU device driver directly for 
semiobvious performance reasons.

> Some custom platforms might do this kind of atrocity but honestly, I don't
> care about supporting edge case. If you want to, please do so but it's out
> of the scope of this patch. This is the current expected behaviour for X11
> and Wayland.

Fair enough. We don't need to care about that for now.

But by that logic, I'll deem drivers not supporting EGL reference counting 
atrocities and not support them - because I really don't see how that would 
work.

> > > EGL unknown is basically the first case in the current implementation.
> > 
> > No. The current implementation assumes that the EGL (reference) is *known*
> > not to be shared, i.e. it is not shared and/or it is reference-counted.
> What do you expect for the unknown state ? leaking memory in the situation
> that the client doesn't use EGL ?

The only option for unknown state is to request reference counting and fail 
safe if it's not supported. Indeed, you cannot leak memory by terminating EGL, 
nor terminate EGL behind some other component's back.

In fact, I don't think that there is a known shared case. In some cases, you 
may know that EGL is used on the display, but you cannot generally know if the 
attributes match those that VLC uses, so you don't know if VLC's EGL display 
is shared with the hypothetical preexisting EGL display.

There's only known not shared (what is currently implemented) and unknown.

> > I don't follow how that determines whether Qt uses EGL or not, TBH.
> > 
> > If creating a reference-counted EGL fails, then we still need to know if
> > EGL is shared. In other words, we still need to know if it is used by the
> > window provider (e.g. Qt GUI), by another EGL plugin instance for a
> > different surface of the same display (e.g. other video track), or
> > whatever else.
> 
> You can get the context, query the display, and check whether it is
> reference counted.

What's the point? You already know the result of that query: it's either what 
was requested in the EGL_TRACK_REFERENCES_KHR attribute to 
eglGetPlatformDisplay(), or the platform default without the attribute.

> There are litteraly one function for each of those
> action, and a failure at any level means that Qt is not using EGL at all.

If the query fails, it means that the display does not support reference 
tracking negotiation and is using the platform default, i.e. EGL_BAD_ATTRIBUTE 
error. It says exactly nothing about what Qt is or is not doing.

But again, I don't see the point of querying at all to begin with - you 
already know the tracking regime by the point where you can query the 
attribute, as noted above.

> A negative result on the last action means it is using an EGL implementation
> which is not using the tracking attribute. Then, as explained quite clearly
> in the display_reference document from Khronos, it depends whether you're
> on Android or not.

Yes, and thus you can infer that the EGL display is reference-counted or not. 
But if it is not reference-counted you still have no clue if somebody else is 
using it or not, or will be using it by the time you are done using it.

In particular, it could be that the application providing the WL display uses 
EGL, but with incompatible attributes, so that EGL display is *not* shared. 
Worse yet, it could be that a later second video track or visualisation starts 
using EGL with the same attributes and ends up sharing the EGL display.

> > > For the LibVLC case, yes it's an issue and was the original reason
> > > of the usage of vlc_variable. But as you rightfully pointed out in
> > > #16106, this will be trivially fixed once the window provider callbacks
> > > are available.
> > 
> > I never pointed that out and, as a matter of facts, I don't know how it
> > helps.
> > 
> > I just pointed out that I am waiting for the player window allocation
> > callback before I start writing the Wayland embedded window provider and
> > its LibVLC frontend. And that's really only because I don't want to write
> > throw-away code in the mean time.
> 
> Well, that's exactly the same issue, you have to provide the same resource
> allocator, so you can provide this kind of information too.

The LibVLC allocator cannot provide this information. That would be utterly 
insane. On one hand, in many cases, the LibVLC calling code will not know if 
EGL is used, while in most other cases, it will not know if the EGL attributes 
match those that VLC uses. And on the other hand, if the LibVLC calling code 
knows that EGL is not used, it cannot ensure that VLC or some third party 
component won't get the same EGL display later.

The only way that works is if either the EGL plugin (first option), or the 
Wayland embedded window plugin (second option) checks for reference counting 
support. And it's far easier and cleaner to do it in the EGL plugin, as the 
Wayland embedded window plugin has no other particular reasons to tough the 
EGL API.

> > > And the former are exactly why I'm trying to provide
> > > this patch, so as to provide a first video embedding PoC using Wayland,
> > > which is already working.
> > > 
> > > As this issue would happen with usage that are not currently supported,
> > > meaning mainly wayland integration in LibVLC for the short term, I don't
> > > think this is a blocking issue.
> > 
> > Well... does this not affect multiple non-embedded video tracks too?
> > That's
> > already supposed to be supported because I'd always assumed that EGL was
> > reference-counted. But now it seems that that was wrong.
> 
> It's not, because of how eglGetPlatformDisplay works. You have one display
> connection per output (=per vout thread).

Sure. I assumed that GetDisplay created a display. That's clearly wrong, but 
it works fine if reference counting is used instead (thanks to either the 
Android default or the reference counting extension). And I cannot see any 
proposal how this would work without reference counting - except for the 
existing trivial case of the XDG shell plugin.

> I also have other cases in mind which will actually need this reference
> counting, and wrote the needed addition to add internal refcounting for EGL
> vlc_gl provider, but as there is currently no need for this, it's sleeping
> on a branch.

Reference-counting a VLC GL provider is utterly pointless. That's very much 
not supported on purpose, and it should not be.

You cannot make up for the fact that the underlying EGL does not support 
reference counting (if so), because you cannot force non-VLC code to use the 
VLC GL provider. Sharing a reference on a reference-counted EGL is just 
useless over-engineering, and sharing a reference on a non-reference-counted 
EGL simply does not work.

That was the whole point for NVIDIA to write the reference counting extension: 
it cannot be worked around in upper layers.

> > > > With that, there are three options, make it the responsibility of the
> > > > GL
> > > > provider, or the window provider, or go with a tristate. In the
> > > > earlier
> > > > case, it means refuse to use EGL on affected windowing systems unless
> > > > reference counting is available, i.e. LibVLC playing it safe.
> > > > 
> > > > In the second case, it means the window provider has to ensure
> > > > reference
> > > > counting is available, otherwise fail safe, i.e. make it someone
> > > > else's,
> > > > LibVLC app developers' problem (not safe IMO).
> > > > 
> > > > In the third case, we essentially whitelist window providers with
> > > > which
> > > > EGL can be used without reference counting. But we don't fix the ugly
> > > > entanglement between provider and consumer.
> > 
> > That third option does not make sense if there are multiple video tracks,
> > so I'll pretend I never iuntroduced it from now on.
> 
> As above, eglGetPlatformDisplay doesn't work like this and we currently
> don't support harmful use cases.

Sure. That's why the only options are the first and the second. Even if you 
stand neither of them.

-- 
Rémi Denis-Courmont
http://www.remlab.net/





More information about the vlc-devel mailing list