[vlc-devel] [RFC] video filters, hw acceleration and 10bits

Jean-Baptiste Kempf jb at videolan.org
Fri May 19 12:37:20 CEST 2017

Hello everyone,

1/ Hardware decoders

For the VLC 3.0 release, we want to enable hardware decoders with 0 CPU
copy by default on all platforms: Linux (vaapi/vdpau), Windows,
macOS/iOS (videotoolbox) and Android (mediacodec).

Indeed, we should publish 3.0 with this feature since there is more and
more content (4K, HEVC, 10bits, >= 60fps) that needs a hardware decoder to
be played correctly, notably because newer laptops and ultrabooks have
weak CPUs but good GPU decoders.

In order to activate those hardware decoders, we need to have at least:
 - hw decoding + display with scaling + subtitles blending in GPU,
 - hw deinterlacing,
 - snapshot,
 - hw adjust filter, aka gamma correction.

All of these features are done or in progress for all the platforms we
care about.

2/ The issue: CPU filters

The issue we have is CPU filters: those do not work with hw decoding,
and we can't port all of them to GPU in a timely fashion (some might
even be impossible to port).

At the same time, we're seeing more and more content with chromas
other than 8-bit YUV, notably with HEVC, and very few filters work
in those cases either.

Users expect VLC filters to work and don't understand why some filters
work and why some others (with 10bits/RGB or when using hw) don't.

3/ Solutions

There are a few solutions, but not all of them give the same user
experience.

a) It's very difficult to restart the decoder and the vout when
a CPU filter is selected: there are glitches because a different vout
will be used, we lose input and output frames, we need to wait for an I
frame, and we're not sure we can get back to the same state/position.
We already have this difficulty with VT and MediaCodec when we restart.
This also does not solve the non-8bits-chroma issue.

b) Another solution, the most seamless we can do, is to copy the
video buffer back to the CPU, filter it, and copy it back to the GPU
when a CPU filter is requested. This is quite friendly for the user,
but is of course slow.
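
To make option b) concrete, here is a minimal sketch of the control flow
in Python. It is only an illustration: Frame, download, upload,
render_path and invert are hypothetical names, not VLC APIs, and the two
copies are modelled as simple tags on the buffer rather than real
GPU/CPU transfers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    pixels: bytes   # raw picture data
    location: str   # "gpu" or "cpu": where the buffer currently lives


# Hypothetical type: a CPU filter maps raw pixels to raw pixels.
CpuFilter = Callable[[bytes], bytes]


def download(frame: Frame) -> Frame:
    """Stand-in for the GPU -> CPU copy."""
    return Frame(frame.pixels, "cpu")


def upload(frame: Frame) -> Frame:
    """Stand-in for the CPU -> GPU copy."""
    return Frame(frame.pixels, "gpu")


def render_path(frame: Frame, cpu_filters: List[CpuFilter]) -> Frame:
    # No CPU filter requested: stay on the zero-copy path.
    if not cpu_filters:
        return frame
    # Otherwise pay for two copies: back to the CPU, filter, back to the GPU.
    cpu_frame = download(frame)
    for apply_filter in cpu_filters:
        cpu_frame = Frame(apply_filter(cpu_frame.pixels), "cpu")
    return upload(cpu_frame)


def invert(pixels: bytes) -> bytes:
    """Toy CPU filter used only for the example."""
    return bytes(255 - p for p in pixels)
```

The point of the sketch is that the decoder and the vout are never
restarted: the frame always ends up on the GPU, and the extra copies are
only paid when at least one CPU filter is active.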

This approach is much slower than full hardware rendering but,
if you have SSE4.1 (MOVNTDQA), it uses less CPU than full CPU
decoding, which is a gain compared to what we have now.
Almost every x86 CPU from the last 10 years has SSE4.1.

We've benchmarked that on Linux (vaapi), Windows (D3D) and macOS (VT),
and we get faster results than full CPU decoding, notably in full HD.
However, it gets worse with 4K videos: a single thread (the vout
one) can't cope with two GPU/CPU copies.
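
A rough back-of-the-envelope calculation shows why 4K is so much harder
than full HD. The sketch below is not a benchmark: it assumes I420
frames, 60 fps, and exactly two copies per frame (GPU to CPU, then CPU
to GPU), and the function names are made up for the example.

```python
def i420_frame_bytes(width: int, height: int, bytes_per_sample: int = 1) -> int:
    # I420: one full-resolution luma plane plus two chroma planes
    # subsampled by 2 in each direction, i.e. 1.5 samples per pixel.
    return width * height * 3 // 2 * bytes_per_sample


def copy_bytes_per_second(width: int, height: int, fps: int,
                          bytes_per_sample: int = 1, copies: int = 2) -> int:
    # Bytes the copying thread has to move every second.
    return i420_frame_bytes(width, height, bytes_per_sample) * fps * copies


full_hd = copy_bytes_per_second(1920, 1080, 60)   # 373,248,000 bytes/s
uhd_4k = copy_bytes_per_second(3840, 2160, 60)    # 1,492,992,000 bytes/s
print(full_hd, uhd_4k, uhd_4k // full_hd)
```

Under these assumptions, 4K needs four times the copy bandwidth of full
HD, about 1.5 GB/s for 8-bit content and twice that if the samples are
stored on 16 bits, all on a single thread.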

c) We could also wait for all the filters to be in GPU/shaders. That's
unrealistic for this release, but it should be a major goal of the
next one.

d) Or just do nothing.

4/ Necessary user interface improvements

However, whatever we do, we need to mention this issue to the users.

Notably, the filters dialog must be reworked to mention when we're using
hw decoding, and we must not save the VLC filters by default. (#6873)

Maybe we can open the correct settings when the user clicks on the
to be able to easily reach the "hardware decoder" configuration.

5/ Opinions?

In my opinion, it's better to insert CPU filters, even if they could be
slow, than to do nothing. With enough warnings, though.
This is by far the simplest for our users and our support.

Moreover, this would solve the different chromas issue (#14037, #13066,
for example). We should probably use I420 as the middle-man, though.
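
As an illustration of the I420 middle-man idea, here is a minimal Python
sketch that reduces one 10-bit plane to 8 bits so that an existing 8-bit
filter could process it. The input layout (16-bit little-endian samples
with the value in the low 10 bits) and the function name are assumptions
made for the example; a real converter might round or dither instead of
simply dropping the two low bits.

```python
import struct


def plane_10bit_to_8bit(plane: bytes) -> bytes:
    # Assumed layout: one 16-bit little-endian word per sample,
    # with the 10-bit value stored in the low bits.
    count = len(plane) // 2
    samples = struct.unpack(f"<{count}H", plane[:count * 2])
    # Keep the 8 most significant of the 10 bits (plain truncation).
    return bytes((s & 0x3FF) >> 2 for s in samples)


ten_bit = struct.pack("<3H", 0, 512, 1023)
print(list(plane_10bit_to_8bit(ten_bit)))  # [0, 128, 255]
```

The same function would be applied to the Y, U and V planes in turn. The
conversion loses precision, which is part of why this is a fallback
rather than a replacement for proper 10-bit or GPU filters.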

Of course this solution is not perfect, and we need, for the future VLC
releases, to focus on getting way more GPU filters.
But this will not be ready for 3.0.

What are your opinions on the matter?

Jean-Baptiste Kempf -  President
+33 672 704 734
