[vlc-devel] [PATCH 2/2][RFC] VAAPI XCB video output

Petri Hintukainen phintuka at gmail.com
Thu Aug 25 11:57:14 CEST 2016


On ti, 2016-08-23 at 20:16 +0200, Jean-Baptiste Kempf wrote:
> On 23 Aug, Petri Hintukainen wrote :
> > 
> > ---
> >  modules/MODULES_LIST        |   2 +
> >  modules/Makefile.am         |   1 +
> >  modules/hw/va/Makefile.am   |  19 ++
> >  modules/hw/va/chroma.c      | 187 ++++++++++++++++
> >  modules/hw/va/vlc_va.c      | 528
> > ++++++++++++++++++++++++++++++++++++++++++++
> >  modules/hw/va/vlc_va.h      | 152 +++++++++++++
> >  modules/hw/va/xcb_display.c | 485
> > ++++++++++++++++++++++++++++++++++++++++
> >  7 files changed, 1374 insertions(+)
> Missing NEWS and po/POTFILES.in

OK, added for next version

> > 
> > +include hw/va/Makefile.am
> >  include hw/vdpau/Makefile.am
> Do we want more Makefile.am here? Rémi?
> 
> (btw, why is not hw/mmal/ listed?)

mmal is in SUBDIRS

> 
> > 
> > +static picture_t *UploadSurface(filter_t *filter, picture_t *src)
> > +{
> > +    filter_sys_t  *sys = filter->p_sys;
> > +    VAStatus       status;
> > +    picture_t     *dst = NULL;
> > +    picture_sys_t *picsys;
> > +
> > +    dst = filter_NewPicture(filter);
> > +    if (dst == NULL) {
> > +        msg_Err(filter, "filter_NewPicture failed\n");
> \n is unnecessary.

Yes. Removed from all log messages.

> OK.
> 
> > 
> > +static int OutputOpen(vlc_object_t *obj)
> > +{
> > +    if (filter->fmt_in.video.orientation != filter-
> > >fmt_out.video.orientation) {
> > +        return VLC_EGENERIC;
> Any simple way of handling rotations?

Rotation is handled in output module. It could be handled in filter
too, but it is simpler this way (at least until some "real" filtering
is added).

Maybe other transforms should be handled here (there's only rotation
support in VA-API). But how common are those in "real life" ?

> > 
> > +    if (status != VA_STATUS_SUCCESS) {
> > +        msg_Err(filter, "vlc_va_TestPutImage() failed: %d\n",
> > status);
> > +        goto error;
> as above.
> 
> > 
> > +/*****************************************************************
> > ************
> > + * vlc_va.c: VAAPI helper for VLC
> > +
> > *******************************************************************
> > **********
> 
> > 
> > +    msg_Info(o, "VA-API: v%d.%d (%s)\n", major, minor, vendor);
> I'd argue this requires a better message.

Um... Like "Trying to use ... for decoding and/or rendering" ?

> > 
> > +picture_pool_t *vlc_va_PoolAlloc(vlc_object_t *o, VADisplay
> > va_dpy, unsigned requested_count,
> > +                                 const video_format_t *restrict
> > fmt, unsigned int va_rt_format)
> > +{
> > +    picture_t   *pics[requested_count];
> > +    VASurfaceID  va_surface_ids[requested_count];
> > +    VAStatus     status;
> > +    unsigned     count;
> > +
> > +    status = vaCreateSurfaces(va_dpy, va_rt_format,
> > +                              fmt->i_visible_width, fmt-
> > >i_visible_height,
> > +                              va_surface_ids, requested_count,
> > NULL, 0);
> > +    if (status != VA_STATUS_SUCCESS) {
> > +        msg_Err(o, "vaCreateSurfaces(%d) failed: %d\n",
> > va_rt_format, status);
> idem.
> 
> > 
> > +#if 0
> > +    /* too late to test here ? */
> > +    status = vlc_va_TestPutImage(va_dpy, &va_format,
> > +                                 va_surface_ids[0], NULL,
> > +                                 fmt->i_visible_width, fmt-
> > >i_visible_height);
> > +#endif
> Why would it be too late?

Picture pool is not created in Open(), so vout is already successfully
created when we enter here.

Currently the test is in upload filter, but that won't be used with HW
decoding.

> > 
> > +static int CopyPicture(vlc_object_t *o,
> > +                       VAImage *va_image, uint8_t *base,
> > +                       int dst_x, int dst_y, const picture_t *src)
> > +{
> > +    plane_t dst_planes[3];
> 3?
> 
> If so, don't we need to assert on src->i_planes to be <= 3?

Right.
Or check the format first.

> > 
> > +
> > +    for (int i = 0; i < src->i_planes; i++) {
> > +        dst_planes[i].p_pixels        = base + va_image-
> > >offsets[i];
> > +        dst_planes[i].i_pitch         = va_image->pitches[i];
> > +        dst_planes[i].i_visible_pitch = va_image->pitches[i];
> > +        dst_planes[i].i_lines         = src->p[i].i_visible_lines;
> > +        dst_planes[i].i_visible_lines = src->p[i].i_visible_lines;
> > +        dst_planes[i].i_pixel_pitch   = src->p[i].i_pixel_pitch;
> > +    }
> > +
> > +    if (src->format.i_chroma == VLC_CODEC_I420 ||
> > +        src->format.i_chroma == VLC_CODEC_I422 ||
> > +        src->format.i_chroma == VLC_CODEC_I444) {
> > +
> > +        plane_t tmp = dst_planes[1];
> > +        dst_planes[1] = dst_planes[2];
> > +        dst_planes[2] = tmp;
> > +    }
> > +
> > +    switch (va_image->format.fourcc) {
> > +        case VA_FOURCC_ARGB:
> > +        case VA_FOURCC_RGBA:
> > +            dst_planes[0].p_pixels += dst_x * 4 + dst_y *
> > dst_planes[0].i_pitch;
> > +            break;
> > +
> > +        case VA_FOURCC_IYUV:
> > +        case VA_FOURCC_YV12:
> > +            dst_planes[0].p_pixels += dst_x     + dst_y     *
> > dst_planes[0].i_pitch;
> > +            dst_planes[1].p_pixels += dst_x / 2 + dst_y / 2 *
> > dst_planes[1].i_pitch;
> > +            dst_planes[2].p_pixels += dst_x / 2 + dst_y / 2 *
> > dst_planes[2].i_pitch;
> > +            break;
> > +
> > +        default:
> > +            msg_Err(o, "Unsupported va fourcc (%4.4s)", (const
> > char *)&va_image->format.fourcc);
> > +            return VA_STATUS_ERROR_UNIMPLEMENTED;
> > +    }
> > +
> > +    for (int i = 0; i < src->i_planes; i++) {
> > +        plane_CopyPixels(&dst_planes[i], &src->p[i]);
> > +    }
> memcpy is murder :)

Even double one here. vaDeriveImage() may return mapped GPU memory, so
"normal" memcpy is really slow. I can add optimized memcpy later.

But there's no way to avoid copying here. Maybe it would be possible
when filters can provide picture pool. But, still, decoding directly to
GPU memory would kill SW decoder performance.

> > 
> > +
> > +    return VA_STATUS_SUCCESS;
> > +}
> > +
> > +int vlc_va_PutSurface(vlc_object_t *o, VADisplay va_dpy,
> > +                      VASurfaceID va_surface_id,
> > +                      VAImageFormat *va_image_format, const
> > picture_t *src,
> > +                      int in_width, int in_height, int out_width,
> > int out_height)
> > +{
> > +    VAImage   surface_image;
> > +    VAStatus  status;
> > +    uint8_t  *base;
> > +    int       derived = 0;
> > +
> > +    /* create VAAPI image */
> > +
> > +    /* try DeriveImage if no scaling required */
> > +    if (in_width == out_width && in_height == out_height) {
> > +
> > +        status = vaDeriveImage(va_dpy, va_surface_id,
> > &surface_image);
> > +        derived = (status == VA_STATUS_SUCCESS);
> > +    }
> > +    if (!derived) {
> > +        /* fall back to PutImage */
> > +        status = vaCreateImage(va_dpy, va_image_format, in_width,
> > in_height, &surface_image);
> > +        if (status != VA_STATUS_SUCCESS) {
> > +            msg_Err(o, "vaCreateImage(0x%x) failed\n",
> > va_image_format->fourcc);
> > +            return status;
> > +        }
> > +    }
> > +
> > +    /* copy bits */
> > +
> > +    status = vaMapBuffer(va_dpy, surface_image.buf, (void
> > **)&base);
> > +    if (status != VA_STATUS_SUCCESS) {
> > +        msg_Err(o, "vaMapBuffer() failed\n");
> idem
> 
> > 
> > +    status = CopyPicture(o, &surface_image, base, 0, 0, src);
> > +    vaUnmapBuffer(va_dpy, surface_image.buf);
> > +    if (status != VA_STATUS_SUCCESS) {
> > +        goto out;
> > +    }
> > +
> > +    if (!derived) {
> > +        status = vaPutImage(va_dpy, va_surface_id,
> > surface_image.image_id,
> > +                            0, 0, in_width, in_height,
> > +                            0, 0, out_width, out_height);
> > +        if (status != VA_STATUS_SUCCESS) {
> > +            msg_Err(o, "vaPutImage(0x%x) failed\n",
> > va_image_format->fourcc);
> > +            //goto out;
> > +        }
> > +    }
> > +
> > + out:
> > +    vaDestroyImage(va_dpy, surface_image.image_id);
> > +    return status;
> > +}
> > +
> > 
> > +/*****************************************************************
> > ************
> > + * vlc_va.h: VAAPI helper for VLC
> > +
> > *******************************************************************
> > **********
> > +typedef struct {
> > +    unsigned int x;
> > +    unsigned int y;
> > +    unsigned int w;
> > +    unsigned int h;
> > +} vlc_va_rect;
> subpicture_region_t ?

subpicture_region_t is over-complex, and I didn't find any (existing)
simpler alternatives ...

This is the area covering all regions of subpicture (size of the image
that will be rendered). Looks like VA-API supports only single
subpicture, so the regions must be merged before blending.

> > 
> > +
> > +typedef struct {
> > +    vlc_va_rect    place; /* may be different than VAImage
> > dimensions */
> > +
> > +    VASubpictureID va_subpicture_id;
> > +    VAImage        va_image;
> > +} vlc_va_subpicture;
> 
> > 
> > +/*****************************************************************
> > ************
> > + * xcb_display.c: VAAPI XCB display
> > +
> > *******************************************************************
> > **********
> > + * Copyright (C) 2016 VLC authors and VideoLAN
> > + *
> > + * Authors: Petri Hintukainen <phintuka at gmail.com>
> > + *          Rémi Denis-Courmont
> > + *
> > + * This program is free software; you can redistribute it and/or
> > modify it
> > + * under the terms of the GNU Lesser General Public License as
> > published by
> > + * the Free Software Foundation; either version 2.1 of the
> > License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General
> > Public License
> > + * along with this program; if not, write to the Free Software
> > Foundation,
> > + * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301,
> > USA.
> > +
> > *******************************************************************
> > **********/
> > +
> > +#ifdef HAVE_CONFIG_H
> > +# include <config.h>
> > +#endif
> > +
> > +#include <stdlib.h>
> > +#include <assert.h>
> > +
> > +#include <xcb/xcb.h>
> > +
> > +#include <vlc_common.h>
> > +#include <vlc_plugin.h>
> > +#include <vlc_vout_display.h>
> > +#include <vlc_picture_pool.h>
> > +#include <vlc_xlib.h>
> > +
> > +#include "events.h"
> > +
> > +#include <va/va.h>
> > +#include <va/va_x11.h>
> > +
> > +#include "vlc_va.h"
> > +
> > +static int Open(vlc_object_t *);
> > +static void Close(vlc_object_t *);
> > +
> > +vlc_module_begin()
> > +    set_shortname(N_("VAAPI XCB"))
> > +    set_description(N_("VA-API video output (XCB)"))
> > +    set_category(CAT_VIDEO)
> > +    set_subcategory(SUBCAT_VIDEO_VOUT)
> > +    set_capability("vout display", 3000)
> Can't we merge this with the classical XCB output?

Should be possible with some ifdefs and runtime checks. Most of va
specific code is in the helper functions. It should be quite "easy" to
drop to other vout plugins too (ex. OpenGL).

But that will add runtime dependency for libva.

> Also, shouldn't this be a different patch?

If you prefer so.

But note that those modules are useless alone (can't be even tested).

I'll send new version after I've tested how well HW decoding integrates
to all this.

> 
> With my kindest regards,
> 


More information about the vlc-devel mailing list