[vlc-devel] [PATCH] arm_neon: Add an optimized routine for NV12/21 to I420/YV12

Tue Oct 1 13:55:17 CEST 2013

Hi,

On Tuesday 01 October 2013 at 01:31:29, Martin Storsjö wrote:
> +static void copy_y_plane(filter_t *filter, picture_t *src, picture_t *dst)
> +{
> +    uint8_t *src_y = src->Y_PIXELS;
> +    uint8_t *dst_y = dst->Y_PIXELS;
> +    if (src->Y_PITCH == dst->Y_PITCH) {
> +        memcpy(dst_y, src_y, dst->Y_PITCH * filter->fmt_in.video.i_height);
> +    } else {
> +        for (unsigned y = 0; y < filter->fmt_in.video.i_height; y++) {
> +            memcpy(dst_y + dst->Y_PITCH * y, src_y + src->Y_PITCH * y,
> +                   filter->fmt_in.video.i_width);
While the compiler should be able to optimize that itself, it might be
worth avoiding the per-row multiplications by advancing the pointers
instead, with something like:
           for ( unsigned y = 0;
                 y < filter->fmt_in.video.i_height;
                 y++, dst_y += dst->Y_PITCH, src_y += src->Y_PITCH )
               memcpy(dst_y, src_y, filter->fmt_in.video.i_width);
               
> +        }
> +    }
> +}
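
For reference, the same idea can be sketched as a standalone function.
This is only an illustration under assumptions, not part of the patch:
copy_plane_rows and its parameters are hypothetical names, and plain
pointers and sizes stand in for the picture_t fields and the
Y_PIXELS / Y_PITCH macros used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy `height` rows of
 * `width` bytes between two planes whose pitches may differ. */
static void copy_plane_rows(uint8_t *dst, size_t dst_pitch,
                            const uint8_t *src, size_t src_pitch,
                            size_t width, size_t height)
{
    /* Assumption: a visible row never exceeds either pitch. */
    assert(width <= dst_pitch && width <= src_pitch);

    if (src_pitch == dst_pitch) {
        /* Identical layout: a single bulk copy, padding included. */
        memcpy(dst, src, dst_pitch * height);
        return;
    }

    /* Different pitches: advance both pointers by one pitch per row
     * instead of recomputing pitch * y on every iteration. */
    for (size_t y = 0; y < height; y++, dst += dst_pitch, src += src_pitch)
        memcpy(dst, src, width);
}
```

Since dst and src are passed by value, advancing them inside the loop
does not affect the caller's pointers, just as with the local dst_y and
src_y in the patch.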

Regards,

-- 
Denis Charmet - TypX
Snark is a way of life