[vlc-devel] [PATCH] copy: fix the cache copy size

Quentin Chateau quentin.chateau at gmail.com
Thu Mar 7 22:20:25 CET 2019


Hi,

this regression has been picked up by the Ubuntu 18.04 team and is
therefore affecting quite a large number of people. On a machine with an
i7-8700K, you cannot even decently play a Full HD video using vaapi (over
half of the frames are dropped). The proposed fix restores the expected
performance (low CPU usage and no frame drops for 4K60 HDR videos using
vaapi).
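
To make the arithmetic behind the one-line changes in the quoted patch
easier to follow, here is a minimal sketch of how the two clamps differ.
It is only an illustration: the helper names are hypothetical (they do
not exist in copy.c), it assumes the source and destination pitches are
equal, and the 16384-byte cache size used in the note after it is an
example value, not necessarily what VLC allocates.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as VLC's __MIN macro, written as a function for this sketch. */
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Row pitch rounded up to a multiple of 16 bytes, like w16 in the patch. */
static size_t round_up16(size_t pitch) { return (pitch + 15) & ~(size_t)15; }

/* Like hstep in the patch: how many rows fit in the cache.
 * This is a row count, not a byte count. */
static size_t rows_per_block(size_t pitch, size_t cache_size)
{
    size_t hstep = cache_size / round_up16(pitch);
    assert(hstep > 0);
    return hstep;
}

/* Before the patch: the width in bytes is clamped by a row count. */
static size_t cache_width_before(size_t src_pitch, size_t cache_size)
{
    return min_size(src_pitch, rows_per_block(src_pitch, cache_size));
}

/* After the patch: the width in bytes is clamped by the cache size. */
static size_t cache_width_after(size_t src_pitch, size_t cache_size)
{
    return min_size(src_pitch, cache_size);
}
```

With a 1920-byte pitch and the example 16384-byte cache, hstep is 8 rows,
so cache_width_before() yields 8 while cache_width_after() yields the
full 1920 bytes of the row.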

If there is anything wrong with my patch, please tell me and I'll fix it.

Quentin C.

On Fri, Mar 1, 2019 at 23:36, Quentin Chateau <quentin.chateau at gmail.com>
wrote:

> regression 09d421a20851e1c49aa98e117957dd118620fae4
> ---
>  modules/video_chroma/copy.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/modules/video_chroma/copy.c b/modules/video_chroma/copy.c
> index e9250b948e..51498f4a06 100644
> --- a/modules/video_chroma/copy.c
> +++ b/modules/video_chroma/copy.c
> @@ -468,7 +468,7 @@ static void SSE_CopyPlane(uint8_t *dst, size_t dst_pitch,
>      const size_t copy_pitch = __MIN(src_pitch, dst_pitch);
>      const unsigned w16 = (copy_pitch+15) & ~15;
>      const unsigned hstep = cache_size / w16;
> -    const unsigned cache_width = __MIN(src_pitch, hstep);
> +    const unsigned cache_width = __MIN(src_pitch, cache_size);
>      assert(hstep > 0);
>
>      /* If SSE4.1: CopyFromUswc is faster than memcpy */
> @@ -501,8 +501,8 @@ SSE_InterleavePlanes(uint8_t *dst, size_t dst_pitch,
>      size_t copy_pitch = __MIN(dst_pitch / 2, srcu_pitch);
>      unsigned int const  w16 = (srcu_pitch+15) & ~15;
>      unsigned int const  hstep = (cache_size) / (2*w16);
> -    const unsigned cacheu_width = __MIN(srcu_pitch, hstep);
> -    const unsigned cachev_width = __MIN(srcv_pitch, hstep);
> +    const unsigned cacheu_width = __MIN(srcu_pitch, cache_size);
> +    const unsigned cachev_width = __MIN(srcv_pitch, cache_size);
>      assert(hstep > 0);
>
>      for (unsigned int y = 0; y < height; y += hstep)
> @@ -535,7 +535,7 @@ static void SSE_SplitPlanes(uint8_t *dstu, size_t dstu_pitch,
>      size_t copy_pitch = __MIN(__MIN(src_pitch / 2, dstu_pitch), dstv_pitch);
>      const unsigned w16 = (src_pitch+15) & ~15;
>      const unsigned hstep = cache_size / w16;
> -    const unsigned cache_width = __MIN(src_pitch, hstep);
> +    const unsigned cache_width = __MIN(src_pitch, cache_size);
>      assert(hstep > 0);
>
>      for (unsigned y = 0; y < height; y += hstep) {
> --
> 2.19.1
>
>