[x265] Fwd: [PATCH] asm: filter_vsp[4x4], filter_vss[4x4] in avx2: 407c->198c, 361c->180c

chen chenm003 at 163.com
Fri Mar 13 06:01:21 CET 2015

At 2015-03-13 12:04:25,"Divya Manivannan" <divya at multicorewareinc.com> wrote:



---------- Forwarded message ----------
From: chen<chenm003 at 163.com>
Date: Fri, Mar 13, 2015 at 4:06 AM
Subject: Re: [x265] [PATCH] asm: filter_vsp[4x4], filter_vss[4x4] in avx2: 407c->198c, 361c->180c
To: Development for x265 <x265-devel at videolan.org>

At 2015-03-12 14:03:21,"Divya Manivannan" <divya at multicorewareinc.com> wrote:
># HG changeset patch
># User Divya Manivannan <divya at multicorewareinc.com>
># Date 1426140136 -19800
>#      Thu Mar 12 11:32:16 2015 +0530
># Node ID ed3549b49cc488315da7d4709d6932e7244e5b33
># Parent  b931c50d55011a1ddc08f0a230b9632fcb4674d7
>asm: filter_vsp[4x4], filter_vss[4x4] in avx2: 407c->198c, 361c->180c
>
>diff -r b931c50d5501 -r ed3549b49cc4 source/common/x86/asm-primitives.cpp
>--- a/source/common/x86/asm-primitives.cpp	Wed Mar 11 21:58:02 2015 -0500
>+++ b/source/common/x86/asm-primitives.cpp	Thu Mar 12 11:32:16 2015 +0530
>@@ -1621,6 +1621,10 @@
>         p.chroma[X265_CSP_I420].pu[CHROMA_420_32x24].filter_vps = x265_interp_4tap_vert_ps_32x24_avx2;
>         p.chroma[X265_CSP_I420].pu[CHROMA_420_32x16].filter_vps = x265_interp_4tap_vert_ps_32x16_avx2;
>         p.chroma[X265_CSP_I420].pu[CHROMA_420_32x8].filter_vps = x265_interp_4tap_vert_ps_32x8_avx2;
>+
>+        p.chroma[X265_CSP_I420].pu[CHROMA_420_4x4].filter_vsp = x265_interp_4tap_vert_sp_4x4_avx2;
>+
>+        p.chroma[X265_CSP_I420].pu[CHROMA_420_4x4].filter_vss = x265_interp_4tap_vert_ss_4x4_avx2;
>     }
> #endif
> }
>diff -r b931c50d5501 -r ed3549b49cc4 source/common/x86/ipfilter8.asm
>--- a/source/common/x86/ipfilter8.asm	Wed Mar 11 21:58:02 2015 -0500
>+++ b/source/common/x86/ipfilter8.asm	Thu Mar 12 11:32:16 2015 +0530
>@@ -120,6 +120,31 @@
>                   times 4 dw -2, 10
>                   times 4 dw 58, -2
> 
>+ALIGN 32
>+pw_ChromaCoeffV:  times 8 dw 0, 64

This is the same as tab_ChromaCoeffV; it would be better to rename that table and increase its "times" count to 8.
[Divya] Changing the tab_ChromaCoeffV constant would affect the SSE code, so I only defined a new constant.
OK, we can use your patch as is and modify this in the future.
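
A rough sketch of why widening the shared table could disturb the SSE code: a row written as "times N dw a, b" occupies N * 2 * 2 bytes, so going from times 4 to times 8 doubles the row stride from 16 to 32 bytes. The helper names below (rowBytes, rowOffset) are hypothetical and only model the arithmetic. That the existing SSE routines address tab_ChromaCoeffV with a fixed 16-byte row stride is an assumption, not something stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes in one table row written as `times N dw a, b`:
// N repetitions of two 16-bit words. (Hypothetical helper.)
constexpr std::size_t rowBytes(std::size_t times)
{
    return times * 2 * sizeof(std::int16_t);
}

// Byte offset of row `row` when every row uses the same repeat count.
// (Hypothetical helper.)
constexpr std::size_t rowOffset(std::size_t times, std::size_t row)
{
    return row * rowBytes(times);
}

// times 4 fills one 128-bit XMM register; times 8 fills one 256-bit YMM register.
static_assert(rowBytes(4) == 16, "times 4 row is 16 bytes");
static_assert(rowBytes(8) == 32, "times 8 row is 32 bytes");

// Assumption: code that hard-codes the 16-byte stride would compute a
// different address for the same row once the table is widened.
static_assert(rowOffset(4, 2) != rowOffset(8, 2), "row offsets differ after widening");
```

Under that assumption, every SSE routine that indexes the table would need its offsets adjusted if the table were widened in place, which is consistent with defining a separate 32-byte-aligned constant for the AVX2 path for now.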
 