<div dir="ltr"><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">chen</b> <span dir="ltr"><<a href="mailto:chenm003@163.com">chenm003@163.com</a>></span><br>Date: Fri, Mar 13, 2015 at 4:06 AM<br>Subject: Re: [x265] [PATCH] asm: filter_vsp[4x4], filter_vss[4x4] in avx2: 407c->198c, 361c->180c<br>To: Development for x265 <<a href="mailto:x265-devel@videolan.org">x265-devel@videolan.org</a>><br><br><br><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><div> </div><pre><br>At 2015-03-12 14:03:21,"Divya Manivannan" <<a href="mailto:divya@multicorewareinc.com" target="_blank">divya@multicorewareinc.com</a>> wrote:
># HG changeset patch
># User Divya Manivannan <<a href="mailto:divya@multicorewareinc.com" target="_blank">divya@multicorewareinc.com</a>>
># Date 1426140136 -19800
># Thu Mar 12 11:32:16 2015 +0530
># Node ID ed3549b49cc488315da7d4709d6932e7244e5b33
># Parent b931c50d55011a1ddc08f0a230b9632fcb4674d7
>asm: filter_vsp[4x4], filter_vss[4x4] in avx2: 407c->198c, 361c->180c
>
>diff -r b931c50d5501 -r ed3549b49cc4 source/common/x86/asm-primitives.cpp
>--- a/source/common/x86/asm-primitives.cpp Wed Mar 11 21:58:02 2015 -0500
>+++ b/source/common/x86/asm-primitives.cpp Thu Mar 12 11:32:16 2015 +0530
>@@ -1621,6 +1621,10 @@
> p.chroma[X265_CSP_I420].pu[CHROMA_420_32x24].filter_vps = x265_interp_4tap_vert_ps_32x24_avx2;
> p.chroma[X265_CSP_I420].pu[CHROMA_420_32x16].filter_vps = x265_interp_4tap_vert_ps_32x16_avx2;
> p.chroma[X265_CSP_I420].pu[CHROMA_420_32x8].filter_vps = x265_interp_4tap_vert_ps_32x8_avx2;
>+
>+ p.chroma[X265_CSP_I420].pu[CHROMA_420_4x4].filter_vsp = x265_interp_4tap_vert_sp_4x4_avx2;
>+
>+ p.chroma[X265_CSP_I420].pu[CHROMA_420_4x4].filter_vss = x265_interp_4tap_vert_ss_4x4_avx2;
> }
> #endif
> }
>diff -r b931c50d5501 -r ed3549b49cc4 source/common/x86/ipfilter8.asm
>--- a/source/common/x86/ipfilter8.asm Wed Mar 11 21:58:02 2015 -0500
>+++ b/source/common/x86/ipfilter8.asm Thu Mar 12 11:32:16 2015 +0530
>@@ -120,6 +120,31 @@
> times 4 dw -2, 10
> times 4 dw 58, -2
>
>+ALIGN 32
>+pw_ChromaCoeffV: times 8 dw 0, 64
</pre><pre>This is the same as tab_ChromaCoeffV; renaming it and increasing times to 8 would be better.</pre><pre>[Divya] Changing the tab_ChromaCoeffV constant would affect the SSE code, so I only defined a new constant.</pre><pre>The rest is right.</pre></div><br>_______________________________________________<br>
x265-devel mailing list<br>
<a href="mailto:x265-devel@videolan.org">x265-devel@videolan.org</a><br>
<a href="https://mailman.videolan.org/listinfo/x265-devel" target="_blank">https://mailman.videolan.org/listinfo/x265-devel</a><br>
<br></div><br></div>