[x265] [PATCH 1 of 7] asm:intra_pred_ang4_3_sse2 improved ~4.5% 684.95 -> 654.99 with nits and tweaks

dave dtyx265 at gmail.com
Thu Apr 2 15:41:23 CEST 2015


On 04/01/2015 09:24 PM, chen wrote:
>
> At 2015-04-02 02:52:16,dtyx265 at gmail.com wrote:
> ># HG changeset patch
> ># User David T Yuen <dtyx265 at gmail.com>
> ># Date 1427891624 25200
> ># Node ID 529c6056ccfbce57cd845abec59a2af02812cd57
> ># Parent  89bc6238d4a2e3f117f0127e406c6dfbf093868b
> >asm:intra_pred_ang4_3_sse2 improved ~4.5% 684.95 -> 654.99 with nits and tweaks
> >
> >Changed r3 and r4 to r3d and r4d
> >Removed unnecessary pxor's
> >changed pshufd to psrldq
> >
> >diff -r 89bc6238d4a2 -r 529c6056ccfb source/common/x86/intrapred8.asm
> >--- a/source/common/x86/intrapred8.asm	Tue Mar 31 10:44:43 2015 -0700
> >+++ b/source/common/x86/intrapred8.asm	Wed Apr 01 05:33:44 2015 -0700
> >@@ -1339,10 +1339,10 @@
> >
> > INIT_XMM sse2
> > cglobal intra_pred_ang4_3, 3,5,8
> >-    mov         r4, 1
> >+    mov         r4d, 1
> >     cmp         r3m, byte 33
> >-    mov         r3, 9
> >-    cmove       r3, r4
> >+    mov         r3d, 9
> >+    cmove       r3d, r4d
> >
> >     movh        m0, [r2 + r3]   ; [8 7 6 5 4 3 2 1]
> >     mova        m1, m0
> >@@ -1368,7 +1368,6 @@
> > ALIGN 16
> > .do_filter4x4:
> >     pxor        m1, m1
> >-    pxor        m3, m3
> high part in m3 isn't zero
It doesn't need to be zero. Whatever is in m3 at that point is
overwritten by the next two instructions.
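
To illustrate the general point, here is a minimal sketch in C with SSE2 intrinsics. It is not the x265 code: the two instructions that follow `pxor m1, m1` are not shown in the quoted hunk, so the 64-bit load below is only a hypothetical stand-in for any instruction that writes the whole register before it is read. The function names are made up for the example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Variant with the pxor: zero the register, then overwrite it. */
static void with_pxor(const uint8_t *src, uint8_t out[16])
{
    __m128i m3 = _mm_setzero_si128();            /* pxor m3, m3 */
    /* Hypothetical stand-in: movq writes all 128 bits
     * (low 64 from memory, high 64 set to zero). */
    m3 = _mm_loadl_epi64((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, m3);
}

/* Variant without the pxor: the register holds stale data (simulated
 * here with 0xAA bytes) until the same instruction overwrites it. */
static void without_pxor(const uint8_t *src, uint8_t out[16])
{
    __m128i m3 = _mm_set1_epi8((char)0xAA);      /* stale contents */
    m3 = _mm_loadl_epi64((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, m3);
}

/* Returns 1 when both variants produce identical register contents. */
static int same_result(const uint8_t src[8])
{
    uint8_t a[16], b[16];
    with_pxor(src, a);
    without_pxor(src, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Under that assumption the initial value of the register never reaches the output, so zeroing it first only costs an instruction.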
