[x265] [PATCH 1 of 7] asm:intra_pred_ang4_3_sse2 improved ~4.5% 684.95 -> 654.99 with nits and tweaks
dave
dtyx265 at gmail.com
Thu Apr 2 15:41:23 CEST 2015
On 04/01/2015 09:24 PM, chen wrote:
>
> At 2015-04-02 02:52:16,dtyx265 at gmail.com wrote:
> ># HG changeset patch
> ># User David T Yuen <dtyx265 at gmail.com>
> ># Date 1427891624 25200
> ># Node ID 529c6056ccfbce57cd845abec59a2af02812cd57
> ># Parent 89bc6238d4a2e3f117f0127e406c6dfbf093868b
> >asm:intra_pred_ang4_3_sse2 improved ~4.5% 684.95 -> 654.99 with nits and tweaks
> >
> >Changed r3 and r4 to r3d and r4d
> >Removed unnecessary pxor's
> >changed pshufd to psrldq
> >
> >diff -r 89bc6238d4a2 -r 529c6056ccfb source/common/x86/intrapred8.asm
> >--- a/source/common/x86/intrapred8.asm Tue Mar 31 10:44:43 2015 -0700
> >+++ b/source/common/x86/intrapred8.asm Wed Apr 01 05:33:44 2015 -0700
> >@@ -1339,10 +1339,10 @@
> >
> > INIT_XMM sse2
> > cglobal intra_pred_ang4_3, 3,5,8
> >- mov r4, 1
> >+ mov r4d, 1
> > cmp r3m, byte 33
> >- mov r3, 9
> >- cmove r3, r4
> >+ mov r3d, 9
> >+ cmove r3d, r4d
> >
> > movh m0, [r2 + r3] ; [8 7 6 5 4 3 2 1]
> > mova m1, m0
> >@@ -1368,7 +1368,6 @@
> > ALIGN 16
> > .do_filter4x4:
> > pxor m1, m1
> >- pxor m3, m3
> high part in m3 isn't zero
It doesn't need to be zero. Whatever m3 holds on entry is overwritten by
the next two instructions.
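
To illustrate with a sketch (not the actual code from intrapred8.asm): assume
the two instructions after the removed pxor are a punpckhbw with m3 as the
destination followed by a psrlw by 8. That pairing is my assumption for the
example; the hunk above does not show them. With that pairing, the stale bytes
of m3 land in the low byte of each word and the shift discards them, so the
result is the same as unpacking against a zeroed register. The function names
below are hypothetical and only serve the illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Reference: widen the high 8 bytes of src to words by unpacking
 * against a zeroed register (punpckhbw src, zero). */
static void widen_hi_with_zero(const uint8_t src[16], uint16_t out[8])
{
    __m128i s = _mm_loadu_si128((const __m128i *)src);
    __m128i r = _mm_unpackhi_epi8(s, _mm_setzero_si128());
    _mm_storeu_si128((__m128i *)out, r);
}

/* Assumed pattern: punpckhbw stale, src ; psrlw stale, 8.
 * After the unpack each word is (src_byte << 8) | stale_byte,
 * and the logical shift drops the stale byte. */
static void widen_hi_no_zero(const uint8_t src[16],
                             const uint8_t stale[16],
                             uint16_t out[8])
{
    __m128i s = _mm_loadu_si128((const __m128i *)src);
    __m128i d = _mm_loadu_si128((const __m128i *)stale);
    d = _mm_unpackhi_epi8(d, s);   /* stale bytes in low halves */
    d = _mm_srli_epi16(d, 8);      /* stale bytes shifted out   */
    _mm_storeu_si128((__m128i *)out, d);
}

/* Returns 1 if both variants agree when the destination register
 * is pre-filled with the given byte value instead of zero. */
static int stale_contents_ignored(uint8_t fill)
{
    uint8_t src[16], stale[16];
    uint16_t a[8], b[8];
    for (int i = 0; i < 16; i++) {
        src[i] = (uint8_t)(i * 7 + 3);  /* arbitrary test pattern */
        stale[i] = fill;
    }
    widen_hi_with_zero(src, a);
    widen_hi_no_zero(src, stale, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling stale_contents_ignored() with any fill value, 0xFF for instance, should
return 1 under these assumptions, which is why the extra pxor buys nothing in
such a sequence.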