[x265] [PATCH] denoiseDct: SSE version of asm code

chen chenm003 at 163.com
Wed Sep 17 17:24:54 CEST 2014


 

At 2014-09-17 19:33:16, praveen at multicorewareinc.com wrote:
># HG changeset patch
># User Praveen Tiwari
># Date 1410953432 -19800
># Node ID e919c3dde6bd9a3b74177e48a14e8b151556caee
># Parent  de0b737ed7165b4739128ee430f259ea0f8a5e81
>denoiseDct: SSE version of asm code
>
>+;-----------------------------------------------------------------------------
>+; void denoise_dct(int32_t *dct, uint32_t *sum, uint16_t *offset, int size)
>+;-----------------------------------------------------------------------------
>+INIT_XMM sse4
>+cglobal denoise_dct, 4, 4, 6
>+    pxor     m5,  m5
>+    shr      r3d, 2
>+.loop:
>+    mova     m0, [r0]
>+    pabsd    m1, m0
>+    mova     m2, [r1]
>+    paddd    m2, m1
>+    mova     [r1], m2
>+    movh     m2, [r2]
>+    pmovzxwd m3, m2

pmovzx doesn't require an aligned address, so pmovzxwd can load from [r2] directly (no separate movh needed)
>+    psubd    m1, m3
>+    pcmpgtd  m4, m1, m5
>+    pand     m1, m4
>+    psignd   m1, m0
>+    mova     [r0], m1
>+    add      r0, 16
>+    add      r1, 16
>+    add      r2, 8
>+    dec      r3d
>+    jg .loop

Use jnz here instead of jg.
 
 
This version is similar to the original x264 version; the only differences are ABSD vs. pabsd and PSIGND vs. psignd. Maybe our macros have some issue.
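
For anyone checking the asm against expected results, here is a minimal scalar sketch in C of what the quoted loop appears to compute per coefficient: accumulate |dct[i]| into sum[i], subtract offset[i], clamp at zero, then restore the original sign. The name denoise_dct_ref is hypothetical and this is not the project's C primitive, just my reading of the instructions above. It assumes coefficients never equal INT32_MIN, so negation is safe.

```c
#include <stdint.h>

/* Hypothetical scalar reference for the quoted SSE4 loop (not x265's own code).
 * Assumes dct[i] != INT32_MIN so that negating it is well defined. */
static void denoise_dct_ref(int32_t *dct, uint32_t *sum,
                            const uint16_t *offset, int size)
{
    for (int i = 0; i < size; i++)
    {
        int32_t coef  = dct[i];
        int32_t level = coef < 0 ? -coef : coef;   /* pabsd   m1, m0      */

        sum[i] += (uint32_t)level;                 /* paddd + store to r1 */
        level  -= (int32_t)offset[i];              /* pmovzxwd + psubd    */

        if (level <= 0)                            /* pcmpgtd + pand      */
            level = 0;

        /* psignd m1, m0: negate when coef < 0, zero when coef == 0 */
        if (coef < 0)
            level = -level;
        else if (coef == 0)
            level = 0;

        dct[i] = level;
    }
}
```

Running this and the asm on the same random buffers (with size a multiple of 4, since the asm handles four coefficients per iteration) should give identical dct and sum arrays if the reading above is right.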
 