[x265] [PATCH] denoiseDct: SSE version of asm code
chen
chenm003 at 163.com
Wed Sep 17 17:24:54 CEST 2014
At 2014-09-17 19:33:16, praveen at multicorewareinc.com wrote:
># HG changeset patch
># User Praveen Tiwari
># Date 1410953432 -19800
># Node ID e919c3dde6bd9a3b74177e48a14e8b151556caee
># Parent de0b737ed7165b4739128ee430f259ea0f8a5e81
>denoiseDct: SSE version of asm code
>
>+;-----------------------------------------------------------------------------
>+; void denoise_dct(int32_t *dct, uint32_t *sum, uint16_t *offset, int size)
>+;-----------------------------------------------------------------------------
>+INIT_XMM sse4
>+cglobal denoise_dct, 4, 4, 6
>+ pxor m5, m5
>+ shr r3d, 2
>+.loop:
>+ mova m0, [r0]
>+ pabsd m1, m0
>+ mova m2, [r1]
>+ paddd m2, m1
>+ mova [r1], m2
>+ movh m2, [r2]
>+ pmovzxwd m3, m2
pmovzxwd doesn't require an aligned address, so it can load from [r2] directly without the movh.
>+ psubd m1, m3
>+ pcmpgtd m4, m1, m5
>+ pand m1, m4
>+ psignd m1, m0
>+ mova [r0], m1
>+ add r0, 16
>+ add r1, 16
>+ add r2, 8
>+ dec r3d
>+ jg .loop
jnz can be used here instead of jg.
This version is similar to the original x264 version; the only differences are ABSD vs. pabsd and PSIGND vs. psignd. Maybe our macros have some issue.
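
For comparing the asm output against something simple, here is a minimal scalar sketch of what the loop above appears to compute, written from the quoted instructions only. The name denoise_dct_ref is hypothetical and not part of x265, and the sketch assumes size is a positive multiple of 4 (the asm shifts the count right by 2) and that no coefficient is INT32_MIN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference for the loop in the quoted patch.
 * The name denoise_dct_ref is illustrative and is not part of x265. */
static void denoise_dct_ref(int32_t *dct, uint32_t *sum,
                            const uint16_t *offset, int size)
{
    /* The asm does `shr r3d, 2` and handles 4 coefficients per iteration,
     * so it implicitly assumes size is a positive multiple of 4. */
    assert(size > 0 && size % 4 == 0);

    for (int i = 0; i < size; i++)
    {
        int32_t coef = dct[i];

        /* pabsd: absolute value (INT32_MIN is assumed not to occur) */
        uint32_t level = (uint32_t)(coef < 0 ? -coef : coef);

        /* paddd + store: accumulate the magnitude into sum[] */
        sum[i] += level;

        /* psubd / pcmpgtd / pand: subtract the offset, clamp at zero */
        int32_t reduced = (int32_t)level - (int32_t)offset[i];
        if (reduced < 0)
            reduced = 0;

        /* psignd: restore the sign of the original coefficient */
        dct[i] = coef < 0 ? -reduced : reduced;
    }
}
```

Running both versions on the same input and comparing dct[] and sum[] afterwards should show quickly whether the ABSD/PSIGND macros behave differently from the plain instructions.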