[x265] [PATCH] asm : asm routine for chroma_p2s for 4:4:4 color space format
chen
chenm003 at 163.com
Wed Feb 19 19:18:58 CET 2014
At 2014-02-17 20:44:29, nabajit at multicorewareinc.com wrote:
># HG changeset patch
># User Nabajit Deka
># Date 1392641037 -19800
># Mon Feb 17 18:13:57 2014 +0530
># Node ID f5275ca8f2985bb0daf563738e6071b81967c2cd
># Parent ce96cdb390fe26aee6effa731e51303c1d9056b0
>asm : asm routine for chroma_p2s for 4:4:4 color space format
>
>+INIT_XMM ssse3
>+cglobal chroma_p2s_i444, 3, 7, 4
>+
>+ ; load width and height
>+ mov r3d, r3m
>+ mov r4d, r4m
>+
>+ ; load constant
>+ mova m2, [tab_c_128]
>+ mova m3, [tab_c_64_n64]
>+
>+.loopH:
>+
>+ xor r5d, r5d
>+.loopW:
>+ lea r6, [r0 + r5]
>+
>+ movh m0, [r6]
>+ punpcklbw m0, m2
>+ pmaddubsw m0, m3
>+
>+ movh m1, [r6 + r1]
>+ punpcklbw m1, m2
>+ pmaddubsw m1, m3
>+
>+ add r5d, 8
>+ cmp r5d, r3d
>+ lea r6, [r2 + r5 * 2]
>+ jg .width4
>+ movu [r6 + FENC_STRIDE * 0 - 16], m0
>+ movu [r6 + FENC_STRIDE * 2 - 16], m1
>+ je .nextH
>+ jmp .loopW
>+
>+.width4:
>+ test r3d, 4
>+ jz .width2
>+ test r3d, 2
>+ movh [r6 + FENC_STRIDE * 0 - 16], m0
>+ movh [r6 + FENC_STRIDE * 2 - 16], m1
>+ lea r6, [r6 + 8]
>+ pshufd m0, m0, 2
>+ pshufd m1, m1, 2
>+ jz .nextH
>+
>+.width2:
>+ movd [r6 + FENC_STRIDE * 0 - 16], m0
>+ movd [r6 + FENC_STRIDE * 2 - 16], m1
I think the YUV444 routine does not need the .width2 path; please check and confirm.
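
To make the check easier, here is a minimal C sketch of what the quoted loop appears to compute. It rests on assumptions that are not stated in the patch: an 8-bit build where the intermediate shift is 6 and the offset is 8192, tab_c_128 holding 128 in every byte, and tab_c_64_n64 alternating 64 and -64. The helper names are hypothetical and are not x265 functions.

```c
#include <stdint.h>

/* Assumed 8-bit constants: 14-bit intermediate precision gives a shift of
   14 - 8 = 6 and an offset of 1 << 13 = 8192. */
enum { P2S_SHIFT = 6, P2S_OFFSET = 8192 };

/* Scalar form of the per-pixel conversion. */
static int16_t p2s_pixel(uint8_t px)
{
    return (int16_t)((px << P2S_SHIFT) - P2S_OFFSET);
}

/* One pmaddubsw lane after punpcklbw: the unsigned byte pair (px, 128)
   multiplied by the signed byte pair (64, -64) and summed.
   Table contents are assumed from their names. */
static int16_t p2s_pmaddubsw_lane(uint8_t px)
{
    return (int16_t)(px * 64 + 128 * -64);
}

/* Columns left over for the .width4 / .width2 tails once the full
   8-pixel stores are done. */
static int tail_columns(int width)
{
    return width & 7;
}
```

If 4:4:4 chroma block widths are always multiples of 4 (an assumption to confirm against the callers), tail_columns() can only return 0 or 4, so the `test r3d, 2` bit is never set and the .width2 stores would be dead code.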