[x265] [PATCH] asm : asm routine for chroma_p2s for 4:4:4 color space format

chen chenm003 at 163.com
Wed Feb 19 19:18:58 CET 2014


At 2014-02-17 20:44:29,nabajit at multicorewareinc.com wrote:
># HG changeset patch
># User Nabajit Deka
># Date 1392641037 -19800
>#      Mon Feb 17 18:13:57 2014 +0530
># Node ID f5275ca8f2985bb0daf563738e6071b81967c2cd
># Parent  ce96cdb390fe26aee6effa731e51303c1d9056b0
>asm : asm routine for chroma_p2s for 4:4:4 color space format
>
>+INIT_XMM ssse3
>+cglobal chroma_p2s_i444, 3, 7, 4
>+
>+    ; load width and height
>+    mov         r3d, r3m
>+    mov         r4d, r4m
>+
>+    ; load constant
>+    mova        m2, [tab_c_128]
>+    mova        m3, [tab_c_64_n64]
>+
>+.loopH:
>+
>+    xor         r5d, r5d
>+.loopW:
>+    lea         r6, [r0 + r5]
>+
>+    movh        m0, [r6]
>+    punpcklbw   m0, m2
>+    pmaddubsw   m0, m3
>+
>+    movh        m1, [r6 + r1]
>+    punpcklbw   m1, m2
>+    pmaddubsw   m1, m3
>+
>+    add         r5d, 8
>+    cmp         r5d, r3d
>+    lea         r6, [r2 + r5 * 2]
>+    jg          .width4
>+    movu        [r6 + FENC_STRIDE * 0 - 16], m0
>+    movu        [r6 + FENC_STRIDE * 2 - 16], m1
>+    je          .nextH
>+    jmp         .loopW
>+
>+.width4:
>+    test        r3d, 4
>+    jz          .width2
>+    test        r3d, 2
>+    movh        [r6 + FENC_STRIDE * 0 - 16], m0
>+    movh        [r6 + FENC_STRIDE * 2 - 16], m1
>+    lea         r6, [r6 + 8]
>+    pshufd      m0, m0, 2
>+    pshufd      m1, m1, 2
>+    jz          .nextH
>+
>+.width2:
>+    movd        [r6 + FENC_STRIDE * 0 - 16], m0
>+    movd        [r6 + FENC_STRIDE * 2 - 16], m1
I think YUV444 does not need the width2 path; please check and confirm.
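
For reference while checking, here is a minimal scalar sketch in C of what the quoted loop appears to compute. It assumes tab_c_128 holds bytes of 128 and tab_c_64_n64 holds alternating 64 / -64 bytes, as their names suggest, so that punpcklbw + pmaddubsw yields pixel * 64 + 128 * (-64) = (pixel << 6) - 8192 for 8-bit input. The function name, signature and constant names below are hypothetical, not the x265 API. If 4:4:4 chroma blocks keep the luma block dimensions and the smallest width is 4 (an assumption to confirm), only the 8-wide and 4-wide stores are ever reached.

```c
#include <stdint.h>

/* Hypothetical constants for 8-bit input: shift to 14-bit internal
 * precision, then subtract the matching half-range offset. */
enum { P2S_SHIFT = 6, P2S_OFFSET = 8192 };

/* Hypothetical scalar reference for the pixel-to-short conversion.
 * Strides are in elements of the respective buffer. */
static void chroma_p2s_ref(const uint8_t *src, intptr_t src_stride,
                           int16_t *dst, intptr_t dst_stride,
                           int width, int height)
{
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            /* Same arithmetic as pixel * 64 + 128 * (-64). */
            dst[x] = (int16_t)((src[x] << P2S_SHIFT) - P2S_OFFSET);
        }
        src += src_stride;
        dst += dst_stride;
    }
}
```

Comparing the asm output against a loop like this for widths 4, 8, 12 and 16 should be enough to confirm whether the width2 tail is ever needed.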
 
 