[x265] [PATCH 1 of 3] asm: routines for chroma vsp filter functions for all block sizes

Nabajit Deka nabajit at multicorewareinc.com
Tue Nov 12 12:51:08 CET 2013


Thanks. I will modify this part in the next commit.


On Tue, Nov 12, 2013 at 5:05 PM, chen <chenm003 at 163.com> wrote:

>
> >+;-------------------------------------------------------------------------------------------------------------------
> >+; void interp_4tap_vertical_sp_%1x%2(int16_t *src, intptr_t srcStride, pixel *dst, intptr_t dstStride, int coeffIdx)
> >+;-------------------------------------------------------------------------------------------------------------------
> >+%macro FILTER_VER_CHROMA_SP_W2_4R 2
> >+INIT_XMM ssse3
> >+cglobal interp_4tap_vert_sp_%1x%2, 5, 7, 6
> >+
> >+    add       r1d, r1d
> >+    sub       r0, r1
> >+    shl       r4d, 5
> >+
> >+%ifdef PIC
> >+    lea       r5, [tab_ChromaCoeffV]
> >+    lea       r6, [r5 + r4]
> >+%else
> >+    lea       r6, [tab_ChromaCoeffV + r4]
> >+%endif
> >+
> >+    mova      m5, [tab_c_526336]
> >+
> >+    mov       r4d, (%2/4)
> >+
> >+.loopH
> >+    PROCESS_CHROMA_SP_W2_4R
> >+
> >+    paddd     m0, m5
> >+    paddd     m2, m5
> >+
> >+    psrad     m0, 12
> >+    psrad     m2, 12
> >+
> >+    packssdw  m0, m2
> >+    packuswb  m0, m0
> >+
> >+    pextrw    [r2], m0, 0
> pextrw with a memory destination is an SSE4.1 instruction, but this
> function is declared with INIT_XMM ssse3.
>
> >+    pextrw    [r2 + r3], m0, 1
> >+    pextrw    [r2 + 2 * r3], m0, 2
> >+    lea       r2, [r2 + 2 * r3]
> >+    pextrw    [r2 + r3], m0, 3
>
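
For reference, the arithmetic in the quoted loop (paddd with tab_c_526336,
psrad by 12, then packssdw/packuswb) can be modelled in scalar C. This is
only a sketch under assumptions, not the x265 C primitive: it assumes 8-bit
pixels, 14-bit intermediates with an offset of 8192, and 6-bit coefficient
precision, which is what makes the rounding constant come out as 526336.
The names chroma_vsp_ref and round_shift_clip are made up for illustration,
and the coefficients are passed in directly instead of being looked up from
coeffIdx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 8-bit configuration; these names are illustrative only. */
enum {
    FILTER_PREC   = 6,                                /* coefficients sum to 64 */
    INTERNAL_PREC = 14,                               /* int16_t intermediate   */
    BIT_DEPTH     = 8,
    INTERNAL_OFFS = 1 << (INTERNAL_PREC - 1),         /* 8192 */
    SHIFT         = FILTER_PREC + (INTERNAL_PREC - BIT_DEPTH),           /* 12 */
    OFFSET        = (1 << (SHIFT - 1)) + (INTERNAL_OFFS << FILTER_PREC)  /* 526336 */
};

/* The constant name in the patch matches this derivation. */
static_assert(OFFSET == 526336, "rounding constant mismatch");
static_assert(SHIFT == 12, "shift mismatch");

/* paddd + psrad + packssdw + packuswb collapse to: add, shift, clip to 0..255.
 * Note: >> on a negative int is assumed to be an arithmetic shift here,
 * as psrad is; that holds on common compilers but is implementation-defined. */
static uint8_t round_shift_clip(int32_t sum)
{
    int32_t v = (sum + OFFSET) >> SHIFT;
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Scalar model of a 4-tap vertical "sp" (short in, pixel out) filter.
 * Strides are in elements, not bytes. */
static void chroma_vsp_ref(const int16_t *src, ptrdiff_t srcStride,
                           uint8_t *dst, ptrdiff_t dstStride,
                           const int16_t coeff[4], int width, int height)
{
    src -= srcStride;  /* the filter window starts one row above (sub r0, r1) */
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int32_t sum = coeff[0] * src[x]
                        + coeff[1] * src[x + srcStride]
                        + coeff[2] * src[x + 2 * srcStride]
                        + coeff[3] * src[x + 3 * srcStride];
            dst[x] = round_shift_clip(sum);
        }
        src += srcStride;
        dst += dstStride;
    }
}
```

With these assumptions a flat area round-trips: an intermediate of
64 * p - 8192 in every row gives back pixel value p for any coefficient set
that sums to 64. For the W2 case each output row is two bytes, which is what
each pextrw store in the patch writes.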