[x265] [PATCH] asm: interp_8tap_vert_pX sse2

dave dtyx265 at gmail.com
Fri May 29 20:53:54 CEST 2015


FYI, if what I submitted performs better than the sse4 code, then I
suggest either improving the sse4 code with ssse3 and sse4 instructions
or removing it.

On 05/29/2015 10:12 AM, chen wrote:
> Right, thanks.
>
> At 2015-05-30 01:01:15,dtyx265 at gmail.com wrote:
> ># HG changeset patch
> ># User David T Yuen <dtyx265 at gmail.com>
> ># Date 1432917446 25200
> ># Node ID 2d5efe979f6b9c8db275ecb53767e4bcff1da659
> ># Parent  12f0ed28ba0eb29f2df0bb8adbc5f3cfb40a6361
> >asm: interp_8tap_vert_pX sse2
> >
> >This code replaces the C code for sse2.  It combines the sse4 macros into one
> >for smaller code size, with no sacrifice in function, plus a few tweaks for
> >performance.  The original sse4 macros only use up to sse2 instructions, so this
> >code may perform better with the tweaks.  The tweaks include unrolling the inner
> >loop, which eliminated the need to use the stack to hold the counter for one of
> >the loops, and replacing increments of the source register with address offsets.
> >

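For anyone following the thread, the two tweaks named in the commit message
(unrolling the inner tap loop, and using fixed address offsets in place of
incrementing the source pointer) can be sketched in plain C. This is only an
illustration under stated assumptions, not the submitted assembly: the
function names, the clip_u8 helper, and the (sum + 32) >> 6 rounding for
8-bit output are all made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp a filtered value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Rolled form: an inner loop over the 8 taps that needs its own counter
 * and advances the source pointer by one row per tap. */
static void vert8_rolled(const uint8_t *src, ptrdiff_t stride,
                         uint8_t *dst, ptrdiff_t dst_stride,
                         int width, int height, const int16_t coeff[8])
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            const uint8_t *p = src + x - 3 * stride; /* first tap row */
            int sum = 0;
            for (int t = 0; t < 8; t++) {            /* extra loop counter */
                sum += coeff[t] * *p;
                p += stride;                         /* pointer increment */
            }
            /* Assumes arithmetic right shift for negative sums. */
            dst[x] = clip_u8((sum + 32) >> 6);
        }
        src += stride;
        dst += dst_stride;
    }
}

/* Unrolled form: no tap counter, and every row is reached through a
 * fixed offset from one base pointer that is never modified per tap. */
static void vert8_unrolled(const uint8_t *src, ptrdiff_t stride,
                           uint8_t *dst, ptrdiff_t dst_stride,
                           int width, int height, const int16_t coeff[8])
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            const uint8_t *p = src + x;
            int sum = coeff[0] * p[-3 * stride] + coeff[1] * p[-2 * stride]
                    + coeff[2] * p[-1 * stride] + coeff[3] * p[0]
                    + coeff[4] * p[1 * stride]  + coeff[5] * p[2 * stride]
                    + coeff[6] * p[3 * stride]  + coeff[7] * p[4 * stride];
            dst[x] = clip_u8((sum + 32) >> 6);
        }
        src += stride;
        dst += dst_stride;
    }
}
```

Both functions read three rows above and four rows below each output row, so
the caller has to provide that padding. In the rolled form the tap counter is
one more live value, which is the kind of thing that ends up on the stack
when registers run short; the unrolled form drops it and leaves the base
pointer untouched between taps.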