[x265] [PATCH 1 of 2] asm:intra pred planar32 sse2

chen chenm003 at 163.com
Thu Mar 12 23:55:25 CET 2015


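pxor with the same register as both operands doesn't issue any uops to the execution ports (recent Intel CPUs handle it at register rename), and m7 is only a temporary in your macro, so it is free to use at that point.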
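
To make the comparison concrete, here is a minimal sketch in C with SSE2 intrinsics. It is not the patch code. It assumes the zero is used to widen bytes to words with punpcklbw, and zero_const, widen_mem_zero, widen_reg_zero and widen_selfcheck are hypothetical names; zero_const only stands in for a memory constant like [pb_0].

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical stand-in for an in-memory zero constant such as [pb_0]. */
static const uint8_t zero_const[16] = {0};

/* Widen 8 bytes to 8 words, taking the zero from memory. */
static void widen_mem_zero(const uint8_t *src, uint16_t *dst)
{
    __m128i zero = _mm_loadu_si128((const __m128i *)zero_const);
    __m128i px   = _mm_loadl_epi64((const __m128i *)src);
    /* punpcklbw: interleave pixel bytes with zero bytes */
    _mm_storeu_si128((__m128i *)dst, _mm_unpacklo_epi8(px, zero));
}

/* Same operation, with the zero created in a register.
 * Compilers typically emit pxor reg,reg for _mm_setzero_si128(). */
static void widen_reg_zero(const uint8_t *src, uint16_t *dst)
{
    __m128i zero = _mm_setzero_si128();
    __m128i px   = _mm_loadl_epi64((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, _mm_unpacklo_epi8(px, zero));
}

/* Check that both variants produce identical, correct results. */
static void widen_selfcheck(void)
{
    const uint8_t src[8] = {0, 1, 2, 127, 128, 200, 254, 255};
    uint16_t a[8], b[8];
    widen_mem_zero(src, a);
    widen_reg_zero(src, b);
    for (int i = 0; i < 8; i++) {
        assert(a[i] == src[i]);
        assert(b[i] == src[i]);
    }
}
```

Both variants give the same result; they differ only in where the zero comes from, which is the trade-off discussed in this thread (one spare register versus a memory operand).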

At 2015-03-13 06:40:10,dave <dtyx265 at gmail.com> wrote:
>On 03/12/2015 03:16 PM, chen wrote:
>> I used 'pxor m7,m7' to replace your [pb_0], but IACA reports the same
>> cycle count; the bottleneck is on Port 0.
>> I'm not sure how it performs on older CPUs.
>I would have used something like that, but there are no registers
>available by that point.  On x86_64 they are all used up holding other
>constants (pw_planar..), and on x86_32 there just aren't enough.
>Performance on my old CPU seems unaffected by whether the constants
>come from registers or from memory.