[x265] [PATCH 1 of 2] asm:intra pred planar32 sse2
chen
chenm003 at 163.com
Thu Mar 12 23:55:25 CET 2015
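'pxor m7, m7' is a zeroing idiom that doesn't consume execution uops on recent CPUs, and m7 is only a temporary in your macro, so it should be available at that point.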
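For illustration only, here is a rough sketch of the two options being compared, written in C with SSE2 intrinsics rather than the actual x265 assembly. The names zero_bytes, widen_lo_mem, widen_lo_reg and widen_paths_agree are hypothetical, and using the zero for a byte-to-word unpack is an assumption about how [pb_0] is consumed in the macro.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an all-zero byte constant such as [pb_0]. */
static const uint8_t zero_bytes[16] = {0};

/* Option 1: widen the low 8 bytes to 16-bit words with zero loaded from memory. */
static __m128i widen_lo_mem(__m128i pixels)
{
    __m128i zero = _mm_loadu_si128((const __m128i *)zero_bytes);
    return _mm_unpacklo_epi8(pixels, zero);
}

/* Option 2: same widening, with zero produced in a register
 * (compilers typically emit pxor xmm, xmm for _mm_setzero_si128). */
static __m128i widen_lo_reg(__m128i pixels)
{
    __m128i zero = _mm_setzero_si128();
    return _mm_unpacklo_epi8(pixels, zero);
}

/* Returns 1 if both options give identical results for 16 input bytes. */
static int widen_paths_agree(const uint8_t *src)
{
    assert(src != NULL);
    __m128i pixels = _mm_loadu_si128((const __m128i *)src);
    __m128i a = widen_lo_mem(pixels);
    __m128i b = widen_lo_reg(pixels);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Both functions compute the same values; the only difference is whether the zero occupies a register or comes from memory, which is the trade-off discussed in this thread.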
At 2015-03-13 06:40:10,dave <dtyx265 at gmail.com> wrote:
>On 03/12/2015 03:16 PM, chen wrote:
>> I used 'pxor m7, m7' to replace your [pb_0], but it takes the same
>> number of cycles in IACA; the bottleneck is on Port 0.
>> I'm not sure how it performs on older CPUs.
>I would have used something like that, but there are no available
>registers by that point. On x86_64 they are used up holding other
>constants (pw_planar..), and on x86_32 there just aren't enough.
>Performance on my old CPU seems unaffected by whether the constants
>are in registers or loaded from memory.