[x265] [PATCH 1 of 2] asm:intra pred planar32 sse2

dave dtyx265 at gmail.com
Thu Mar 12 23:40:10 CET 2015


On 03/12/2015 03:16 PM, chen wrote:
> I used 'pxor m7,m7' to replace your [pb_0], but it takes the same number
> of cycles in IACA; the bottleneck is on Port0.
> Not sure about performance on older CPUs.
I would have used something like that, but there are no registers
available by that point.  On x86_64 they are all used up holding other
constants (pw_planar..), and on x86_32 there just aren't enough to begin
with.  Performance on my old CPU seems unaffected by whether the
constants are held in registers or read from memory.
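
For readers following the thread, here is a minimal C sketch of the two
ways of getting a zero operand discussed above: a register cleared in
place (what 'pxor m7,m7' does) versus a zero constant read from memory
(the role [pb_0] plays).  It is not the code from the patch.  The names
widen_zero_reg, widen_zero_mem and zero_mem are hypothetical, and it
assumes the zero is used to widen 8-bit pixels to 16 bits with
punpcklbw, which is a common use but is not stated in this message.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical stand-in for a zero constant kept in memory (like pb_0). */
static const uint8_t zero_mem[16] = { 0 };

/* Widen 8 bytes to 8 words using a zero held in a register.
 * _mm_setzero_si128() is normally emitted as pxor reg,reg. */
static void widen_zero_reg(const uint8_t *src, uint16_t *dst)
{
    __m128i zero = _mm_setzero_si128();
    __m128i v    = _mm_loadl_epi64((const __m128i *)src);   /* low 8 bytes */
    _mm_storeu_si128((__m128i *)dst, _mm_unpacklo_epi8(v, zero));
}

/* Same operation, but the zero comes from memory, so no register has
 * to be reserved for it ahead of time. */
static void widen_zero_mem(const uint8_t *src, uint16_t *dst)
{
    __m128i zero = _mm_loadu_si128((const __m128i *)zero_mem);
    __m128i v    = _mm_loadl_epi64((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, _mm_unpacklo_epi8(v, zero));
}
```

Both functions produce identical results, and an optimizing compiler
may well emit the same instructions for both.  The distinction only
really matters in hand-written asm, where every xmm register is
allocated by hand and a memory operand can stand in for a register
that is not free.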

