[x265] [PATCH 1 of 2] asm:intra pred planar32 sse2
dave
dtyx265 at gmail.com
Thu Mar 12 23:40:10 CET 2015
On 03/12/2015 03:16 PM, chen wrote:
> I use 'pxor m7,m7' to replace your [pb_0], but it is same cycles in
> IACA, the bottleneck on Port0
> Not sure how about performance on old CPU
I would have used something like that, but there are no registers
available by that point. They are all used up holding other
constants (pw_planar..) in the case of x86_64, and there just aren't
enough in x86_32. Performance on my old CPU seems unaffected by whether
the constants are held in registers or read from memory.