[x264-devel] question in pixel_avg2_w20_sse2

BugMaster BugMaster at narod.ru
Mon Jul 28 21:20:42 CEST 2014


On Tue, 29 Jul 2014 02:40:00 +0800 (CST), chen wrote:
> pixel_avg2_w20_sse2 mixes XMM0-XMM4 with MM4-MM5, and MM4-MM5 are not saved and restored.
> I checked the ABI document; it only says that the Microsoft compiler doesn't use MM0-MM7.
> Is it a bug?
>  
> btw: I know that MM_ is twice as fast as XMM_ on old CPUs, but on the
> latest CPUs it is the same speed or slower.
>  
> Min
>   
Hi. I don't fully understand what your actual question is or what you
see as the bug here. Yes, this function uses a mix of SSE2 and MMX
instructions and registers, because we don't need a full-length XMM
register for the rest of this width (16+4) and we need unaligned memory
access here. As for the calling ABI, all MMX registers (mm0-mm7) are
volatile and must be considered destroyed across function calls.
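
To illustrate the 16+4 split, here is a rough C sketch, not the actual
x264 assembly. The name avg2_w20_sketch and its signature are made up for
this example. It assumes the operation is the per-byte rounding average
(a + b + 1) >> 1 that pavgb computes, and it keeps the 4-byte tail in an
XMM register instead of MM4/MM5, since some 64-bit compilers do not offer
MMX intrinsics. A scalar fallback is used when SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not x264's real function): rounding average of two
 * 20-pixel-wide blocks, one row at a time. */
static void avg2_w20_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src1, const uint8_t *src2,
                            ptrdiff_t src_stride, int height)
{
    for (int y = 0; y < height; y++) {
#if defined(__SSE2__)
        /* First 16 pixels: one unaligned 128-bit load per source. */
        __m128i a = _mm_loadu_si128((const __m128i *)src1);
        __m128i b = _mm_loadu_si128((const __m128i *)src2);
        _mm_storeu_si128((__m128i *)dst, _mm_avg_epu8(a, b));

        /* Remaining 4 pixels: a 32-bit load is enough, so a second
         * full-width load is not needed. memcpy keeps the access
         * alignment-safe in C. */
        int32_t ta, tb, tr;
        memcpy(&ta, src1 + 16, 4);
        memcpy(&tb, src2 + 16, 4);
        tr = _mm_cvtsi128_si32(_mm_avg_epu8(_mm_cvtsi32_si128(ta),
                                            _mm_cvtsi32_si128(tb)));
        memcpy(dst + 16, &tr, 4);
#else
        /* Portable fallback with the same rounding. */
        for (int x = 0; x < 20; x++)
            dst[x] = (uint8_t)((src1[x] + src2[x] + 1) >> 1);
#endif
        dst  += dst_stride;
        src1 += src_stride;
        src2 += src_stride;
    }
}
```

Doing the tail in MMX registers, as the real function does, gives the same
result for those 4 bytes; the caller then simply has to treat mm0-mm7 as
clobbered, as described above.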


