[x264-devel] question in pixel_avg2_w20_sse2
BugMaster
BugMaster at narod.ru
Mon Jul 28 21:20:42 CEST 2014
On Tue, 29 Jul 2014 02:40:00 +0800 (CST), chen wrote:
> In pixel_avg2_w20_sse2, XMM0-XMM4 and MM4-MM5 are used together, but MM4-MM5 are not saved and restored.
> I checked the ABI document; it only says that the Microsoft compiler doesn't use MM0-MM7.
> Is this a bug?
>
> btw: I know MM_ is twice as fast as XMM_ on old CPUs, but on the
> latest CPUs it is the same speed or slower.
>
> Min
>
Hi. I don't fully understand what your actual question is or what you
see as a bug here. Yes, this function uses a mix of SSE2 and MMX
instructions/registers, because we don't need a full-length XMM register
for the 4 pixels left over at this width (16+4) and we need unaligned
memory access here. As for the calling ABI, all MMX regs (mm0-mm7) are
volatile and must be considered destroyed across function calls, so the
function does not have to save and restore them.
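
To illustrate the 16+4 split described above, here is a rough C sketch using SSE2 intrinsics. It is not the x264 assembly: the names avg2_w20_ref and avg2_w20_split are made up for this example, it handles a single row only, and it keeps the 4-byte tail in the low part of an XMM register instead of an MMX register. It assumes the rounded average (a + b + 1) >> 1, which is what PAVGB computes per byte.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Scalar reference (hypothetical name): rounded average of two
 * 20-pixel rows, matching what PAVGB does for each byte. */
static void avg2_w20_ref(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int x = 0; x < 20; x++)
        dst[x] = (uint8_t)((a[x] + b[x] + 1) >> 1);
}

/* Same result, split as 16 + 4 bytes (hypothetical name). */
static void avg2_w20_split(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
#if HAVE_SSE2
    /* First 16 pixels: one unaligned 128-bit load per source. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_avg_epu8(va, vb));

    /* Remaining 4 pixels: only 32 bits are needed, so a full
     * 128-bit load would read past the end of the row. */
    int32_t ta, tb, tr;
    memcpy(&ta, a + 16, 4);
    memcpy(&tb, b + 16, 4);
    tr = _mm_cvtsi128_si32(_mm_avg_epu8(_mm_cvtsi32_si128(ta),
                                        _mm_cvtsi32_si128(tb)));
    memcpy(dst + 16, &tr, 4);
#else
    /* No SSE2 available: fall back to the scalar reference. */
    avg2_w20_ref(dst, a, b);
#endif
}
```

Both functions should produce identical output for any pair of 20-byte rows; the split version just shows why a narrower register is enough for the tail.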