[x264-devel] Re: [patch] SSE2 pixel routines - new patch!

Loren Merritt lorenm at u.washington.edu
Wed Jul 27 16:58:20 CEST 2005


On Tue, 26 Jul 2005, Alexander Izvorski wrote:
>
> The SSE2 patch is getting there, I'd like to propose this version as a
> candidate to be committed.  It is the same speed or slightly slower
> on Athlon64, but noticeably faster on P4 or Xeon.  Some benchmark
> results are here: http://www.firstmiletv.nl/vlc/x264/ (courtesy of
> Trax).
>
> I reorganized it into a separate file, would you prefer that or would
> you rather have it in the same file?

I would rather have it in the same file if that allows any macro reuse;
if not, I don't care.

> %macro SAD_INC_4x16P_SSE2 0
>     movdqu  xmm1,   [ecx]
>     movdqu  xmm2,   [ecx+ebx]

Shouldn't that be [ecx+edx]?
It happens to work as-is because SAD is usually called on two buffers
with the same stride (not true with --subme 1).
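
To illustrate the stride point, here is a minimal scalar sketch in C (not
the patch's SSE2 code). It assumes, as the comment above implies, that ebx
holds the stride of the first buffer and edx the stride of the second. The
function names sad_ref and sad_wrong_stride are made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference SAD over a w x h block.
 * Each buffer is advanced by its own stride after every row. */
static int sad_ref(const uint8_t *pix1, int stride1,
                   const uint8_t *pix2, int stride2, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride2;
    }
    return sum;
}

/* Same loop with the suspected bug: pix2 is advanced by stride1
 * (the analogue of addressing [ecx+ebx] instead of [ecx+edx],
 * under the register assumption stated above). */
static int sad_wrong_stride(const uint8_t *pix1, int stride1,
                            const uint8_t *pix2, int stride2, int w, int h)
{
    (void)stride2; /* deliberately ignored to model the bug */
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride1; /* wrong whenever stride1 != stride2 */
    }
    return sum;
}
```

When both strides are equal the two functions return the same value, which
is why the mistake can go unnoticed. They diverge as soon as the second
buffer has a different stride, since every row after the first is then read
from the wrong offset.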

--Loren Merritt
