[x264-devel] Re: [patch] SSE2 pixel routines - new patch!
Loren Merritt
lorenm at u.washington.edu
Wed Jul 27 16:58:20 CEST 2005
On Tue, 26 Jul 2005, Alexander Izvorski wrote:
>
> The SSE2 patch is getting there, I'd like to propose this version as a
> candidate to be committed. It is the same speed or slightly slower
> on Athlon64, but noticeably faster on P4 or Xeon. Some benchmark
> results are here: http://www.firstmiletv.nl/vlc/x264/ (courtesy of
> Trax).
>
> I reorganized it into a separate file, would you prefer that or would
> you rather have it in the same file?
I would rather have it in the same file if that allows any macro reuse;
if not, I don't care.
> %macro SAD_INC_4x16P_SSE2 0
> movdqu xmm1, [ecx]
> movdqu xmm2, [ecx+ebx]
Shouldn't that be [ecx+edx]?
It happens to work as-is because SAD is usually called on two buffers
with the same stride (not true with --subme 1).
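
To illustrate why the wrong stride register goes unnoticed, here is a rough C sketch (not the x264 code). It assumes ebx holds the stride of the first buffer and edx the stride of the second, which the quoted snippet does not show. The function names and the fill_pattern helper are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reference 16x16 SAD: each buffer advances by its own stride. */
static int sad_16x16(const uint8_t *pix1, int stride1,
                     const uint8_t *pix2, int stride2)
{
    int sum = 0;
    assert(stride1 >= 16 && stride2 >= 16);
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride2;
    }
    return sum;
}

/* Same loop with the suspected bug: pix2 is stepped by stride1,
 * the C analogue of addressing [ecx+ebx] instead of [ecx+edx]. */
static int sad_16x16_wrong_stride(const uint8_t *pix1, int stride1,
                                  const uint8_t *pix2, int stride2)
{
    int sum = 0;
    (void)stride2; /* never used, which is the bug */
    assert(stride1 >= 16);
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride1;
    }
    return sum;
}

/* Hypothetical helper: write a 16x16 block of position-dependent
 * values into a buffer with the given stride. */
static void fill_pattern(uint8_t *buf, int stride, int seed)
{
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            buf[y * stride + x] = (uint8_t)(y * 8 + x + seed);
}
```

With equal strides the two functions return the same value, so a test that only compares same-stride buffers would pass. They diverge as soon as the second buffer has a different stride.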
--Loren Merritt