[x264-devel] Re: Compilation problem with x264 on Dual Opteron setup (SSE3)
Loren Merritt
lorenm at u.washington.edu
Mon May 1 23:15:11 CEST 2006
On Mon, 1 May 2006, Guillaume POIRIER wrote:
> The main problem with that patch is that it unconditionally replaces every
> movdqu with lddqu, which isn't very smart. The Intel optimization guide
> states quite clearly that this is not how it should be done.
> What should be done instead: instrument the code so that it tells you
> which loads are always badly unaligned, and use lddqu only
> in those cases (loads that are sometimes aligned and sometimes not do not
> benefit from lddqu).
All the variants of SAD are unaligned; SATD and SSD are usually aligned.
--Loren Merritt
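
For anyone who wants to check this on their own streams, the instrumentation suggested in the quoted text could look roughly like the sketch below. It is not x264 code: load_site_stats, record_load, always_unaligned and report_site are hypothetical names made up for illustration, and the "lddqu candidate" label only restates the rule of thumb from the quoted message (use lddqu where loads are always unaligned), it is not a measured result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-call-site counters; not part of x264. */
typedef struct {
    const char   *name;      /* label for the load site, e.g. "sad ref" */
    unsigned long aligned;   /* loads whose address was 16-byte aligned */
    unsigned long unaligned; /* loads whose address was not */
} load_site_stats;

/* Record the alignment of the address used by one 16-byte load. */
static void record_load(load_site_stats *site, const void *addr)
{
    assert(site != NULL);
    if (((uintptr_t)addr & 15) == 0)
        site->aligned++;
    else
        site->unaligned++;
}

/* 1 if at least one load was seen and none of them was aligned. */
static int always_unaligned(const load_site_stats *site)
{
    assert(site != NULL);
    return site->unaligned > 0 && site->aligned == 0;
}

/* Print the counters and the choice the quoted rule of thumb implies. */
static void report_site(const load_site_stats *site)
{
    printf("%-12s aligned=%lu unaligned=%lu -> %s\n",
           site->name, site->aligned, site->unaligned,
           always_unaligned(site) ? "lddqu candidate" : "keep movdqu");
}
```

The idea would be to keep one load_site_stats per load site (say, one for the SAD reference load and one for the SATD load), call record_load() with the address right before each 16-byte load in a C reference path, and call report_site() for each site after encoding a few representative clips.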