[x264-devel] Re: Compilation problem with x264 on Dual Opteron setup (SSE3)

Loren Merritt lorenm at u.washington.edu
Mon May 1 23:15:11 CEST 2006


On Mon, 1 May 2006, Guillaume POIRIER wrote:
> The main problem with that patch is that it unconditionally replaces every
> movdqu with lddqu, which isn't very smart. The Intel optimization guide
> states quite clearly that this is not how it should be done.
> What should be done is to instrument the code so that it tells you
> which loads are always badly unaligned, and use lddqu only
> in those cases (loads that are sometimes aligned and sometimes not do not
> benefit from lddqu).

All the variants of SAD are unaligned; SATD and SSD are usually aligned.
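
For illustration only, the kind of instrumentation described in the quoted
message could be sketched in C roughly as follows. Nothing here is x264 code:
load_stats_t, record_load and always_unaligned are hypothetical names, and the
64-byte cache line size is an assumption. The idea is to keep one counter set
per load site, call record_load next to each 16-byte load during a test
encode, and afterwards treat only the sites that were never aligned as
candidates for lddqu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-load-site counters; not part of x264. */
typedef struct {
    uint64_t aligned;     /* address was a multiple of 16 */
    uint64_t unaligned;   /* address was not a multiple of 16 */
    uint64_t line_split;  /* 16-byte load crossed a cache line boundary */
} load_stats_t;

/* Record one 16-byte load from address p.
 * Assumption: cache lines are 64 bytes, so a load that starts at an
 * offset greater than 48 within a line spills into the next line. */
static void record_load(load_stats_t *s, const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    assert(s != NULL);
    if ((addr & 15) == 0)
        s->aligned++;
    else
        s->unaligned++;
    if ((addr & 63) > 48)
        s->line_split++;
}

/* Returns 1 if the site was exercised and never saw an aligned load.
 * Sites that are only sometimes unaligned return 0 and keep movdqu. */
static int always_unaligned(const load_stats_t *s)
{
    assert(s != NULL);
    return s->unaligned > 0 && s->aligned == 0;
}
```

With counters like these, one set per load in the SAD, SATD and SSD routines
would be enough to check the alignment behaviour on a given stream.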

--Loren Merritt



