[x264-devel] Re: Compilation problem with x264 on Dual Opteron setup (SSE3)

Guillaume POIRIER poirierg at gmail.com
Tue May 2 22:15:47 CEST 2006


Hi,

On 5/1/06, Loren Merritt <lorenm at u.washington.edu> wrote:
> On Mon, 1 May 2006, Guillaume POIRIER wrote:
> > The main problem with that patch is that it unconditionally replaces all
> > movdqu with lddqu, which isn't very smart. The Intel optimization guide
> > states quite clearly that this is not how it should be done.
> > What should be done is: instrument the code so that it tells you
> > which loads are always badly unaligned, and use lddqu only
> > in those cases (loads that are sometimes aligned and sometimes not do
> > not benefit from lddqu).
>
> All the variants of SAD are unaligned; SATD and SSD are usually aligned.

Okay, here's an updated version of the patch that uses lddqu only in
the SAD routines: http://tuxrip.free.fr/transperl/MPlayer/SSE3_lddqu.3.diff

Please test and report whether it helps at all (I doubt it will).
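
The instrumentation idea quoted above could be sketched roughly as
follows. This is a minimal illustration in plain C, not code from x264
or from the patch: load_stats, record_load and always_unaligned are
hypothetical names, and the sketch assumes that "unaligned" means "not
a multiple of 16 bytes", the natural boundary for 128-bit movdqu/lddqu
loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call-site counters (assumption: one instance per
 * load site that is a candidate for lddqu). Not part of x264. */
typedef struct {
    const char   *name;       /* label for the load site, e.g. "sad" */
    unsigned long aligned;    /* loads whose address was a multiple of 16 */
    unsigned long unaligned;  /* loads whose address was not */
} load_stats;

/* Record the alignment of one load address for the given site. */
static void record_load(load_stats *s, const void *p)
{
    assert(s != NULL);
    if (((uintptr_t)p & 15) == 0)
        s->aligned++;
    else
        s->unaligned++;
}

/* A site qualifies for lddqu only if every observed load was unaligned.
 * Sites that are sometimes aligned and sometimes not are rejected. */
static int always_unaligned(const load_stats *s)
{
    assert(s != NULL);
    return s->unaligned > 0 && s->aligned == 0;
}
```

One would call record_load() just before each candidate load during a
representative encode, then switch to lddqu only at the sites for which
always_unaligned() returns 1.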

Guillaume
--
I am disillusioned enough to know that no man's opinion on any subject
is worth a damn unless backed up with enough genuine information to
make him really know what he's talking about.

-- H. P. Lovecraft (about the flamewars on FFmpeg and MPlayer-dev mailing lists)
http://www.brainyquote.com/quotes/quotes/h/hplovecr278144.html



