[x264-devel] Re: Compilation problem with x264 on Dual Opteron setup (SSE3)
Guillaume POIRIER
poirierg at gmail.com
Tue May 2 22:15:47 CEST 2006
Hi,
On 5/1/06, Loren Merritt <lorenm at u.washington.edu> wrote:
> On Mon, 1 May 2006, Guillaume POIRIER wrote:
> > The main problem with that patch is that it unconditionally replaces
> > every movdqu with lddqu, which isn't very smart. Intel's optimization
> > guide states quite clearly that this is not how it should be done.
> > What should be done is: instrument the code so that it tells you
> > which loads are always badly unaligned, and use lddqu only in those
> > cases (loads that are sometimes aligned and sometimes not do not
> > benefit from lddqu).
>
> All the variants of SAD are unaligned, SATD and SSD are usually aligned.
Okay, here's an updated version of the patch that uses lddqu only in
the SAD routines: http://tuxrip.free.fr/transperl/MPlayer/SSE3_lddqu.3.diff
Please test and report whether it helps at all (I doubt it will).
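
For anyone who wants to check the alignment claim on their own streams,
the instrumentation mentioned in the quoted text could look roughly like
the sketch below. It is not part of the patch or of x264: load_stats,
record_load and always_unaligned are hypothetical names, and the 64-byte
cache line size is an assumption about the target CPU. The idea is to
call record_load() with the source pointer next to each 16-byte load and
look at the counters per call site afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-call-site counters (not part of x264). */
typedef struct {
    const char   *name;      /* label for the load site, e.g. "sad_16x16" */
    unsigned long aligned;   /* address was a multiple of 16 */
    unsigned long unaligned; /* any other address */
    unsigned long split;     /* 16-byte load straddled a cache line */
} load_stats;

/* Record one 16-byte load from address p.
 * Assumes 64-byte cache lines: a 16-byte load starting at an offset
 * above 48 within the line crosses into the next line. */
void record_load(load_stats *s, const void *p)
{
    uintptr_t a = (uintptr_t)p;

    assert(s != NULL);
    if ((a & 15) == 0)
        s->aligned++;
    else
        s->unaligned++;
    if ((a & 63) > 48)
        s->split++;
}

/* A site is a candidate for lddqu only if it never saw an aligned load. */
int always_unaligned(const load_stats *s)
{
    return s->unaligned > 0 && s->aligned == 0;
}

/* Print the counters for one load site. */
void report(const load_stats *s)
{
    printf("%s: aligned=%lu unaligned=%lu split=%lu%s\n",
           s->name, s->aligned, s->unaligned, s->split,
           always_unaligned(s) ? " (lddqu candidate)" : "");
}
```

Sites that report a mix of aligned and unaligned loads would keep movdqu
under this rule; only the sites flagged as candidates would be switched.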
Guillaume
--
I am disillusioned enough to know that no man's opinion on any subject
is worth a damn unless backed up with enough genuine information to
make him really know what he's talking about.
-- H. P. Lovecraft (about the flamewars on FFmpeg and MPlayer-dev mailing lists)
http://www.brainyquote.com/quotes/quotes/h/hplovecr278144.html