[x264-devel] x86: detect Bobcat, improve Atom optimizations, reorganize flags
Jason Garrett-Glaser
git at videolan.org
Wed Feb 27 00:18:05 CET 2013
x264 | branch: master | Jason Garrett-Glaser <jason at x264.com> | Sat Feb 2 12:37:08 2013 -0800| [4f24bb34453fdedefd161063e20516d148b80f8b] | committer: Jason Garrett-Glaser
x86: detect Bobcat, improve Atom optimizations, reorganize flags
The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this
and apply the appropriate flags.
It also has an extremely slow palignr instruction; create a flag for this to
avoid massive penalties on palignr-heavy functions.
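As a rough illustration of the detection step (a sketch, not the code in common/cpu.c): Bobcat cores report AMD family 0x14, which can be decoded from the EAX value returned by CPUID leaf 1. The helper names below are hypothetical, and the vendor-string check that would precede this in real code is assumed to have been done by the caller.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from x264: decode the "display family"
 * from the EAX value returned by CPUID leaf 1. */
static unsigned cpuid_display_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xf;    /* bits 11:8  base family     */
    unsigned ext  = (eax >> 20) & 0xff;  /* bits 27:20 extended family */

    /* The extended family is only added when the base family is 0xF. */
    return base == 0xf ? base + ext : base;
}

/* Assumption: the caller has already confirmed an AMD vendor string.
 * Bobcat is AMD family 0x14 (base 0xF + extended 0x5). */
static int is_bobcat_family(uint32_t eax)
{
    return cpuid_display_family(eax) == 0x14;
}
```

For example, an EAX value of 0x00500F10 decodes to family 0x14, while an Atom-style value such as 0x000106C2 decodes to family 6.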
Improve Atom function selection and document exactly what the SLOW_ATOM flag
covers.
Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3
optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on
Atom along with other SIMD multiplies.
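The selection logic described above might look roughly like the following sketch. The flag names and values here are placeholders chosen for illustration (the real definitions live in x264.h), and the enum stands in for the actual function pointers assigned in common/pixel.c and common/x86/mc-c.c.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration only; not the x264.h values. */
#define CPU_SSE2         (1u << 0)
#define CPU_SSSE3        (1u << 1)
#define CPU_SLOW_ATOM    (1u << 2)
#define CPU_SLOW_PALIGNR (1u << 3)

enum satd_kernel { SATD_C, SATD_SSE2, SATD_SSSE3, SATD_SSSE3_ATOM };
enum copy_kernel { COPY_GENERIC, COPY_PALIGNR };

/* Pick a SATD variant: on Atom, prefer the version that keeps the sse2
 * algorithm (no pmaddubsw) while still using other ssse3 instructions. */
static enum satd_kernel pick_satd(uint32_t cpu)
{
    if (cpu & CPU_SSSE3)
        return (cpu & CPU_SLOW_ATOM) ? SATD_SSSE3_ATOM : SATD_SSSE3;
    if (cpu & CPU_SSE2)
        return SATD_SSE2;
    return SATD_C;
}

/* Only use a palignr-heavy routine when palignr is not flagged as slow. */
static enum copy_kernel pick_palignr_heavy(uint32_t cpu)
{
    if ((cpu & CPU_SSSE3) && !(cpu & CPU_SLOW_PALIGNR))
        return COPY_PALIGNR;
    return COPY_GENERIC;
}
```

Expressing slowness as a set bit means the common case needs no extra flag, and only the affected CPUs opt out of a code path.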
Drop TBM detection; it'll probably never be useful for x264.
Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe).
Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
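A minimal sketch of the CMOV check, assuming the capability bits come from the EDX register of CPUID leaf 1 (bit 15 is CMOV, bit 23 is MMX, bit 25 is SSE). The CAP_* names and both helpers are hypothetical, and for brevity MMX2 is inferred from the SSE bit alone; AMD chips that advertise MMX extensions through the extended CPUID leaf are not handled here.

```c
#include <stdint.h>

/* CPUID leaf 1, EDX feature bits. */
#define EDX_CMOV (1u << 15)
#define EDX_MMX  (1u << 23)
#define EDX_SSE  (1u << 25)

/* Hypothetical capability flags for this sketch. */
enum { CAP_MMX = 1, CAP_MMX2 = 2, CAP_CMOV = 4 };

static unsigned caps_from_edx(uint32_t edx)
{
    unsigned caps = 0;
    if (edx & EDX_CMOV)
        caps |= CAP_CMOV;
    if (edx & EDX_MMX)
        caps |= CAP_MMX;
    /* Simplification: SSE implies the MMX2 integer extensions. */
    if ((edx & EDX_MMX) && (edx & EDX_SSE))
        caps |= CAP_MMX2;
    return caps;
}

/* Code built around MMX2 that also uses cmov should be refused up front
 * on a chip without CMOV, rather than crashing on an illegal instruction. */
static int can_run_mmx2_code(unsigned caps)
{
    return (caps & CAP_MMX2) && (caps & CAP_CMOV);
}
```

A caller could check `can_run_mmx2_code()` once at startup and report a clear error if it returns 0 on a CPU that otherwise advertises MMX2.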
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=4f24bb34453fdedefd161063e20516d148b80f8b
---
 common/common.c        |   4 +-
 common/cpu.c           | 107 +++++++++++++++++++++----------------
 common/dct.c           |  42 +++++++++------
 common/frame.c         |   6 ++-
 common/pixel.c         |  31 ++++++++---
 common/x86/mc-c.c      | 115 ++++++++++++++++++++++------------------
 common/x86/pixel-a.asm | 138 ++++++++++++++++++++++++++----------------------
 common/x86/pixel.h     |  14 +++--
 common/x86/predict-c.c |  10 ++--
 encoder/encoder.c      |  37 +++++++++----
 tools/checkasm.c       |  38 +++++++------
 x264.h                 |  84 ++++++++++++++++-------------
 12 files changed, 367 insertions(+), 259 deletions(-)
Diff: http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=4f24bb34453fdedefd161063e20516d148b80f8b