[x264-devel] x86: Remove X264_CPU_SSE_MISALIGN functions

Henrik Gramner git at videolan.org
Fri Aug 23 23:06:31 CEST 2013


x264 | branch: master | Henrik Gramner <henrik at gramner.com> | Fri Jul  5 21:15:43 2013 +0200| [0c738e30ec025f0effdb62802685fce40cf20057] | committer: Jason Garrett-Glaser

x86: Remove X264_CPU_SSE_MISALIGN functions

Prevents a crash if the misaligned exception mask bit is cleared for some reason.

Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule.
They also require modifying the MXCSR control register, so removing those functions
lets us get rid of that complexity altogether.
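
For readers unfamiliar with the mechanism, here is a minimal sketch of the MXCSR
bit involved. It assumes that the misaligned exception mask is bit 17 of MXCSR
(AMD's MM bit); the macro and helper names are illustrative and are not x264
identifiers. The helpers only compute values and never touch the real register.

```c
#include <stdint.h>

/* Assumption: bit 17 of MXCSR is AMD's misaligned exception mask (MM).
 * The constant and helper names are hypothetical, not taken from x264. */
#define MXCSR_MM_BIT (UINT32_C(1) << 17)

/* Value that would be written back to MXCSR to allow misaligned
 * memory operands on legacy (non-VEX) SSE instructions. */
static uint32_t mxcsr_with_misalign_mask(uint32_t mxcsr)
{
    return mxcsr | MXCSR_MM_BIT;
}

/* Nonzero if the given MXCSR value has the misaligned exception mask set. */
static int mxcsr_misalign_masked(uint32_t mxcsr)
{
    return (mxcsr & MXCSR_MM_BIT) != 0;
}
```

If some other code reloads MXCSR with a value that lacks this bit (the power-on
default 0x1F80, for example), a function relying on misaligned SSE operands
would fault, which is the crash scenario described above.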

VEX-encoded instructions also support unaligned memory operands. I tried adding AVX
implementations of all removed functions, but there were no performance improvements on
Ivy Bridge. pixel_sad_x3 and pixel_sad_x4 had significant code size reductions, though,
so I kept them and added some minor cosmetic fixes and tweaks.
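
As a rough illustration of what a sad_x3-style kernel computes, here is a plain-C
reference sketch. The function name, parameter list and explicit width/height
arguments are assumptions made for the example; the real functions are
hand-written assembly specialized per block size. The candidate blocks usually
sit at arbitrary offsets in the reference frame, which is why unaligned memory
operands are relevant here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference implementation: sum of absolute differences between
 * one source block (fenc) and three candidate reference blocks that share a
 * common stride. One score is written per candidate. */
static void sad_x3_ref(const uint8_t *fenc, int fenc_stride,
                       const uint8_t *ref0, const uint8_t *ref1,
                       const uint8_t *ref2, int ref_stride,
                       int width, int height, int scores[3])
{
    const uint8_t *refs[3] = { ref0, ref1, ref2 };

    for (int i = 0; i < 3; i++) {
        int sum = 0;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                sum += abs(fenc[y * fenc_stride + x] - refs[i][y * ref_stride + x]);
        scores[i] = sum;
    }
}
```

A sad_x4 variant would follow the same pattern with a fourth candidate pointer
and a four-element score array.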

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=0c738e30ec025f0effdb62802685fce40cf20057
---

 common/cpu.c          |    7 -
 common/cpu.h          |    1 -
 common/pixel.c        |    8 +-
 common/x86/cpu-a.asm  |   11 --
 common/x86/mc-a.asm   |   71 +++----
 common/x86/mc-a2.asm  |    4 +-
 common/x86/mc-c.c     |   13 --
 common/x86/pixel.h    |    2 +-
 common/x86/sad-a.asm  |  487 +++++++++++++++++++++----------------------------
 common/x86/x86inc.asm |    9 +-
 encoder/encoder.c     |   21 +--
 encoder/lookahead.c   |    7 +-
 tools/checkasm.c      |    6 -
 x264.h                |   39 ++--
 14 files changed, 270 insertions(+), 416 deletions(-)

Diff:   http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=0c738e30ec025f0effdb62802685fce40cf20057