[x264-devel] quant_4x4x4: quant one 8x8 block at a time

Wed Feb 27 00:18:06 CET 2013

x264 | branch: master | Jason Garrett-Glaser <jason at x264.com> | Fri Feb  8 15:34:38 2013 -0800| [253e2c3f7eab79d74450de4f88a8bf451fd01be4] | committer: Jason Garrett-Glaser

quant_4x4x4: quant one 8x8 block at a time

Quantizing all four 4x4 blocks of an 8x8 block in a single call reduces
overhead and lets us use less branchy code for zigzag, dequant, decimate,
and so on.
Reorganize and optimize much of macroblock_encode around this new function.
~1-2% faster overall.

Includes NEON and x86 versions of the new function.
Using larger merged functions like this will also make wider SIMD, like
AVX2, more effective.
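
As a rough illustration of the idea only (not the code from this commit), a
scalar C sketch of a merged quantizer might look like the following. The names
coef_t, ucoef_t, quant_one and quant_4x4x4_sketch, the 16-bit coefficient
types, and the bias-then-multiply rounding are assumptions made for the
example. Inputs are assumed small enough that the result fits in 16 bits.

```c
#include <stdint.h>

/* Hypothetical types for this sketch: 16-bit coefficients and factors. */
typedef int16_t  coef_t;
typedef uint16_t ucoef_t;

/* Quantize one coefficient: add a rounding bias to the magnitude, scale by
 * a 16-bit fixed-point multiplier, then restore the sign. */
static inline coef_t quant_one( coef_t coef, ucoef_t mf, ucoef_t bias )
{
    uint32_t mag = (uint32_t)( coef > 0 ? coef : -coef );
    uint32_t q   = ( (bias + mag) * mf ) >> 16;
    return (coef_t)( coef > 0 ? (int32_t)q : -(int32_t)q );
}

/* Quantize the four 4x4 blocks of one 8x8 area in place.
 * Returns a 4-bit mask: bit j is set if block j still has a nonzero
 * coefficient after quantization. */
static int quant_4x4x4_sketch( coef_t dct[4][16], const ucoef_t mf[16],
                               const ucoef_t bias[16] )
{
    int nz_mask = 0;
    for( int j = 0; j < 4; j++ )
    {
        int nz = 0;
        for( int i = 0; i < 16; i++ )
        {
            dct[j][i] = quant_one( dct[j][i], mf[i], bias[i] );
            nz |= dct[j][i];
        }
        nz_mask |= (nz != 0) << j;
    }
    return nz_mask;
}
```

With a single returned mask, a caller can handle the per-block zigzag, dequant
and decimate decisions for the whole 8x8 area after one call instead of
interleaving them with four separate quant calls.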

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=253e2c3f7eab79d74450de4f88a8bf451fd01be4
---

 common/arm/quant-a.S   |   51 ++++++-
 common/arm/quant.h     |    1 +
 common/osdep.h         |    7 +
 common/quant.c         |   21 +++
 common/quant.h         |    5 +-
 common/x86/quant-a.asm |   95 ++++++++++--
 common/x86/quant.h     |    4 +
 encoder/macroblock.c   |  375 ++++++++++++++++++++++++++++--------------------
 encoder/macroblock.h   |    4 +
 encoder/rdo.c          |    1 +
 tools/checkasm.c       |   42 +++---
 11 files changed, 409 insertions(+), 197 deletions(-)

Diff:   http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=253e2c3f7eab79d74450de4f88a8bf451fd01be4