[x264-devel] quant_4x4x4: quant one 8x8 block at a time
Jason Garrett-Glaser
git at videolan.org
Wed Feb 27 00:18:06 CET 2013
x264 | branch: master | Jason Garrett-Glaser <jason at x264.com> | Fri Feb 8 15:34:38 2013 -0800| [253e2c3f7eab79d74450de4f88a8bf451fd01be4] | committer: Jason Garrett-Glaser
quant_4x4x4: quant one 8x8 block at a time

This reduces overhead and lets us use less branchy code for zigzag, dequant,
decimate, and so on.
Reorganize and optimize a lot of macroblock_encode using this new function.
~1-2% faster overall.

Includes NEON and x86 versions of the new function.

Using larger merged functions like this will also make wider SIMD, like
AVX2, more effective.
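
For readers without the diff at hand, here is a rough scalar sketch of the idea: quantize the four 4x4 blocks of one 8x8 region in a single call and return a 4-bit mask saying which of them still hold nonzero coefficients. The diff is not reproduced in this message, so the type names (coef_t, ucoef_t), the _sketch function names and the exact rounding are assumptions made for illustration, not the committed C, NEON or x86 code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the encoder's coefficient types. */
typedef int16_t  coef_t;
typedef uint16_t ucoef_t;

/* Quantize one 4x4 block in place.
 * Returns 1 if any quantized coefficient is nonzero, else 0. */
static int quant_4x4_sketch(coef_t dct[16], const ucoef_t mf[16],
                            const ucoef_t bias[16])
{
    int nz = 0;
    for (int i = 0; i < 16; i++) {
        int c = dct[i];
        /* Quantize the magnitude, then put the sign back. */
        uint32_t mag = (uint32_t)(c < 0 ? -c : c);
        int q = (int)(((mag + bias[i]) * mf[i]) >> 16);
        if (c < 0)
            q = -q;
        dct[i] = (coef_t)q;
        nz |= q;
    }
    return nz != 0;
}

/* Quantize the four 4x4 blocks of one 8x8 region in one call.
 * Bit j of the return value is set if block j has a nonzero coefficient. */
static int quant_4x4x4_sketch(coef_t dct[4][16], const ucoef_t mf[16],
                              const ucoef_t bias[16])
{
    int nz_mask = 0;
    for (int j = 0; j < 4; j++)
        nz_mask |= quant_4x4_sketch(dct[j], mf, bias) << j;
    return nz_mask;
}
```

With a mask like this, a caller can test it once to skip an all-zero 8x8 region, or walk its set bits to run zigzag, dequant and decimate only on the blocks that need them, rather than branching on a separate return value after each 4x4 quant call.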
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=253e2c3f7eab79d74450de4f88a8bf451fd01be4
---
 common/arm/quant-a.S   |   51 ++++++-
 common/arm/quant.h     |    1 +
 common/osdep.h         |    7 +
 common/quant.c         |   21 +++
 common/quant.h         |    5 +-
 common/x86/quant-a.asm |   95 ++++++++++--
 common/x86/quant.h     |    4 +
 encoder/macroblock.c   |  375 ++++++++++++++++++++++++++++--------------------
 encoder/macroblock.h   |    4 +
 encoder/rdo.c          |    1 +
 tools/checkasm.c       |   42 +++---
 11 files changed, 409 insertions(+), 197 deletions(-)
Diff: http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=253e2c3f7eab79d74450de4f88a8bf451fd01be4