[x264-devel] CABAC trellis opts part 2: C optimizations
Loren Merritt
git at videolan.org
Mon Jan 16 02:12:00 CET 2012
x264 | branch: master | Loren Merritt <pengvado at akuvian.org> | Thu Dec 22 17:56:06 2011 +0000| [65bd12ae875a768a06b67ec6297dec18323e0768] | committer: Jason Garrett-Glaser
CABAC trellis opts part 2: C optimizations
Hoist the branch on coef value out of the loop over node contexts.
Special cases for each possible coef value (0,1,n).
Special case for dc-only blocks.
Template the main loop for two common subsets of nodes, to avoid many branches on which nodes are live.
Use the nonupdating version of cabac_size_decision in more cases, and omit those bins from the node struct.
CABAC offsets are now compile-time constants.
Change TRELLIS_SCORE_MAX from a specific constant to anything negative, which is cheaper to test.
Remove dct_weight2_zigzag[], since trellis has to look up zigzag[] anyway.
60% faster on x86_64.
25k->18k codesize.
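
The TRELLIS_SCORE_MAX change above can be illustrated with a minimal sketch. This is not the code from encoder/rdo.c: the names SCORE_INVALID, node_t, node_is_live and best_live_node are hypothetical, and the sketch assumes that valid scores are always kept non-negative. Under that assumption, "is this node dead?" becomes a sign test instead of a comparison against a particular 64-bit constant.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not taken from encoder/rdo.c. */
#define SCORE_INVALID (-1LL)   /* any negative value marks a dead node */

typedef struct {
    int64_t score;             /* assumption: valid scores are non-negative */
} node_t;

/* A node is live iff its score is non-negative. This is a single sign
 * test; no specific sentinel constant has to be loaded and compared. */
static inline int node_is_live(const node_t *n)
{
    return n->score >= 0;
}

/* Return the index of the cheapest live node, or -1 if no node is live. */
static int best_live_node(const node_t *nodes, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!node_is_live(&nodes[i]))
            continue;          /* skip dead nodes, whatever negative value they hold */
        if (best < 0 || nodes[i].score < nodes[best].score)
            best = (int)i;
    }
    return best;
}
```

With this convention, any negative score is treated as dead, so code that invalidates a node does not need to agree on one exact sentinel value.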
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=65bd12ae875a768a06b67ec6297dec18323e0768
---
common/dct.c | 14 --
common/dct.h | 3 -
encoder/encoder.c | 1 -
encoder/rdo.c | 581 ++++++++++++++++++++++++++++++++++-------------------
4 files changed, 379 insertions(+), 220 deletions(-)
Diff: http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=65bd12ae875a768a06b67ec6297dec18323e0768