[x264-devel] CABAC trellis opts part 2: C optimizations

Loren Merritt git at videolan.org
Mon Jan 16 02:12:00 CET 2012


x264 | branch: master | Loren Merritt <pengvado at akuvian.org> | Thu Dec 22 17:56:06 2011 +0000| [65bd12ae875a768a06b67ec6297dec18323e0768] | committer: Jason Garrett-Glaser

CABAC trellis opts part 2: C optimizations

Hoist the branch on coef value out of the loop over node contexts.
Special cases for each possible coef value (0,1,n).
Special case for dc-only blocks.
Template the main loop for two common subsets of nodes, to avoid branching on which nodes are live.
Use the nonupdating version of cabac_size_decision in more cases, and omit those bins from the node struct.
CABAC offsets are now compile-time constants.
Change TRELLIS_SCORE_MAX from a specific constant to anything negative, which is cheaper to test.
Remove dct_weight2_zigzag[], since trellis has to look up zigzag[] anyway.
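
As a rough illustration of two of the items above (this is not the code from this commit), the sketch below shows a "dead" score encoded as any negative value, so liveness is a single sign test, and a branch on the coef class (0, 1, n) taken once outside the loop over nodes. Every identifier here (node_t, SCORE_DEAD, node_is_live, add_cost, trellis_step) and every cost value is hypothetical and chosen only for the example.

```c
#include <stdint.h>

/* Hypothetical sketch; names and costs are NOT taken from x264. */

enum { N_NODES = 8 };

/* Any score with the top bit set counts as "dead". UINT64_MAX is just one
 * convenient such value; the liveness test below never compares against it. */
#define SCORE_DEAD UINT64_MAX

typedef struct
{
    uint64_t score;
} node_t;

/* A node is live iff its score is non-negative when viewed as signed.
 * This is a sign test rather than an equality test against a 64-bit
 * constant. (The unsigned-to-signed conversion is implementation-defined
 * before C23, but wraps to two's complement on common compilers.) */
static int node_is_live( const node_t *n )
{
    return (int64_t)n->score >= 0;
}

/* Inner loop with no knowledge of the coef value: the cost is already chosen. */
static void add_cost( node_t nodes[N_NODES], uint64_t cost )
{
    for( int i = 0; i < N_NODES; i++ )
        if( node_is_live( &nodes[i] ) )
            nodes[i].score += cost;
}

/* The branch on the coef class is taken once per coefficient,
 * outside the loop over nodes. The costs 1, 3, 7 are made up. */
static void trellis_step( node_t nodes[N_NODES], int abs_coef )
{
    if( abs_coef == 0 )
        add_cost( nodes, 1 );
    else if( abs_coef == 1 )
        add_cost( nodes, 3 );
    else
        add_cost( nodes, 7 );
}
```

To try it, set every node's score to SCORE_DEAD, give one node a small starting score, and call trellis_step() for a few coef values: only the live node accumulates cost, and the dead ones stay untouched.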

60% faster on x86_64.
25k->18k codesize.

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=65bd12ae875a768a06b67ec6297dec18323e0768
---

 common/dct.c      |   14 --
 common/dct.h      |    3 -
 encoder/encoder.c |    1 -
 encoder/rdo.c     |  581 ++++++++++++++++++++++++++++++++++-------------------
 4 files changed, 379 insertions(+), 220 deletions(-)

Diff:   http://git.videolan.org/gitweb.cgi/x264.git/?a=commitdiff;h=65bd12ae875a768a06b67ec6297dec18323e0768

