[x265] [PATCH RFC] analysis: use macro and for-loop to simplify fast-intra

Steve Borho steve at borho.org
Mon Aug 18 19:04:36 CEST 2014


On 08/17, dave wrote:
> On 08/14/2014 09:10 PM, Steve Borho wrote:
> >On 08/14, dave wrote:
> >>On 08/14/2014 05:02 PM, Steve Borho wrote:
> >>>On 08/14, dave wrote:
> >>>>On 08/14/2014 01:42 PM, Steve Borho wrote:
> >>>>># HG changeset patch
> >>>>># User Steve Borho <steve at borho.org>
> >>>>># Date 1408048681 18000
> >>>>>#      Thu Aug 14 15:38:01 2014 -0500
> >>>>># Node ID 07138e6ac952c96d1e31f5490c44f4cfaf6ac12a
> >>>>># Parent  213f17c1492c5bf96c3f382e7beffe0c871a563c
> >>>>>analysis: use macro and for-loop to simplify fast-intra
> >>>>>
> >>>>>This changes behavior a bit: it now tries both +/-1 offsets instead of just
> >>>>>one, and it needs one extra check at the end since mode 34 isn't reached
> >>>>>by the preceding loops.
> >>>>>
> >>>>>diff -r 213f17c1492c -r 07138e6ac952 source/encoder/analysis.cpp
> >>>>>--- a/source/encoder/analysis.cpp	Thu Aug 14 09:43:39 2014 -0700
> >>>>>+++ b/source/encoder/analysis.cpp	Thu Aug 14 15:38:01 2014 -0500
> >>>>>@@ -1693,68 +1693,56 @@
> >>>>>      bool modeHor;
> >>>>>      pixel *cmp;
> >>>>>      intptr_t srcStride;
> >>>>>+
> >>>>>+#define TRY_ANGLE(angle) \
> >>>>>+    modeHor = angle < 18; \
> >>>>>+    cmp = modeHor ? buf_trans : fenc; \
> >>>>>+    srcStride = modeHor ? scaleTuSize : scaleStride; \
> >>>>>+    sad = sa8d(cmp, srcStride, &tmp[(angle - 2) * predsize], scaleTuSize) << costShift; \
> >>>>>+    bits = (mpms & ((uint64_t)1 << angle)) ? xModeBitsIntra(cu, angle, partOffset, depth) : rbits; \
> >>>>>+    cost = m_rdCost.calcRdSADCost(sad, bits)
> >>>>>+
> >>>>>      if (m_param->bEnableFastIntra)
> >>>>>      {
> >>>>>-        int lowsad, highsad, asad = 0;
> >>>>>-        uint32_t lowbits, highbits, amode, lowmode, highmode, abits = 0;
> >>>>>-        uint64_t lowcost, highcost = MAX_INT64, acost = MAX_INT64;
> >>>>>+        int asad = 0;
> >>>>>+        uint32_t lowmode, highmode, amode, abits = 0;
> >>>>>+        uint64_t acost = MAX_INT64;
> >>>>>-        for (mode = 4;mode < 35; mode += 5)
> >>>>>+        /* pick the best angle, sampling at distance of 5 */
> >>>>>+        for (mode = 5; mode < 35; mode += 5)
> >>>Thanks for reviewing
> >>>
> >>>>By starting with mode = 5, won't this miss mode 2 since only +/-2 is
> >>>>checked?  By starting from 4 the loop should end at 34.
> >>>If 5 is the best angle of the initial sweep, we try +/-2 (3 and
> >>>7). If 3 is the new best, we try +/-1, which gives 2 and 4.
> >>>
> >>>At the high end of the range, if 30 has the best cost, it tries
> >>>28 and 32, then (if 32 wins) 31 and 33.
> >>>
> >>>Starting with 4 would remove the need for the extra check at the end,
> >>>but at the same time we would need to range-check the low/high modes as
> >>>well, since it could reach mode 1 (planar) or modes above 34.
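
The sweep described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: fastAngleSearch and the AngleCost callback are hypothetical names, and the callback stands in for whatever TRY_ANGLE computes (sa8d cost plus mode bits). It assumes angular modes 2..34, a coarse pass starting at 5 with a step of 5, and refinement at +/-2 and then +/-1 around the current best.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the per-angle cost (sa8d plus mode bits in the patch).
using AngleCost = std::function<uint64_t(int)>;

// Coarse-to-fine scan over angular modes 2..34.
// Returns the lowest-cost mode among the modes actually visited.
int fastAngleSearch(const AngleCost& costOf)
{
    int best = 5;
    uint64_t bestCost = costOf(5);

    auto tryMode = [&](int mode) {
        uint64_t c = costOf(mode);
        if (c < bestCost)
        {
            bestCost = c;
            best = mode;
        }
    };

    // Pass 1: sample every fifth angle (5 was costed above; now 10, 15, ..., 30).
    for (int mode = 10; mode < 35; mode += 5)
        tryMode(mode);

    // Pass 2: try +/-2 around the coarse winner, then +/-1 around the new winner.
    // The center stays within 5..30 and then 3..32, so the probes stay within 2..33
    // and no range check is needed.
    for (int dist = 2; dist >= 1; dist--)
    {
        int center = best;
        tryMode(center - dist);
        tryMode(center + dist);
    }

    // Mode 34 cannot be reached by the passes above, so it gets one extra check.
    tryMode(34);
    return best;
}
```

Starting the coarse pass at 4 instead would make 34 a sample point, but the refinement could then probe mode 1 or modes above 34, which is the range-check trade-off mentioned above.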
> >>I understand now.  I am testing the original search method in
> >>TEncSearch::estIntraPredQT.  Would you prefer the new one there too?
> >I was just looking at it. I think the angular mode scan is the least of
> >its problems.
> >
> >Currently it calculates the sa8d cost of all 35 modes and builds a
> >sorted list of the top N; where N is 3 except for block sizes 4x4 and
> >8x8, where N is 8.  Next it does full encodes of each of the top N and
> >then picks the one with the least RD cost.
> >
> >What I would prefer for it to do would be to calculate all the sa8d
> >costs, keeping track of the best cost. Then in the second pass measure
> >the RDO cost of all modes that are within X% of the best cost or the
> >most-probable-mode. So the threshold is the sa8d cost delta instead of
> >an arbitrary count.
> >
> >Then see my TODO comment about redundant work; the function needs some
> >serious attention in the RDO section.
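
A minimal sketch of the threshold idea described above, under stated assumptions: selectRdoCandidates, percent, and mpmMask are hypothetical names, the sa8d costs are assumed to be already computed and indexed by mode, and the most-probable modes are passed as a bitmask in the same style as the mpms mask in the patch. It is not the estIntraPredQT implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pick the intra modes that go on to the RDO pass: every mode whose sa8d cost
// is within `percent` of the best sa8d cost, plus every most-probable mode.
std::vector<int> selectRdoCandidates(const std::vector<uint64_t>& sa8dCost,
                                     uint64_t mpmMask, unsigned percent)
{
    std::vector<int> candidates;
    if (sa8dCost.empty())
        return candidates;

    // First pass: find the best (lowest) sa8d cost.
    uint64_t best = sa8dCost[0];
    for (uint64_t c : sa8dCost)
        if (c < best)
            best = c;

    // The threshold is a cost delta relative to the best, not a fixed count.
    uint64_t limit = best + best * percent / 100;

    // Second pass: keep modes under the threshold, and always keep the MPMs.
    for (std::size_t mode = 0; mode < sa8dCost.size(); mode++)
    {
        bool isMpm = mode < 64 && ((mpmMask >> mode) & 1);
        if (sa8dCost[mode] <= limit || isMpm)
            candidates.push_back((int)mode);
    }
    return candidates;
}
```

With this shape the number of full encodes varies per block: a block with one clearly dominant mode gets few candidates, while a block with many near-equal sa8d costs gets more.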

> Do you want X% to be a cli option?  Maybe with some preset defaults?

I'm not sure if it should be a CLI option, but we could possibly make it
a function of rd-level and depth.

-- 
Steve Borho

