[x265] [PATCH] search: add lowres MV into search mv candidate list for search ME(CHANGESOUTPUT)

Gopu Govindaswamy gopu at multicorewareinc.com
Mon May 18 10:23:52 CEST 2015


On Thu, May 14, 2015 at 8:18 PM, Steve Borho <steve at borho.org> wrote:

> On 05/14, Steve Borho wrote:
> > On 05/14, Deepthi Nandakumar wrote:
> > > Ran the smoke test on this, the results were mixed - on some
> commandlines,
> > > the encode efficiency benefits were really good though.
> >
> > the results I've seen show loss of efficiency at slower presets, which is
> > a real head-scratcher since it should help them most. Adding an
> > additional motion candidate shouldn't reduce efficiency.
> >
> > I think I'd like to see a general solution for this (not just 16x16 CUs)
> > before it gets pushed.  I think passing the PU to the function and
> > sampling the lowres MV array at the PU center rather than the CU origin
> > would be adequate (just add half PU width to block_x and half PU height
> > to block_y).
>
> and add the PU absPartIdx to the CU absPartIdx so you get the correct
> block starting position within the CTU.
>
>   uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.cuAbsPartIdx +
> pu.puAbsPartIdx] + pu.width/2;
>
> also, please double-check that cu.m_cuPelX doesn't already include the
> CU's absPartIdx within the CTU. If it does, then adding the CU part
> offset again would be redundant (and might be why this isn't working as
> well as it should).
>

I have made the modification suggested above, and the new version works for
all depths. I now index the lowres MV array like this:

        uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.puAbsPartIdx + pu.cuAbsPartIdx] + pu.width / 2;
        uint32_t block_y = cu.m_cuPelY + g_zscanToPelY[pu.puAbsPartIdx + pu.cuAbsPartIdx] + pu.height / 2;

        uint32_t stride = m_frame->m_lowres.maxBlocksInRow;
        uint32_t idx = ((block_y / 16) * stride) + (block_x / 16);
        lmv = mv[idx];

However, I still see mixed results. I also checked whether cu.m_cuPelX
already includes the CU's absPartIdx within the CTU: my reading is that it
does not, although the CU's absPartIdx is folded into cu.m_cuPelX in
initSubCU().
here is the sample results:

Adding lowresMV into MV candidate list:

FPS     Bitrate   Y PSNR   U PSNR   V PSNR   Global PSNR   SSIM       SSIM (dB)
72.98   2614.03   25.767   36.132   38.783   28.689        0.7565     6.135
67.85   2575.15   25.973   36.556   38.857   28.906        0.764529   6.281
48.24   2686.95   26.287   36.49    38.925   29.142        0.797709   6.94
31.54   2556.99   26.838   36.91    39.058   29.625        0.82559    7.584
22.27   2538.81   26.871   36.861   39.019   29.638        0.822282   7.503
8.16    2561.36   27.265   36.758   39.066   29.927        0.832109   7.75
1.76    2552.55   27.622   36.911   39.113   30.219        0.840274   7.966



Without adding lowresMV into MV candidate list:

FPS     Bitrate   Y PSNR   U PSNR   V PSNR   Global PSNR   SSIM       SSIM (dB)
85.39   2632.82   25.739   36.123   38.737   28.662        0.755912   6.125
63.88   2574.34   25.975   36.551   38.845   28.906        0.764624   6.282
43.79   2688      26.289   36.497   38.91    29.142        0.79745    6.935
29.78   2557.27   26.838   36.906   39.045   29.623        0.825511   7.582
21.64   2539.7    26.872   36.854   39.032   29.639        0.822437   7.506
8.2     2561.21   27.265   36.757   39.072   29.927        0.832237   7.753
1.73    2550.44   27.615   36.901   39.098   30.211        0.840277   7.966
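
Since the differences are small, the relative change per row may be easier to
read than the raw numbers. The sketch below is only a helper for that
comparison: the Row struct, percentChange(), printDeltas() and the array names
are hypothetical, and the values are copied from the FPS, Bitrate and
SSIM (dB) columns of the two result sets above, in the same row order.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical container for the three columns compared here.
struct Row { double fps, bitrate, ssimDb; };

// Values copied from the results with the lowres MV candidate added.
constexpr Row kWith[7] = {
    {72.98, 2614.03, 6.135}, {67.85, 2575.15, 6.281}, {48.24, 2686.95, 6.94},
    {31.54, 2556.99, 7.584}, {22.27, 2538.81, 7.503}, {8.16, 2561.36, 7.75},
    {1.76, 2552.55, 7.966},
};

// Values copied from the results without the lowres MV candidate.
constexpr Row kWithout[7] = {
    {85.39, 2632.82, 6.125}, {63.88, 2574.34, 6.282}, {43.79, 2688.0, 6.935},
    {29.78, 2557.27, 7.582}, {21.64, 2539.7, 7.506}, {8.2, 2561.21, 7.753},
    {1.73, 2550.44, 7.966},
};

// Relative change of `with` against `without`, in percent.
constexpr double percentChange(double with, double without)
{
    return (with - without) / without * 100.0;
}

// Print the per-row change in FPS and bitrate (percent) and SSIM (dB difference).
void printDeltas()
{
    for (int i = 0; i < 7; i++)
    {
        std::printf("row %d: fps %+.2f%%  bitrate %+.2f%%  SSIM %+.3f dB\n",
                    i + 1,
                    percentChange(kWith[i].fps, kWithout[i].fps),
                    percentChange(kWith[i].bitrate, kWithout[i].bitrate),
                    kWith[i].ssimDb - kWithout[i].ssimDb);
    }
}
```

On these numbers the first row's bitrate falls by roughly 0.7% while the last
row's rises by under 0.1%, and FPS is lower in the first and sixth rows but
higher in the others, which matches the mixed picture described above.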

Please review these results and let me know, so that I can send the final patch.

> > On Thu, May 14, 2015 at 10:53 AM, <gopu at multicorewareinc.com> wrote:
> > >
> > > > # HG changeset patch
> > > > # User Gopu Govindaswamy <gopu at multicorewareinc.com>
> > > > # Date 1431581025 -19800
> > > > #      Thu May 14 10:53:45 2015 +0530
> > > > # Node ID def132fbcf33352b18a31015dfefff79e95d21d7
> > > > # Parent  479087422e29a672d6e9bc8d0cd2a65649d71fe2
> > > > search: add lowres MV into search mv candidate list for search
> > > > ME(CHANGESOUTPUT)
> > > >
> > > > Add one more mv (lowres MV) into MV candidates list and this extra
> > > > candidates
> > > > applicable only for depth 2, the lowres MV's are calculated 16x16
> blocks
> > > >
> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.cpp
> > > > --- a/source/encoder/search.cpp Wed May 13 16:52:59 2015 -0700
> > > > +++ b/source/encoder/search.cpp Thu May 14 10:53:45 2015 +0530
> > > > @@ -1930,9 +1930,9 @@
> > > >      do
> > > >      {
> > > >          if (meId < m_slice->m_numRefIdx[0])
> > > > -            slave.singleMotionEstimation(*this, pme.mode, pme.pu,
> > > > pme.puIdx, 0, meId);
> > > > +            slave.singleMotionEstimation(*this, pme.mode,
> pme.cuGeom,
> > > > pme.pu, pme.puIdx, 0, meId);
> > > >          else
> > > > -            slave.singleMotionEstimation(*this, pme.mode, pme.pu,
> > > > pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
> > > > +            slave.singleMotionEstimation(*this, pme.mode,
> pme.cuGeom,
> > > > pme.pu, pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
> > > >
> > > >          meId = -1;
> > > >          pme.m_lock.acquire();
> > > > @@ -1943,20 +1943,25 @@
> > > >      while (meId >= 0);
> > > >  }
> > > >
> > > > -void Search::singleMotionEstimation(Search& master, Mode& interMode,
> > > > const PredictionUnit& pu, int part, int list, int ref)
> > > > +void Search::singleMotionEstimation(Search& master, Mode& interMode,
> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
> int ref)
> > > >  {
> > > >      uint32_t bits = master.m_listSelBits[list] + MVP_IDX_BITS;
> > > >      bits += getTUBits(ref, m_slice->m_numRefIdx[list]);
> > > >
> > > >      MotionData* bestME = interMode.bestME[part];
> > > >
> > > > -    MV  mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
> > > > +    // 12 mv candidates including lowresMV
> > > > +    MV  mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
> > > >      int numMvc = interMode.cu.getPMV(interMode.interNeighbours,
> list,
> > > > ref, interMode.amvpCand[list][ref], mvc);
> > > >
> > > >      const MV* amvp = interMode.amvpCand[list][ref];
> > > >      int mvpIdx = selectMVP(interMode.cu, pu, amvp, list, ref);
> > > >      MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > +    MV lmv = getLowresMV(interMode.cu, cuGeom, list, ref);
> > > > +    if (lmv.notZero())
> > > > +        mvc[numMvc++] = lmv;
> > > > +
> > > >      setSearchRange(interMode.cu, mvp, m_param->searchRange, mvmin,
> mvmax);
> > > >
> > > >      int satdCost = m_me.motionEstimate(&m_slice->m_mref[list][ref],
> > > > mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv);
> > > > @@ -1990,7 +1995,8 @@
> > > >      CUData& cu = interMode.cu;
> > > >      Yuv* predYuv = &interMode.predYuv;
> > > >
> > > > -    MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
> > > > +    // 12 mv candidates including lowresMV
> > > > +    MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
> > > >
> > > >      const Slice *slice = m_slice;
> > > >      int numPart     = cu.getNumPartInter();
> > > > @@ -2039,6 +2045,10 @@
> > > >                  int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
> > > >                  MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > +                MV lmv = getLowresMV(cu, cuGeom, list, ref);
> > > > +                if (lmv.notZero())
> > > > +                    mvc[numMvc++] = lmv;
> > > > +
> > > >                  setSearchRange(cu, mvp, m_param->searchRange, mvmin,
> > > > mvmax);
> > > >                  int satdCost =
> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
> numMvc,
> > > > mvc, m_param->searchRange, outmv);
> > > >
> > > > @@ -2070,7 +2080,7 @@
> > > >              {
> > > >                  processPME(pme, *this);
> > > >
> > > > -                singleMotionEstimation(*this, interMode, pu, puIdx,
> 0,
> > > > 0); /* L0-0 */
> > > > +                singleMotionEstimation(*this, interMode, cuGeom, pu,
> > > > puIdx, 0, 0); /* L0-0 */
> > > >
> > > >                  bDoUnidir = false;
> > > >
> > > > @@ -2096,6 +2106,10 @@
> > > >                      int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
> > > >                      MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > +                    MV lmv = getLowresMV(cu, cuGeom, list, ref);
> > > > +                    if (lmv.notZero())
> > > > +                        mvc[numMvc++] = lmv;
> > > > +
> > > >                      setSearchRange(cu, mvp, m_param->searchRange,
> mvmin,
> > > > mvmax);
> > > >                      int satdCost =
> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
> numMvc,
> > > > mvc, m_param->searchRange, outmv);
> > > >
> > > > @@ -3444,3 +3458,31 @@
> > > >              cu.setQPSubParts(cu.getRefQP(0), 0, cuGeom.depth);
> > > >      }
> > > >  }
> > > > +
> > > > +MV Search::getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
> list,
> > > > int ref)
> > > > +{
> > > > +    MV lmv = 0;
> > > > +    if (g_maxCUSize >> cuGeom.depth == 16)
> > > > +    {
> > > > +        int curPoc = m_slice->m_poc;
> > > > +        int refPoc = m_slice->m_refPicList[list][ref]->m_poc;
> > > > +        int diffPoc = abs(curPoc - refPoc);
> > > > +
> > > > +        if (diffPoc <= m_param->bframes + 1)
> > > > +        {
> > > > +            MV *mv = m_frame->m_lowres.lowresMvs[list][diffPoc - 1];
> > > > +            uint32_t block_x = cu.m_cuPelX +
> > > > g_zscanToPelX[cuGeom.absPartIdx];
> > > > +            uint32_t block_y = cu.m_cuPelY +
> > > > g_zscanToPelY[cuGeom.absPartIdx];
> > > > +
> > > > +            /* number of blocks per row in lowres*/
> > > > +            uint32_t stride = ((m_param->sourceWidth / 2) +
> > > > X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
> > > > +
> > > > +            uint32_t idx = ((block_y / 16) * stride) + (block_x /
> 16);
> > > > +            /* check whether this motion search was performed by
> > > > lookahead */
> > > > +            if (mv[0].x != 0x7FFF)
> > > > +                lmv = mv[idx];
> > > >
> > >
> > > The only change I made was to move this check up.
> > >
> > > +        }
> > > > +    }
> > > > +
> > > > +    return lmv;
> > > > +}
> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.h
> > > > --- a/source/encoder/search.h   Wed May 13 16:52:59 2015 -0700
> > > > +++ b/source/encoder/search.h   Thu May 14 10:53:45 2015 +0530
> > > > @@ -319,6 +319,8 @@
> > > >      void checkDQP(Mode& mode, const CUGeom& cuGeom);
> > > >      void checkDQPForSplitPred(Mode& mode, const CUGeom& cuGeom);
> > > >
> > > > +    MV getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
> list, int
> > > > ref);
> > > > +
> > > >      class PME : public BondedTaskGroup
> > > >      {
> > > >      public:
> > > > @@ -339,7 +341,7 @@
> > > >      };
> > > >
> > > >      void     processPME(PME& pme, Search& slave);
> > > > -    void     singleMotionEstimation(Search& master, Mode& interMode,
> > > > const PredictionUnit& pu, int part, int list, int ref);
> > > > +    void     singleMotionEstimation(Search& master, Mode& interMode,
> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
> int
> > > > ref);
> > > >
> > > >  protected:
> > > >
> > > > _______________________________________________
> > > > x265-devel mailing list
> > > > x265-devel at videolan.org
> > > > https://mailman.videolan.org/listinfo/x265-devel
> > > >
> >
> >
> >
> > --
> > Steve Borho
>
> --
> Steve Borho
>



-- 
Thanks & Regards
Gopu G
Multicoreware Inc

