[x265] [PATCH] search: add lowres MV into search mv candidate list for search ME(CHANGESOUTPUT)

Deepthi Nandakumar deepthi at multicorewareinc.com
Mon May 18 15:32:37 CEST 2015


Gopu,

initSubCU accounts for the CU's partIdx while setting absPartIdx, which
means the expression in your patch should be changed to:

uint32_t block_x = cu.m_cuPelX;

Now, if you want to account for the PU offsets, you can do:
uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.puAbsPartIdx];

Why do you need to add pu.width/2?
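
To make the difference between the two sampling positions concrete, here is a
minimal standalone sketch (not x265 code). PuRect and the helper names are
hypothetical, and the constants 8 and 3 assume the values of
X265_LOWRES_CU_SIZE and X265_LOWRES_CU_BITS used in the patch's stride formula:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a PU rectangle in full-resolution luma pixels.
// In the patch this would come from cu.m_cuPelX/Y plus the PU offset.
struct PuRect
{
    uint32_t x, y;          // top-left corner within the picture
    uint32_t width, height; // PU dimensions
};

// One lowres MV covers a 16x16 full-resolution block (as in the patch).
static const uint32_t kFullResBlock = 16;

// Blocks per row of the lowres MV array, mirroring the patch's formula and
// assuming X265_LOWRES_CU_SIZE == 8 and X265_LOWRES_CU_BITS == 3.
uint32_t lowresBlocksPerRow(uint32_t sourceWidth)
{
    return ((sourceWidth / 2) + 8 - 1) >> 3;
}

// Index of the lowres block containing the PU's top-left corner.
uint32_t lowresIdxAtOrigin(const PuRect& pu, uint32_t stride)
{
    assert(stride > 0);
    return (pu.y / kFullResBlock) * stride + (pu.x / kFullResBlock);
}

// Index of the lowres block containing the PU's center.
uint32_t lowresIdxAtCenter(const PuRect& pu, uint32_t stride)
{
    assert(stride > 0);
    uint32_t cx = pu.x + pu.width / 2;
    uint32_t cy = pu.y + pu.height / 2;
    return (cy / kFullResBlock) * stride + (cx / kFullResBlock);
}
```

In this sketch a grid-aligned 16x16 PU maps to the same lowres block either
way, while a 64x64 PU maps to different blocks depending on whether the origin
or the center is sampled.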

Thanks,
Deepthi


On Mon, May 18, 2015 at 1:53 PM, Gopu Govindaswamy <
gopu at multicorewareinc.com> wrote:

>
>
> On Thu, May 14, 2015 at 8:18 PM, Steve Borho <steve at borho.org> wrote:
>
>> On 05/14, Steve Borho wrote:
>> > On 05/14, Deepthi Nandakumar wrote:
>> > > Ran the smoke test on this, the results were mixed - on some
>> commandlines,
>> > > the encode efficiency benefits were really good though.
>> >
>> > the results I've seen show loss of efficiency at slower presets, which is
>> > a real head-scratcher since it should help them most. Adding an
>> > additional motion candidate shouldn't reduce efficiency.
>> >
>> > I think I'd like to see a general solution for this (not just 16x16 CUs)
>> > before it gets pushed.  I think passing the PU to the function and
>> > sampling the lowres MV array at the PU center rather than the CU origin
>> > would be adequate (just add half PU width to block_x and half PU height
>> > to block_y).
>>
>> and add the PU absPartIdx to the CU absPartIdx so you get the correct
>> block starting position within the CTU.
>>
>>   uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.cuAbsPartIdx +
>> pu.puAbsPartIdx] + pu.width/2;
>>
>> also, please double-check that cu.m_cuPelX doesn't already include the
>> CU's absPartIdx within the CTU. If it does, then adding the CU part
>> offset again would be redundant (and might be why this isn't working as
>> well as it should).
>>
>
> I have made the above modification, and it now works for all depths.
> I am indexing the lowres MV like this:
>         uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.puAbsPartIdx +
> pu.cuAbsPartIdx] + pu.width/2;
>         uint32_t block_y = cu.m_cuPelY + g_zscanToPelY[pu.puAbsPartIdx +
> pu.cuAbsPartIdx] + pu.height/2;
>
>        uint32_t stride = m_frame->m_lowres.maxBlocksInRow;
>        uint32_t idx = ((block_y / 16) * stride) + (block_x / 16);
>        lmv = mv[idx];
>
> However, I still see mixed results. I have also verified that
> cu.m_cuPelX doesn't already include the CU's absPartIdx within the CTU, but
> the CU's absPartIdx has been included into cu.m_cuPelX in initSubCU().
>
> Here are the sample results:
>
> Adding lowresMV into MV candidate list
>
> FPS    Bitrate  Y PSNR  U PSNR  V PSNR  Global PSNR  SSIM      SSIM (dB)
> 72.98  2614.03  25.767  36.132  38.783  28.689       0.7565    6.135
> 67.85  2575.15  25.973  36.556  38.857  28.906       0.764529  6.281
> 48.24  2686.95  26.287  36.49   38.925  29.142       0.797709  6.94
> 31.54  2556.99  26.838  36.91   39.058  29.625       0.82559   7.584
> 22.27  2538.81  26.871  36.861  39.019  29.638       0.822282  7.503
> 8.16   2561.36  27.265  36.758  39.066  29.927       0.832109  7.75
> 1.76   2552.55  27.622  36.911  39.113  30.219       0.840274  7.966
>
>
>
> Without adding lowresMV into MV candidate list
>
> FPS    Bitrate  Y PSNR  U PSNR  V PSNR  Global PSNR  SSIM      SSIM (dB)
> 85.39  2632.82  25.739  36.123  38.737  28.662       0.755912  6.125
> 63.88  2574.34  25.975  36.551  38.845  28.906       0.764624  6.282
> 43.79  2688     26.289  36.497  38.91   29.142       0.79745   6.935
> 29.78  2557.27  26.838  36.906  39.045  29.623       0.825511  7.582
> 21.64  2539.7   26.872  36.854  39.032  29.639       0.822437  7.506
> 8.2    2561.21  27.265  36.757  39.072  29.927       0.832237  7.753
> 1.73   2550.44  27.615  36.901  39.098  30.211       0.840277  7.966
>
> Please review these results and let me know so that I can send the final patch.
>
> > > On Thu, May 14, 2015 at 10:53 AM, <gopu at multicorewareinc.com> wrote:
>> > >
>> > > > # HG changeset patch
>> > > > # User Gopu Govindaswamy <gopu at multicorewareinc.com>
>> > > > # Date 1431581025 -19800
>> > > > #      Thu May 14 10:53:45 2015 +0530
>> > > > # Node ID def132fbcf33352b18a31015dfefff79e95d21d7
>> > > > # Parent  479087422e29a672d6e9bc8d0cd2a65649d71fe2
>> > > > search: add lowres MV into search mv candidate list for search
>> > > > ME(CHANGESOUTPUT)
>> > > >
>> > > > Add one more mv (lowres MV) into MV candidates list and this extra
>> > > > candidates
>> > > > applicable only for depth 2, the lowres MV's are calculated 16x16
>> blocks
>> > > >
>> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.cpp
>> > > > --- a/source/encoder/search.cpp Wed May 13 16:52:59 2015 -0700
>> > > > +++ b/source/encoder/search.cpp Thu May 14 10:53:45 2015 +0530
>> > > > @@ -1930,9 +1930,9 @@
>> > > >      do
>> > > >      {
>> > > >          if (meId < m_slice->m_numRefIdx[0])
>> > > > -            slave.singleMotionEstimation(*this, pme.mode, pme.pu,
>> > > > pme.puIdx, 0, meId);
>> > > > +            slave.singleMotionEstimation(*this, pme.mode,
>> pme.cuGeom,
>> > > > pme.pu, pme.puIdx, 0, meId);
>> > > >          else
>> > > > -            slave.singleMotionEstimation(*this, pme.mode, pme.pu,
>> > > > pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
>> > > > +            slave.singleMotionEstimation(*this, pme.mode,
>> pme.cuGeom,
>> > > > pme.pu, pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
>> > > >
>> > > >          meId = -1;
>> > > >          pme.m_lock.acquire();
>> > > > @@ -1943,20 +1943,25 @@
>> > > >      while (meId >= 0);
>> > > >  }
>> > > >
>> > > > -void Search::singleMotionEstimation(Search& master, Mode&
>> interMode,
>> > > > const PredictionUnit& pu, int part, int list, int ref)
>> > > > +void Search::singleMotionEstimation(Search& master, Mode&
>> interMode,
>> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
>> int ref)
>> > > >  {
>> > > >      uint32_t bits = master.m_listSelBits[list] + MVP_IDX_BITS;
>> > > >      bits += getTUBits(ref, m_slice->m_numRefIdx[list]);
>> > > >
>> > > >      MotionData* bestME = interMode.bestME[part];
>> > > >
>> > > > -    MV  mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
>> > > > +    // 12 mv candidates including lowresMV
>> > > > +    MV  mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
>> > > >      int numMvc = interMode.cu.getPMV(interMode.interNeighbours,
>> list,
>> > > > ref, interMode.amvpCand[list][ref], mvc);
>> > > >
>> > > >      const MV* amvp = interMode.amvpCand[list][ref];
>> > > >      int mvpIdx = selectMVP(interMode.cu, pu, amvp, list, ref);
>> > > >      MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
>> > > >
>> > > > +    MV lmv = getLowresMV(interMode.cu, cuGeom, list, ref);
>> > > > +    if (lmv.notZero())
>> > > > +        mvc[numMvc++] = lmv;
>> > > > +
>> > > >      setSearchRange(interMode.cu, mvp, m_param->searchRange, mvmin,
>> mvmax);
>> > > >
>> > > >      int satdCost = m_me.motionEstimate(&m_slice->m_mref[list][ref],
>> > > > mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv);
>> > > > @@ -1990,7 +1995,8 @@
>> > > >      CUData& cu = interMode.cu;
>> > > >      Yuv* predYuv = &interMode.predYuv;
>> > > >
>> > > > -    MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
>> > > > +    // 12 mv candidates including lowresMV
>> > > > +    MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
>> > > >
>> > > >      const Slice *slice = m_slice;
>> > > >      int numPart     = cu.getNumPartInter();
>> > > > @@ -2039,6 +2045,10 @@
>> > > >                  int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
>> > > >                  MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
>> > > >
>> > > > +                MV lmv = getLowresMV(cu, cuGeom, list, ref);
>> > > > +                if (lmv.notZero())
>> > > > +                    mvc[numMvc++] = lmv;
>> > > > +
>> > > >                  setSearchRange(cu, mvp, m_param->searchRange,
>> mvmin,
>> > > > mvmax);
>> > > >                  int satdCost =
>> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
>> numMvc,
>> > > > mvc, m_param->searchRange, outmv);
>> > > >
>> > > > @@ -2070,7 +2080,7 @@
>> > > >              {
>> > > >                  processPME(pme, *this);
>> > > >
>> > > > -                singleMotionEstimation(*this, interMode, pu,
>> puIdx, 0,
>> > > > 0); /* L0-0 */
>> > > > +                singleMotionEstimation(*this, interMode, cuGeom,
>> pu,
>> > > > puIdx, 0, 0); /* L0-0 */
>> > > >
>> > > >                  bDoUnidir = false;
>> > > >
>> > > > @@ -2096,6 +2106,10 @@
>> > > >                      int mvpIdx = selectMVP(cu, pu, amvp, list,
>> ref);
>> > > >                      MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
>> > > >
>> > > > +                    MV lmv = getLowresMV(cu, cuGeom, list, ref);
>> > > > +                    if (lmv.notZero())
>> > > > +                        mvc[numMvc++] = lmv;
>> > > > +
>> > > >                      setSearchRange(cu, mvp, m_param->searchRange,
>> mvmin,
>> > > > mvmax);
>> > > >                      int satdCost =
>> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
>> numMvc,
>> > > > mvc, m_param->searchRange, outmv);
>> > > >
>> > > > @@ -3444,3 +3458,31 @@
>> > > >              cu.setQPSubParts(cu.getRefQP(0), 0, cuGeom.depth);
>> > > >      }
>> > > >  }
>> > > > +
>> > > > +MV Search::getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
>> list,
>> > > > int ref)
>> > > > +{
>> > > > +    MV lmv = 0;
>> > > > +    if (g_maxCUSize >> cuGeom.depth == 16)
>> > > > +    {
>> > > > +        int curPoc = m_slice->m_poc;
>> > > > +        int refPoc = m_slice->m_refPicList[list][ref]->m_poc;
>> > > > +        int diffPoc = abs(curPoc - refPoc);
>> > > > +
>> > > > +        if (diffPoc <= m_param->bframes + 1)
>> > > > +        {
>> > > > +            MV *mv = m_frame->m_lowres.lowresMvs[list][diffPoc -
>> 1];
>> > > > +            uint32_t block_x = cu.m_cuPelX +
>> > > > g_zscanToPelX[cuGeom.absPartIdx];
>> > > > +            uint32_t block_y = cu.m_cuPelY +
>> > > > g_zscanToPelY[cuGeom.absPartIdx];
>> > > > +
>> > > > +            /* number of blocks per row in lowres*/
>> > > > +            uint32_t stride = ((m_param->sourceWidth / 2) +
>> > > > X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
>> > > > +
>> > > > +            uint32_t idx = ((block_y / 16) * stride) + (block_x /
>> 16);
>> > > > +            /* check whether this motion search was performed by
>> > > > lookahead */
>> > > > +            if (mv[0].x != 0x7FFF)
>> > > > +                lmv = mv[idx];
>> > > >
>> > >
>> > > The only change I made was to move this check up.
>> > >
>> > > +        }
>> > > > +    }
>> > > > +
>> > > > +    return lmv;
>> > > > +}
>> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.h
>> > > > --- a/source/encoder/search.h   Wed May 13 16:52:59 2015 -0700
>> > > > +++ b/source/encoder/search.h   Thu May 14 10:53:45 2015 +0530
>> > > > @@ -319,6 +319,8 @@
>> > > >      void checkDQP(Mode& mode, const CUGeom& cuGeom);
>> > > >      void checkDQPForSplitPred(Mode& mode, const CUGeom& cuGeom);
>> > > >
>> > > > +    MV getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
>> list, int
>> > > > ref);
>> > > > +
>> > > >      class PME : public BondedTaskGroup
>> > > >      {
>> > > >      public:
>> > > > @@ -339,7 +341,7 @@
>> > > >      };
>> > > >
>> > > >      void     processPME(PME& pme, Search& slave);
>> > > > -    void     singleMotionEstimation(Search& master, Mode&
>> interMode,
>> > > > const PredictionUnit& pu, int part, int list, int ref);
>> > > > +    void     singleMotionEstimation(Search& master, Mode&
>> interMode,
>> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
>> int
>> > > > ref);
>> > > >
>> > > >  protected:
>> > > >
>> > > > _______________________________________________
>> > > > x265-devel mailing list
>> > > > x265-devel at videolan.org
>> > > > https://mailman.videolan.org/listinfo/x265-devel
>> > > >
>> >
>> >
>> >
>> > --
>> > Steve Borho
>>
>> --
>> Steve Borho
>>
>
>
>
> --
> Thanks & Regards
> Gopu G
> Multicoreware Inc
>
>
>
>