[x265] [PATCH] search: add lowres MV into search mv candidate list for search ME(CHANGESOUTPUT)
Gopu Govindaswamy
gopu at multicorewareinc.com
Mon May 18 10:23:52 CEST 2015
On Thu, May 14, 2015 at 8:18 PM, Steve Borho <steve at borho.org> wrote:
> On 05/14, Steve Borho wrote:
> > On 05/14, Deepthi Nandakumar wrote:
> > > Ran the smoke test on this, the results were mixed - on some commandlines,
> > > the encode efficiency benefits were really good though.
> >
> > the results I've seen show loss of efficiency at slower presets, which is
> > a real head-scratcher since it should help them most. Adding an
> > additional motion candidate shouldn't reduce efficiency.
> >
> > I think I'd like to see a general solution for this (not just 16x16 CUs)
> > before it gets pushed. I think passing the PU to the function and
> > sampling the lowres MV array at the PU center rather than the CU origin
> > would be adequate (just add half PU width to block_x and half PU height
> > to block_y).
>
> and add the PU absPartIdx to the CU absPartIdx so you get the correct
> block starting position within the CTU.
>
> uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.cuAbsPartIdx +
> pu.puAbsPartIdx] + pu.width/2;
>
> also, please double-check that cu.m_cuPelX doesn't already include the
> CU's absPartIdx within the CTU. If it does, then adding the CU part
> offset again would be redundant (and might be why this isn't working as
> well as it should).
>
I have made the above modification, and the new version works for all
depths. I am now indexing the lowres MV array like this:
uint32_t block_x = cu.m_cuPelX + g_zscanToPelX[pu.puAbsPartIdx +
pu.cuAbsPartIdx] + pu.width/2;
uint32_t block_y = cu.m_cuPelY + g_zscanToPelY[pu.puAbsPartIdx +
pu.cuAbsPartIdx] + pu.height/2;
uint32_t stride = m_frame->m_lowres.maxBlocksInRow;
uint32_t idx = ((block_y / 16) * stride) + (block_x / 16);
lmv = mv[idx];
However, I still see mixed results. I have also verified that
cu.m_cuPelX doesn't already include the CU's absPartIdx within the CTU,
although the CU's absPartIdx is added into cu.m_cuPelX in initSubCU().
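
To make the indexing above easier to check by hand, here is a minimal standalone sketch of the same arithmetic. The PuPos struct and lowresBlockIndex() are hypothetical names used only for illustration and are not part of x265; originX/originY stand for the PU's top-left luma position however it ends up being derived, and blocksPerRow plays the role of m_frame->m_lowres.maxBlocksInRow. It assumes, as in the snippet above, that one lowres MV covers a 16x16 area of the full-resolution frame.

```cpp
#include <cstdint>

// Hypothetical stand-in for the PU position and size (not an x265 type).
struct PuPos
{
    uint32_t originX; // full-res luma x of the PU's top-left pixel
    uint32_t originY; // full-res luma y of the PU's top-left pixel
    uint32_t width;   // PU width in luma pixels
    uint32_t height;  // PU height in luma pixels
};

// Returns the index into a row-major array holding one MV per 16x16
// full-res block, sampled at the PU center rather than at its origin.
static uint32_t lowresBlockIndex(const PuPos& pu, uint32_t blocksPerRow)
{
    uint32_t centerX = pu.originX + pu.width / 2;
    uint32_t centerY = pu.originY + pu.height / 2;
    return (centerY / 16) * blocksPerRow + (centerX / 16);
}
```

For example, a 32x16 PU at (64, 48) with 120 blocks per row has its center at (80, 56), so it selects block 3 * 120 + 5 = 365, whereas sampling at the origin would select block 364.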
Here are the sample results:
Adding lowresMV into MV candidate list

FPS    Bitrate  Y PSNR  U PSNR  V PSNR  Global PSNR  SSIM      SSIM (dB)
72.98  2614.03  25.767  36.132  38.783  28.689       0.7565    6.135
67.85  2575.15  25.973  36.556  38.857  28.906       0.764529  6.281
48.24  2686.95  26.287  36.49   38.925  29.142       0.797709  6.94
31.54  2556.99  26.838  36.91   39.058  29.625       0.82559   7.584
22.27  2538.81  26.871  36.861  39.019  29.638       0.822282  7.503
8.16   2561.36  27.265  36.758  39.066  29.927       0.832109  7.75
1.76   2552.55  27.622  36.911  39.113  30.219       0.840274  7.966

Without adding lowresMV into MV candidate list

FPS    Bitrate  Y PSNR  U PSNR  V PSNR  Global PSNR  SSIM      SSIM (dB)
85.39  2632.82  25.739  36.123  38.737  28.662       0.755912  6.125
63.88  2574.34  25.975  36.551  38.845  28.906       0.764624  6.282
43.79  2688     26.289  36.497  38.91   29.142       0.79745   6.935
29.78  2557.27  26.838  36.906  39.045  29.623       0.825511  7.582
21.64  2539.7   26.872  36.854  39.032  29.639       0.822437  7.506
8.2    2561.21  27.265  36.757  39.072  29.927       0.832237  7.753
1.73   2550.44  27.615  36.901  39.098  30.211       0.840277  7.966

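
Since the differences between the two runs are small, it may help to look at them as per-row deltas. The sketch below is only an illustration: the Result struct and the two helper functions are hypothetical names, and the numbers are copied from the FPS, Bitrate and Global PSNR columns of the two result sets above, in the order posted.

```cpp
#include <array>
#include <cstddef>

// One row of the posted results (only three of the eight columns).
struct Result { double fps, bitrate, globalPsnr; };

// With the lowres MV added to the candidate list.
static const std::array<Result, 7> kWithLowresMv = {{
    {72.98, 2614.03, 28.689},
    {67.85, 2575.15, 28.906},
    {48.24, 2686.95, 29.142},
    {31.54, 2556.99, 29.625},
    {22.27, 2538.81, 29.638},
    { 8.16, 2561.36, 29.927},
    { 1.76, 2552.55, 30.219},
}};

// Without the lowres MV candidate.
static const std::array<Result, 7> kWithoutLowresMv = {{
    {85.39, 2632.82, 28.662},
    {63.88, 2574.34, 28.906},
    {43.79, 2688.00, 29.142},
    {29.78, 2557.27, 29.623},
    {21.64, 2539.70, 29.639},
    { 8.20, 2561.21, 29.927},
    { 1.73, 2550.44, 30.211},
}};

// Relative bitrate change in percent; negative means fewer bits
// with the lowres MV candidate.
static double bitrateChangePercent(std::size_t row)
{
    double with = kWithLowresMv[row].bitrate;
    double without = kWithoutLowresMv[row].bitrate;
    return (with - without) / without * 100.0;
}

// Global PSNR difference; positive means higher PSNR with the
// lowres MV candidate.
static double globalPsnrDelta(std::size_t row)
{
    return kWithLowresMv[row].globalPsnr - kWithoutLowresMv[row].globalPsnr;
}
```

For the first row this gives roughly -0.71% bitrate with +0.027 Global PSNR, while for the last row it gives roughly +0.08% bitrate with +0.008 Global PSNR, which matches the mixed picture described above.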
Please review these results and let me know, so that I can send the final patch.
> > On Thu, May 14, 2015 at 10:53 AM, <gopu at multicorewareinc.com> wrote:
> > >
> > > > # HG changeset patch
> > > > # User Gopu Govindaswamy <gopu at multicorewareinc.com>
> > > > # Date 1431581025 -19800
> > > > # Thu May 14 10:53:45 2015 +0530
> > > > # Node ID def132fbcf33352b18a31015dfefff79e95d21d7
> > > > # Parent 479087422e29a672d6e9bc8d0cd2a65649d71fe2
> > > > search: add lowres MV into search mv candidate list for search
> > > > ME(CHANGESOUTPUT)
> > > >
> > > > Add one more mv (lowres MV) into MV candidates list and this extra
> > > > candidates
> > > > applicable only for depth 2, the lowres MV's are calculated 16x16
> blocks
> > > >
> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.cpp
> > > > --- a/source/encoder/search.cpp Wed May 13 16:52:59 2015 -0700
> > > > +++ b/source/encoder/search.cpp Thu May 14 10:53:45 2015 +0530
> > > > @@ -1930,9 +1930,9 @@
> > > > do
> > > > {
> > > > if (meId < m_slice->m_numRefIdx[0])
> > > > - slave.singleMotionEstimation(*this, pme.mode, pme.pu,
> > > > pme.puIdx, 0, meId);
> > > > + slave.singleMotionEstimation(*this, pme.mode,
> pme.cuGeom,
> > > > pme.pu, pme.puIdx, 0, meId);
> > > > else
> > > > - slave.singleMotionEstimation(*this, pme.mode, pme.pu,
> > > > pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
> > > > + slave.singleMotionEstimation(*this, pme.mode,
> pme.cuGeom,
> > > > pme.pu, pme.puIdx, 1, meId - m_slice->m_numRefIdx[0]);
> > > >
> > > > meId = -1;
> > > > pme.m_lock.acquire();
> > > > @@ -1943,20 +1943,25 @@
> > > > while (meId >= 0);
> > > > }
> > > >
> > > > -void Search::singleMotionEstimation(Search& master, Mode& interMode,
> > > > const PredictionUnit& pu, int part, int list, int ref)
> > > > +void Search::singleMotionEstimation(Search& master, Mode& interMode,
> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
> int ref)
> > > > {
> > > > uint32_t bits = master.m_listSelBits[list] + MVP_IDX_BITS;
> > > > bits += getTUBits(ref, m_slice->m_numRefIdx[list]);
> > > >
> > > > MotionData* bestME = interMode.bestME[part];
> > > >
> > > > - MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
> > > > + // 12 mv candidates including lowresMV
> > > > + MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
> > > > int numMvc = interMode.cu.getPMV(interMode.interNeighbours,
> list,
> > > > ref, interMode.amvpCand[list][ref], mvc);
> > > >
> > > > const MV* amvp = interMode.amvpCand[list][ref];
> > > > int mvpIdx = selectMVP(interMode.cu, pu, amvp, list, ref);
> > > > MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > + MV lmv = getLowresMV(interMode.cu, cuGeom, list, ref);
> > > > + if (lmv.notZero())
> > > > + mvc[numMvc++] = lmv;
> > > > +
> > > > setSearchRange(interMode.cu, mvp, m_param->searchRange, mvmin,
> mvmax);
> > > >
> > > > int satdCost = m_me.motionEstimate(&m_slice->m_mref[list][ref],
> > > > mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv);
> > > > @@ -1990,7 +1995,8 @@
> > > > CUData& cu = interMode.cu;
> > > > Yuv* predYuv = &interMode.predYuv;
> > > >
> > > > - MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 1];
> > > > + // 12 mv candidates including lowresMV
> > > > + MV mvc[(MD_ABOVE_LEFT + 1) * 2 + 2];
> > > >
> > > > const Slice *slice = m_slice;
> > > > int numPart = cu.getNumPartInter();
> > > > @@ -2039,6 +2045,10 @@
> > > > int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
> > > > MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > + MV lmv = getLowresMV(cu, cuGeom, list, ref);
> > > > + if (lmv.notZero())
> > > > + mvc[numMvc++] = lmv;
> > > > +
> > > > setSearchRange(cu, mvp, m_param->searchRange, mvmin,
> > > > mvmax);
> > > > int satdCost =
> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
> numMvc,
> > > > mvc, m_param->searchRange, outmv);
> > > >
> > > > @@ -2070,7 +2080,7 @@
> > > > {
> > > > processPME(pme, *this);
> > > >
> > > > - singleMotionEstimation(*this, interMode, pu, puIdx,
> 0,
> > > > 0); /* L0-0 */
> > > > + singleMotionEstimation(*this, interMode, cuGeom, pu,
> > > > puIdx, 0, 0); /* L0-0 */
> > > >
> > > > bDoUnidir = false;
> > > >
> > > > @@ -2096,6 +2106,10 @@
> > > > int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
> > > > MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];
> > > >
> > > > + MV lmv = getLowresMV(cu, cuGeom, list, ref);
> > > > + if (lmv.notZero())
> > > > + mvc[numMvc++] = lmv;
> > > > +
> > > > setSearchRange(cu, mvp, m_param->searchRange,
> mvmin,
> > > > mvmax);
> > > > int satdCost =
> > > > m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp,
> numMvc,
> > > > mvc, m_param->searchRange, outmv);
> > > >
> > > > @@ -3444,3 +3458,31 @@
> > > > cu.setQPSubParts(cu.getRefQP(0), 0, cuGeom.depth);
> > > > }
> > > > }
> > > > +
> > > > +MV Search::getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
> list,
> > > > int ref)
> > > > +{
> > > > + MV lmv = 0;
> > > > + if (g_maxCUSize >> cuGeom.depth == 16)
> > > > + {
> > > > + int curPoc = m_slice->m_poc;
> > > > + int refPoc = m_slice->m_refPicList[list][ref]->m_poc;
> > > > + int diffPoc = abs(curPoc - refPoc);
> > > > +
> > > > + if (diffPoc <= m_param->bframes + 1)
> > > > + {
> > > > + MV *mv = m_frame->m_lowres.lowresMvs[list][diffPoc - 1];
> > > > + uint32_t block_x = cu.m_cuPelX +
> > > > g_zscanToPelX[cuGeom.absPartIdx];
> > > > + uint32_t block_y = cu.m_cuPelY +
> > > > g_zscanToPelY[cuGeom.absPartIdx];
> > > > +
> > > > + /* number of blocks per row in lowres*/
> > > > + uint32_t stride = ((m_param->sourceWidth / 2) +
> > > > X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
> > > > +
> > > > + uint32_t idx = ((block_y / 16) * stride) + (block_x /
> 16);
> > > > + /* check whether this motion search was performed by
> > > > lookahead */
> > > > + if (mv[0].x != 0x7FFF)
> > > > + lmv = mv[idx];
> > > >
> > >
> > > The only change I made was to move this check up.
> > >
> > > + }
> > > > + }
> > > > +
> > > > + return lmv;
> > > > +}
> > > > diff -r 479087422e29 -r def132fbcf33 source/encoder/search.h
> > > > --- a/source/encoder/search.h Wed May 13 16:52:59 2015 -0700
> > > > +++ b/source/encoder/search.h Thu May 14 10:53:45 2015 +0530
> > > > @@ -319,6 +319,8 @@
> > > > void checkDQP(Mode& mode, const CUGeom& cuGeom);
> > > > void checkDQPForSplitPred(Mode& mode, const CUGeom& cuGeom);
> > > >
> > > > + MV getLowresMV(const CUData& cu, const CUGeom& cuGeom, int
> list, int
> > > > ref);
> > > > +
> > > > class PME : public BondedTaskGroup
> > > > {
> > > > public:
> > > > @@ -339,7 +341,7 @@
> > > > };
> > > >
> > > > void processPME(PME& pme, Search& slave);
> > > > - void singleMotionEstimation(Search& master, Mode& interMode,
> > > > const PredictionUnit& pu, int part, int list, int ref);
> > > > + void singleMotionEstimation(Search& master, Mode& interMode,
> > > > const CUGeom& cuGeom, const PredictionUnit& pu, int part, int list,
> int
> > > > ref);
> > > >
> > > > protected:
> > > >
> > > > _______________________________________________
> > > > x265-devel mailing list
> > > > x265-devel at videolan.org
> > > > https://mailman.videolan.org/listinfo/x265-devel
> > > >
> >
> >
> >
> > --
> > Steve Borho
>
> --
> Steve Borho
>
--
Thanks & Regards
Gopu G
Multicoreware Inc