[x265] [PATCH] Using best reference from Lookahead in RPS (for non-B frames only)

Steve Borho steve at borho.org
Fri Sep 6 08:06:07 CEST 2013


On Wed, Sep 4, 2013 at 5:44 AM, <shazeb at multicorewareinc.com> wrote:

> # HG changeset patch
> # User Shazeb N Khan
> # Date 1378290786 -19800
> #      Wed Sep 04 16:03:06 2013 +0530
> # Node ID e06a76856565bcb120c497d3695340dd17044ebf
> # Parent  a4cec6558ccc149e34492852152d214da9d9d2f5
> Using best reference from Lookahead in RPS (for non-B frames only)
>
> diff -r a4cec6558ccc -r e06a76856565 source/Lib/TLibCommon/TComPic.h
> --- a/source/Lib/TLibCommon/TComPic.h   Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/Lib/TLibCommon/TComPic.h   Wed Sep 04 16:03:06 2013 +0530
> @@ -73,6 +73,8 @@
>
>  public:
>
> +    int                   m_predRefPOC[MAX_NUM_REF];
> +    int                   m_predRefCount;
>      volatile uint32_t*    m_complete_enc;       // Array of Col number
> that was finished stage encode
>
>      //** Frame Parallelism - notification between FrameEncoders of
> available motion reference rows **
> diff -r a4cec6558ccc -r e06a76856565 source/common/lowres.cpp
> --- a/source/common/lowres.cpp  Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/common/lowres.cpp  Wed Sep 04 16:03:06 2013 +0530
> @@ -57,16 +57,16 @@
>
>      intraCost = (int*)X265_MALLOC(int, cuCount);
>
> -    for (int i = 0; i < bframes + 2; i++)
> +    for (int i = 0; i < X265_BFRAME_MAX + 2; i++)
>      {
> -        for (int j = 0; j < bframes + 2; j++)
> +        for (int j = 0; j < X265_BFRAME_MAX + 2; j++)
>          {
>              rowSatds[i][j] = (int*)X265_MALLOC(int, cuHeight);
>              lowresCosts[i][j] = (uint16_t*)X265_MALLOC(uint16_t, cuCount);
>          }
>      }
>

This adds roughly 20MB to every TComPic in the encoder, and the
lookahead queue can be pretty deep (60 frames is typical).
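
A rough way to sanity-check figures like this is to compute what the
rowSatds/lowresCosts loop allocates for a given slot count. The sketch below
is only an estimate under stated assumptions: lowresCostBytes is a
hypothetical helper (not part of x265), and the lowres CU size of 8 is an
assumed value for X265_LOWRES_CU_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of x265: estimates the bytes allocated by the
// rowSatds/lowresCosts double loop when each dimension has `slots` entries.
static std::size_t lowresCostBytes(int width, int height, int slots)
{
    assert(width > 0 && height > 0 && slots > 0);
    const int cuSize = 8; // assumed value of X265_LOWRES_CU_SIZE

    // Lowres planes are half resolution, rounded up to whole CUs.
    std::size_t cuWidth  = ((width / 2) + cuSize - 1) / cuSize;
    std::size_t cuHeight = ((height / 2) + cuSize - 1) / cuSize;
    std::size_t cuCount  = cuWidth * cuHeight;

    // Each [i][j] slot holds one rowSatds array and one lowresCosts array.
    std::size_t perSlot = cuHeight * sizeof(int) + cuCount * sizeof(std::uint16_t);

    return static_cast<std::size_t>(slots) * slots * perSlot;
}
```

Calling it once with slots = bframes + 2 and once with slots =
X265_BFRAME_MAX + 2 shows how the allocation grows with the square of the
slot count.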


> -    for (int i = 0; i < bframes + 1; i++)
> +    for (int i = 0; i < X265_BFRAME_MAX + 1; i++)
>      {
>          lowresMvs[0][i] = (MV*)X265_MALLOC(MV, cuCount);
>          lowresMvs[1][i] = (MV*)X265_MALLOC(MV, cuCount);
>

This adds one more MB per picture.


> diff -r a4cec6558ccc -r e06a76856565 source/encoder/dpb.cpp
> --- a/source/encoder/dpb.cpp    Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/encoder/dpb.cpp    Wed Sep 04 16:03:06 2013 +0530
> @@ -111,7 +111,19 @@
>      // Do decoding refresh marking if any
>      decodingRefreshMarking(pocCurr, slice->getNalUnitType());
>
> -    computeRPS(pocCurr, slice->isIRAP(), slice->getLocalRPS(),
> slice->getSPS()->getMaxDecPicBuffering(0));
> +#if 0
> +    if(m_cfg->param.bframes)  // Lookahead references for B frames not in
> place yet
> +    {
> +        computeRPS(pocCurr, slice->isIRAP(), slice->getLocalRPS(),
> slice->getSPS()->getMaxDecPicBuffering(0));
> +    }
> +    else
> +    {
> +        computeRPS(pocCurr, slice->isIRAP(), slice->getLocalRPS(),
> slice->getSPS()->getMaxDecPicBuffering(0), pic->m_predRefPOC,
> pic->m_predRefCount);
> +    }
> +#else if
> +        computeRPS(pocCurr, slice->isIRAP(), slice->getLocalRPS(),
> slice->getSPS()->getMaxDecPicBuffering(0));
> +#endif
> +
>      slice->setRPS(slice->getLocalRPS());
>      slice->setRPSidx(-1);              //   To force using RPS from
> slice, rather than from SPS
>
> @@ -281,6 +293,43 @@
>      rps->sortDeltaPOC();
>  }
>
> +bool isInArray(int *arr, int size, int num)
> +{
> +    for(int i=0;i<size;i++)
> +        if(arr[i]==num)
> +        {return(true);}
> +    return(false);
> +}
>

This function has a plethora of white-space issues.
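
For illustration, the same helper with conventional spacing and bracing could
look roughly like this. The layout is an assumed house style, and the const
qualifier and the defensive assert are additions that are not in the patch.

```cpp
#include <cassert>

// Same linear search as the quoted isInArray(), reformatted. The const
// qualifier and the assert are illustrative additions, not from the patch.
bool isInArray(const int *arr, int size, int num)
{
    assert(arr != nullptr || size == 0);

    for (int i = 0; i < size; i++)
    {
        if (arr[i] == num)
        {
            return true;
        }
    }

    return false;
}
```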


> +
> +void DPB::computeRPS(int curPoc, bool isRAP, TComReferencePictureSet *
> rps, unsigned int maxDecPicBuffer, int *predRefPOC, int predRefCount)
> +{
> +    TComPic * refPic;
> +    unsigned int poci = 0, numNeg = 0, numPos = 0;
> +
> +    TComList<TComPic*>::iterator iterPic = m_picList.begin();
> +    while ((iterPic != m_picList.end()) && (poci < (maxDecPicBuffer)))
> +    {
> +        refPic = *(iterPic);
> +        if ((refPic->getPOC() != curPoc) &&
> (refPic->getSlice()->isReferenced()) && isInArray(predRefPOC, predRefCount,
> refPic->getPOC()))
> +        {
> +            rps->m_POC[poci] = refPic->getPOC();
> +            rps->m_deltaPOC[poci] = rps->m_POC[poci] - curPoc;
> +            (rps->m_deltaPOC[poci] < 0) ? numNeg++ : numPos++;
> +            rps->m_used[poci] = !isRAP;
> +            poci++;
> +        }
> +        iterPic++;
> +    }
> +
> +    rps->m_numberOfPictures = poci;
> +    rps->m_numberOfPositivePictures = numPos;
> +    rps->m_numberOfNegativePictures = numNeg;
> +    rps->m_numberOfLongtermPictures = 0;
> +    rps->m_interRPSPrediction = false;          // To be changed later
> when needed
> +
> +    rps->sortDeltaPOC();
> +}
>

I appreciate the general intent here to pick from available references, but
I think the benefit we get from this is greatly outweighed by the increased
memory use.  We will definitely need a function like this when we have
--b-adapt 2 working and many recent reference frames in the decoder to
select from, but for P frames only this is a premature optimization.

The x264 lookahead structures are designed for a simple sliding window:
references never extend beyond the user's configured POC distance
(param.bframes).  So a function like this won't be useful until we can
properly support large values of param.bframes.

It's pretty rare for frames many POC away to be better references than all
the closer frames, and measuring them all is expensive, which is why large
param.bframes values are only enabled in the slowest x264 presets.
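
As a minimal sketch of the sliding-window constraint: the original loops size
the cost tables at bframes + 2 entries per dimension, so a POC distance can
only index them when it stays below that bound. fitsCostTable is a
hypothetical name used here for illustration, not an x265 function.

```cpp
#include <cassert>

// Hypothetical helper, not part of x265. With cost tables of
// (bframes + 2) entries per dimension, a past reference is usable as a
// table index only if its POC distance is positive and below that bound.
static bool fitsCostTable(int curPoc, int refPoc, int bframes)
{
    assert(bframes >= 0);

    int distance = curPoc - refPoc; // positive when refPoc precedes curPoc
    return distance > 0 && distance < bframes + 2;
}
```

Any candidate that fails this check would need either larger tables (the
memory cost discussed above) or a fallback to a closer reference.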


> +
>  /** Function for marking the reference pictures when an
> IDR/CRA/CRANT/BLA/BLANT is encountered.
>   * \param pocCRA POC of the CRA/CRANT/BLA/BLANT picture
>   * \param bRefreshPending flag indicating if a deferred decoding refresh
> is pending
> diff -r a4cec6558ccc -r e06a76856565 source/encoder/dpb.h
> --- a/source/encoder/dpb.h      Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/encoder/dpb.h      Wed Sep 04 16:03:06 2013 +0530
> @@ -66,10 +66,13 @@
>
>      void computeRPS(int curPoc, bool isRAP, TComReferencePictureSet *
> rps, unsigned int maxDecPicBuffer);
>
> +    // Taking references from lookahead; not tested for B frames
> +    void computeRPS(int curPoc, bool isRAP, TComReferencePictureSet *
> rps, unsigned int maxDecPicBuffer, int *predRefPOC, int predRefCount);
> +
>      void applyReferencePictureSet(TComReferencePictureSet *rps, int
> curPoc);
>      void decodingRefreshMarking(int pocCurr, NalUnitType nalUnitType);
>
> -    void arrangeLongtermPicturesInRPS(TComSlice *, FrameEncoder
> *frameEncoder);
> +    void arrangeLongtermPicturesInRPS(TComSlice *, x265::FrameEncoder
> *frameEncoder);
>

This change should be dropped.


>
>      NalUnitType getNalUnitType(int curPoc, int lastIdr);
>  };
> diff -r a4cec6558ccc -r e06a76856565 source/encoder/slicetype.cpp
> --- a/source/encoder/slicetype.cpp      Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/encoder/slicetype.cpp      Wed Sep 04 16:03:06 2013 +0530
> @@ -72,6 +72,7 @@
>      merange = 16;
>      widthInCU = ((cfg->param.sourceWidth / 2) + X265_LOWRES_CU_SIZE - 1)
> >> X265_LOWRES_CU_BITS;
>      heightInCU = ((cfg->param.sourceHeight / 2) + X265_LOWRES_CU_SIZE -
> 1) >> X265_LOWRES_CU_BITS;
> +    picPrev = NULL;
>  }
>
>  Lookahead::~Lookahead()
> @@ -106,6 +107,9 @@
>          outputQueue.pushBack(pic);
>          numDecided++;
>          lastKeyframe = 0;
> +        picPrev = pic;
> +        prevRef = NULL;   // best reference of previous
> +
>          return;
>      }
>
> @@ -113,20 +117,56 @@
>      slicetypeAnalyse(false);
>
>      // This will work only in all-P config
> -    int dframes;
> +    int dframes, d0;
> +    int costRef, costPrev;
> +    int thresh = X265_BFRAME_MAX;
>      for (dframes = 0; (frames[dframes + 1] != NULL) && (frames[dframes +
> 1]->sliceType != X265_TYPE_AUTO); dframes++)
>      {}

> -
> +
>      TComPic *pic;
>      for (int i = 1; i <= dframes && i <= inputQueue.size(); i++)
>      {
>          pic = inputQueue.popFront();
> +        pic->m_predRefCount=0;
>          pic->m_lowres.gopIdx = (pic->getPOC() - 1) %
> (cfg->getGOPSizeMin());
> -        outputQueue.pushBack(pic);
>          if (pic->m_lowres.sliceType == X265_TYPE_I)
>          {
> +            picPrev = pic;
> +            prevRef = NULL;       // best reference of previous
>              lastKeyframe = pic->getPOC();
>          }
> +        else if((pic->m_lowres.sliceType == X265_TYPE_P))
> +        {
> +            if((prevRef!=NULL)&&((d0 = pic->getPOC() - prevRef->getPOC())
> < thresh))
> +            {
> +                    frames[0] = &(prevRef->m_lowres);
> +                    frames[d0]= &(pic->m_lowres);
> +                    costRef = estimateFrameCost(0, d0, d0, false);
>

What if the POC distance is greater than X265_BFRAME_MAX?  I don't think
this is going to scale generally.

> +
> +                    d0 = 1;
> +                    frames[0] = &(picPrev->m_lowres);
> +                    frames[d0]= &(pic->m_lowres);
> +                    costPrev = estimateFrameCost(0, d0, d0, false);
> +
> +                    if(costRef < costPrev)
> +                    {
> +                        pic->m_predRefPOC[0] = prevRef->getPOC();
> +                    }
> +                    else
> +                    {
> +                        pic->m_predRefPOC[0] = picPrev->getPOC();
> +                        prevRef = picPrev;
> +                    }
> +            }
> +            else
> +            {
> +                pic->m_predRefPOC[0] = picPrev->getPOC();
> +                prevRef = picPrev;
> +            }
> +            pic->m_predRefCount=1;
> +            picPrev = pic;
> +        }
> +        outputQueue.pushBack(pic);
>      }
>
>  #else // if 0
> diff -r a4cec6558ccc -r e06a76856565 source/encoder/slicetype.h
> --- a/source/encoder/slicetype.h        Wed Sep 04 15:38:29 2013 +0530
> +++ b/source/encoder/slicetype.h        Wed Sep 04 16:03:06 2013 +0530
> @@ -47,6 +47,7 @@
>      int              merange;
>      int              numDecided;
>      int              lastKeyframe;
> +    TComPic         *picPrev, *prevRef;
>      int              widthInCU;       // width of lowres frame in
> downscale CUs
>      int              heightInCU;      // height of lowres frame in
> downscale CUs
>
>


-- 
Steve Borho