[x265] [PATCH] [REVIEW PATCH]Use qp value of encoded CU, instead slice qp value

Ashok Kumar Mishra ashok at multicorewareinc.com
Thu Mar 10 07:33:59 CET 2016


Yes, this patch can be pushed today.
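
For readers skimming the thread, the idea behind the quoted patch can be sketched in isolation: SAO's rate-distortion lambdas are derived from the QP of the CTU being filtered rather than from a single slice QP. The code below is a minimal standalone sketch, not the x265 implementation. The names `lambda2ForQp`, `chromaQp420`, `saoLambdaForCtu`, and `lambdaIndexForPlane` are hypothetical. The lambda model (0.85 * lambda^2 with lambda = 2^(qp/6 - 2)) is an assumed stand-in for the encoder's `x265_lambda2_tab` lookup, the 4:2:0 chroma mapping follows the HEVC specification's chroma QP table as a stand-in for `g_chromaScale`, and the cap of 51 for other chroma formats is assumed to correspond to `QP_MAX_SPEC`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a squared-lambda table lookup.
// Assumed model: lambda = 2^(qp/6 - 2), lambda2 = 0.85 * lambda^2.
double lambda2ForQp(int qp)
{
    assert(qp >= 0);
    const double lambda = std::pow(2.0, qp / 6.0 - 2.0);
    return 0.85 * lambda * lambda;
}

// Luma-to-chroma QP mapping for 4:2:0 content, following the HEVC
// specification's chroma QP table (input clamped to 0..57 here).
int chromaQp420(int qpi)
{
    static const int kTable[14] = { 29, 30, 31, 32, 33, 33, 34,
                                    34, 35, 35, 36, 36, 37, 37 };
    qpi = std::max(0, std::min(qpi, 57));
    if (qpi < 30)
        return qpi;
    if (qpi <= 43)
        return kTable[qpi - 30];
    return qpi - 6;
}

struct SaoLambda
{
    double luma;
    double chroma;
};

// Derive both lambdas from the QP actually used for this CTU,
// instead of from one slice-wide QP.
SaoLambda saoLambdaForCtu(int ctuQp, int cbQpOffset, bool is420)
{
    int qpCb;
    if (is420)
        qpCb = chromaQp420(ctuQp + cbQpOffset);
    else
        qpCb = std::max(0, std::min(ctuQp + cbQpOffset, 51)); // assumed QP_MAX_SPEC

    SaoLambda result = { lambda2ForQp(ctuQp), lambda2ForQp(qpCb) };
    return result;
}

// Planes 0 (Y), 1 (Cb), 2 (Cr) share a two-entry lambda array:
// luma uses index 0, both chroma planes use index 1.
int lambdaIndexForPlane(int plane)
{
    return plane ? 1 : 0;
}
```

With dqp enabled, two CTUs in the same slice can carry different QPs, so calling `saoLambdaForCtu` once per CTU yields a different lambda pair for each, which is the behavior the patch is after.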

On Thu, Mar 10, 2016 at 11:50 AM, Deepthi Nandakumar <
deepthi at multicorewareinc.com> wrote:

> This is a nice logical improvement. Anything stopping this patch from
> being pushed in today?
>
> On Wed, Mar 9, 2016 at 8:21 PM, Ashok Kumar Mishra <
> ashok at multicorewareinc.com> wrote:
>
>> Below are some test results after using the same qp for encoding the CU
>> and SAO distortion calculation.
>>
>> *Before(crowd_run_2160p50, preset - medium,  --bitrate 9000)*
>>  encoded 500 frames in 262.47s (1.90 fps), 8956.57 kb/s, Avg QP:43.43,
>> Global PSNR: 30.941, SSIM Mean Y: 0.7714149 ( 6.410 dB)
>>
>>  *After*
>>  encoded 500 frames in 270.96s (1.85 fps), 8954.41 kb/s, Avg QP:43.43,
>> Global PSNR: 30.978, SSIM Mean Y: 0.7734143 ( 6.448 dB)
>> --------------------------------------------------------
>>
>> *Before(crowd_run_1080p50, preset - medium)*
>>  encoded 500 frames in 81.79s (6.11 fps), 9694.14 kb/s, Avg QP:37.79,
>> Global PSNR: 31.037, SSIM Mean Y: 0.8414618 ( 7.999 dB)
>>
>>  *After*
>>  encoded 500 frames in 81.04s (6.17 fps), 9686.80 kb/s, Avg QP:37.79,
>> Global PSNR: 31.044, SSIM Mean Y: 0.8417730 ( 8.007 dB)
>>  -----------------------------------------------
>>
>>  *Before(crowd_run_1080p50, preset - veryslow)*
>>  encoded 500 frames in 993.57s (0.50 fps), 9519.62 kb/s, Avg QP:37.78,
>> Global PSNR: 31.498, SSIM Mean Y: 0.8494004 ( 8.222 dB)
>>
>>  *After*
>>  encoded 500 frames in 1000.08s (0.50 fps), 9511.11 kb/s, Avg QP:37.78,
>> Global PSNR: 31.503, SSIM Mean Y: 0.8496846 ( 8.230 dB)
>>
>>  ------------------------------------------------
>>  *Before(medium, crowd_run_1080p50, --bitrate 7000)*
>>  encoded 500 frames in 78.54s (6.37 fps), 6949.02 kb/s, Avg QP:39.77,
>> Global PSNR: 29.962, SSIM Mean Y: 0.8069742 ( 7.144 dB)
>>
>>  *After*
>>  encoded 500 frames in 75.58s (6.62 fps), 6950.09 kb/s, Avg QP:39.76,
>> Global PSNR: 29.973, SSIM Mean Y: 0.8075229 ( 7.156 dB)
>>  -------------------------------------------------
>>
>>  *Before(veryslow, crowd_run_1080p50, --bitrate 7000)*
>>  encoded 500 frames in 864.42s (0.58 fps), 6945.49 kb/s, Avg QP:39.73,
>> Global PSNR: 30.418, SSIM Mean Y: 0.8157191 ( 7.345 dB)
>>
>>  *After*
>>  encoded 500 frames in 860.32s (0.58 fps), 6945.18 kb/s, Avg QP:39.73,
>> Global PSNR: 30.429, SSIM Mean Y: 0.8160904 ( 7.354 dB)
>>
>>
>>
>> On Wed, Mar 9, 2016 at 8:11 PM, <ashok at multicorewareinc.com> wrote:
>>
>>> # HG changeset patch
>>> # User Ashok Kumar Mishra<ashok at multicorewareinc.com>
>>> # Date 1457424244 -19800
>>> #      Tue Mar 08 13:34:04 2016 +0530
>>> # Node ID 047095eba0a40f2298df74c37abae7a2e31f4bce
>>> # Parent  88aebc166fa8e16f91d5f0acce77690003be9d91
>>> [REVIEW PATCH]Use qp value of encoded CU, instead slice qp value
>>> When dqp is enabled, qp value applied on CU for encoding is different
>>> from slice qp. So in this case the slice qp value
>>> should not be used for SAO offset distortion calculation. That means the
>>> same qp value used for encoding CU must be used
>>> for SAO offset distortion calculation.
>>>
>>> diff -r 88aebc166fa8 -r 047095eba0a4 source/encoder/frameencoder.cpp
>>> --- a/source/encoder/frameencoder.cpp   Fri Mar 04 16:59:45 2016 +0530
>>> +++ b/source/encoder/frameencoder.cpp   Tue Mar 08 13:34:04 2016 +0530
>>> @@ -439,7 +439,7 @@
>>>
>>>      m_initSliceContext.resetEntropy(*slice);
>>>
>>> -    m_frameFilter.start(m_frame, m_initSliceContext, qp);
>>> +    m_frameFilter.start(m_frame, m_initSliceContext);
>>>
>>>      /* ensure all rows are blocked prior to initializing row CTU
>>> counters */
>>>      WaveFront::clearEnabledRowMask();
>>> diff -r 88aebc166fa8 -r 047095eba0a4 source/encoder/framefilter.cpp
>>> --- a/source/encoder/framefilter.cpp    Fri Mar 04 16:59:45 2016 +0530
>>> +++ b/source/encoder/framefilter.cpp    Tue Mar 08 13:34:04 2016 +0530
>>> @@ -103,7 +103,7 @@
>>>
>>>  }
>>>
>>> -void FrameFilter::start(Frame *frame, Entropy& initState, int qp)
>>> +void FrameFilter::start(Frame *frame, Entropy& initState)
>>>  {
>>>      m_frame = frame;
>>>
>>> @@ -113,7 +113,7 @@
>>>          for(int row = 0; row < m_numRows; row++)
>>>          {
>>>              if (m_param->bEnableSAO)
>>> -                m_parallelFilter[row].m_sao.startSlice(frame,
>>> initState, qp);
>>> +                m_parallelFilter[row].m_sao.startSlice(frame,
>>> initState);
>>>
>>>              m_parallelFilter[row].m_lastCol.set(0);
>>>              m_parallelFilter[row].m_allowedCol.set(0);
>>> diff -r 88aebc166fa8 -r 047095eba0a4 source/encoder/framefilter.h
>>> --- a/source/encoder/framefilter.h      Fri Mar 04 16:59:45 2016 +0530
>>> +++ b/source/encoder/framefilter.h      Tue Mar 08 13:34:04 2016 +0530
>>> @@ -127,7 +127,7 @@
>>>      void init(Encoder *top, FrameEncoder *frame, int numRows, uint32_t
>>> numCols);
>>>      void destroy();
>>>
>>> -    void start(Frame *pic, Entropy& initState, int qp);
>>> +    void start(Frame *pic, Entropy& initState);
>>>
>>>      void processRow(int row);
>>>      void processPostRow(int row);
>>> diff -r 88aebc166fa8 -r 047095eba0a4 source/encoder/sao.cpp
>>> --- a/source/encoder/sao.cpp    Fri Mar 04 16:59:45 2016 +0530
>>> +++ b/source/encoder/sao.cpp    Tue Mar 08 13:34:04 2016 +0530
>>> @@ -76,8 +76,6 @@
>>>      m_countPreDblk = NULL;
>>>      m_offsetOrgPreDblk = NULL;
>>>      m_refDepth = 0;
>>> -    m_lumaLambda = 0;
>>> -    m_chromaLambda = 0;
>>>      m_param = NULL;
>>>      m_clipTable = NULL;
>>>      m_clipTableBase = NULL;
>>> @@ -226,17 +224,10 @@
>>>          saoParam->ctuParam[i] = new SaoCtuParam[m_numCuInHeight *
>>> m_numCuInWidth];
>>>  }
>>>
>>> -void SAO::startSlice(Frame* frame, Entropy& initState, int qp)
>>> +void SAO::startSlice(Frame* frame, Entropy& initState)
>>>  {
>>> -    Slice* slice = frame->m_encData->m_slice;
>>> -    int qpCb = qp;
>>> -    if (m_param->internalCsp == X265_CSP_I420)
>>> -        qpCb = x265_clip3(QP_MIN, QP_MAX_MAX, (int)g_chromaScale[qp +
>>> slice->m_pps->chromaQpOffset[0]]);
>>> -    else
>>> -        qpCb = X265_MIN(qp + slice->m_pps->chromaQpOffset[0],
>>> QP_MAX_SPEC);
>>> -    m_lumaLambda = x265_lambda2_tab[qp];
>>> -    m_chromaLambda = x265_lambda2_tab[qpCb]; // Use Cb QP for SAO chroma
>>>      m_frame = frame;
>>> +    Slice* slice = m_frame->m_encData->m_slice;
>>>
>>>      switch (slice->m_sliceType)
>>>      {
>>> @@ -1197,7 +1188,22 @@
>>>
>>>  void SAO::rdoSaoUnitCu(SAOParam* saoParam, int rowBaseAddr, int idxX,
>>> int addr)
>>>  {
>>> -    double lambda[3] = {m_lumaLambda, m_chromaLambda, m_chromaLambda};
>>> +    Slice* slice = m_frame->m_encData->m_slice;
>>> +//    int qp = slice->m_sliceQp;
>>> +    const CUData* cu = m_frame->m_encData->getPicCTU(addr);
>>> +    int qp = cu->m_qp[0];
>>> +
>>> +    double lambda[2] = {0.0};
>>> +
>>> +    int qpCb = qp;
>>> +    if (m_param->internalCsp == X265_CSP_I420)
>>> +        qpCb = x265_clip3(QP_MIN, QP_MAX_MAX, (int)g_chromaScale[qp +
>>> slice->m_pps->chromaQpOffset[0]]);
>>> +    else
>>> +        qpCb = X265_MIN(qp + slice->m_pps->chromaQpOffset[0],
>>> QP_MAX_SPEC);
>>> +
>>> +    lambda[0] = x265_lambda2_tab[qp];
>>> +    lambda[1] = x265_lambda2_tab[qpCb]; // Use Cb QP for SAO chroma
>>> +
>>>      const bool allowMerge[2] = {(idxX != 0), (rowBaseAddr != 0)}; //
>>> left, up
>>>
>>>      const int addrMerge[2] = {(idxX ? addr - 1 : -1), (rowBaseAddr ?
>>> addr - m_numCuInWidth : -1)};// left, up
>>> @@ -1242,9 +1248,9 @@
>>>      saoStatsInitialOffset(chroma);
>>>
>>>      double mergeDist[NUM_MERGE_MODE] = { 0.0 };
>>> -    saoLumaComponentParamDist(saoParam, addr, mergeDist);
>>> +    saoLumaComponentParamDist(saoParam, addr, mergeDist, lambda);
>>>      if (chroma)
>>> -        saoChromaComponentParamDist(saoParam, addr, mergeDist);
>>> +        saoChromaComponentParamDist(saoParam, addr, mergeDist, lambda);
>>>
>>>      if (saoParam->bSaoFlag[0] || saoParam->bSaoFlag[1])
>>>      {
>>> @@ -1286,10 +1292,9 @@
>>>                      }
>>>                  }
>>>
>>> -                mergeDist[mergeIdx + 1] += ((double)estDist /
>>> lambda[plane]);
>>> +                mergeDist[mergeIdx + 1] += ((double)estDist /
>>> lambda[!!plane]);
>>>              }
>>>
>>> -
>>>              m_entropyCoder.load(m_rdContexts.cur);
>>>              m_entropyCoder.resetBits();
>>>              if (allowMerge[0])
>>> @@ -1412,7 +1417,7 @@
>>>      return bestOffset;
>>>  }
>>>
>>> -void SAO::saoLumaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist)
>>> +void SAO::saoLumaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist, double* lambda)
>>>  {
>>>      int64_t bestDist = 0;
>>>      int bestTypeIdx = -1;
>>> @@ -1426,7 +1431,7 @@
>>>      m_entropyCoder.resetBits();
>>>      m_entropyCoder.codeSaoType(0);
>>>
>>> -    double dCostPartBest = m_entropyCoder.getNumberOfWrittenBits() *
>>> m_lumaLambda;
>>> +    double dCostPartBest = m_entropyCoder.getNumberOfWrittenBits() *
>>> lambda[0];
>>>
>>>      //EO distortion calculation
>>>      for (int typeIdx = 0; typeIdx < MAX_NUM_SAO_TYPE - 1; typeIdx++)
>>> @@ -1439,7 +1444,7 @@
>>>              int32_t& offsetOut = m_offset[0][typeIdx][classIdx];
>>>
>>>              if (count)
>>> -                offsetOut = estIterOffset(typeIdx, m_lumaLambda,
>>> offsetOut, count, offsetOrg, distBOClasses[0], costBOClasses[0]);
>>> +                offsetOut = estIterOffset(typeIdx, lambda[0],
>>> offsetOut, count, offsetOrg, distBOClasses[0], costBOClasses[0]);
>>>              else
>>>                  offsetOut = 0;
>>>
>>> @@ -1451,7 +1456,7 @@
>>>          m_entropyCoder.codeSaoOffsetEO(m_offset[0][typeIdx] + 1,
>>> typeIdx, 0);
>>>
>>>          uint32_t estRate = m_entropyCoder.getNumberOfWrittenBits();
>>> -        double cost = (double)estDist + m_lumaLambda * (double)estRate;
>>> +        double cost = (double)estDist + lambda[0] * (double)estRate;
>>>
>>>          if (cost < dCostPartBest)
>>>          {
>>> @@ -1479,10 +1484,10 @@
>>>          int32_t& offsetOut = m_offset[0][SAO_BO][classIdx];
>>>
>>>          distBOClasses[classIdx] = 0;
>>> -        costBOClasses[classIdx] = m_lumaLambda;
>>> +        costBOClasses[classIdx] = lambda[0];
>>>
>>>          if (count)
>>> -            offsetOut = estIterOffset(SAO_BO, m_lumaLambda, offsetOut,
>>> count, offsetOrg, distBOClasses[classIdx], costBOClasses[classIdx]);
>>> +            offsetOut = estIterOffset(SAO_BO, lambda[0], offsetOut,
>>> count, offsetOrg, distBOClasses[classIdx], costBOClasses[classIdx]);
>>>          else
>>>              offsetOut = 0;
>>>      }
>>> @@ -1513,7 +1518,7 @@
>>>      m_entropyCoder.codeSaoOffsetBO(m_offset[0][SAO_BO] + bestClassBO,
>>> bestClassBO, 0);
>>>
>>>      uint32_t estRate = m_entropyCoder.getNumberOfWrittenBits();
>>> -    double cost = (double)estDist + m_lumaLambda * (double)estRate;
>>> +    double cost = (double)estDist + lambda[0] * (double)estRate;
>>>
>>>      if (cost < dCostPartBest)
>>>      {
>>> @@ -1527,13 +1532,13 @@
>>>              lclCtuParam->offset[classIdx] =
>>> (int)m_offset[0][SAO_BO][classIdx + bestClassBO];
>>>      }
>>>
>>> -    mergeDist[0] = ((double)bestDist / m_lumaLambda);
>>> +    mergeDist[0] = ((double)bestDist / lambda[0]);
>>>      m_entropyCoder.load(m_rdContexts.temp);
>>>      m_entropyCoder.codeSaoOffset(*lclCtuParam, 0);
>>>      m_entropyCoder.store(m_rdContexts.temp);
>>>  }
>>>
>>> -void SAO::saoChromaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist)
>>> +void SAO::saoChromaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist, double* lambda)
>>>  {
>>>      int64_t bestDist = 0;
>>>      int bestTypeIdx = -1;
>>> @@ -1548,7 +1553,7 @@
>>>      m_entropyCoder.resetBits();
>>>      m_entropyCoder.codeSaoType(0);
>>>
>>> -    double costPartBest = m_entropyCoder.getNumberOfWrittenBits() *
>>> m_chromaLambda;
>>> +    double costPartBest = m_entropyCoder.getNumberOfWrittenBits() *
>>> lambda[1];
>>>
>>>      //EO RDO
>>>      for (int typeIdx = 0; typeIdx < MAX_NUM_SAO_TYPE - 1; typeIdx++)
>>> @@ -1563,7 +1568,7 @@
>>>                  int32_t& offsetOut =
>>> m_offset[compIdx][typeIdx][classIdx];
>>>
>>>                  if (count)
>>> -                    offsetOut = estIterOffset(typeIdx, m_chromaLambda,
>>> offsetOut, count, offsetOrg, distBOClasses[0], costBOClasses[0]);
>>> +                    offsetOut = estIterOffset(typeIdx, lambda[1],
>>> offsetOut, count, offsetOrg, distBOClasses[0], costBOClasses[0]);
>>>                  else
>>>                      offsetOut = 0;
>>>
>>> @@ -1578,7 +1583,7 @@
>>>              m_entropyCoder.codeSaoOffsetEO(m_offset[compIdx +
>>> 1][typeIdx] + 1, typeIdx, compIdx + 1);
>>>
>>>          uint32_t estRate = m_entropyCoder.getNumberOfWrittenBits();
>>> -        double cost = (double)(estDist[0] + estDist[1]) +
>>> m_chromaLambda * (double)estRate;
>>> +        double cost = (double)(estDist[0] + estDist[1]) + lambda[1] *
>>> (double)estRate;
>>>
>>>          if (cost < costPartBest)
>>>          {
>>> @@ -1615,10 +1620,10 @@
>>>              int32_t& offsetOut = m_offset[compIdx][SAO_BO][classIdx];
>>>
>>>              distBOClasses[classIdx] = 0;
>>> -            costBOClasses[classIdx] = m_chromaLambda;
>>> +            costBOClasses[classIdx] = lambda[1];
>>>
>>>              if (count)
>>> -                offsetOut = estIterOffset(SAO_BO, m_chromaLambda,
>>> offsetOut, count, offsetOrg, distBOClasses[classIdx],
>>> costBOClasses[classIdx]);
>>> +                offsetOut = estIterOffset(SAO_BO, lambda[1], offsetOut,
>>> count, offsetOrg, distBOClasses[classIdx], costBOClasses[classIdx]);
>>>              else
>>>                  offsetOut = 0;
>>>          }
>>> @@ -1648,7 +1653,7 @@
>>>          m_entropyCoder.codeSaoOffsetBO(m_offset[compIdx + 1][SAO_BO] +
>>> bestClassBO[compIdx], bestClassBO[compIdx], compIdx + 1);
>>>
>>>      uint32_t estRate = m_entropyCoder.getNumberOfWrittenBits();
>>> -    double cost = (double)(estDist[0] + estDist[1]) + m_chromaLambda *
>>> (double)estRate;
>>> +    double cost = (double)(estDist[0] + estDist[1]) + lambda[1] *
>>> (double)estRate;
>>>
>>>      if (cost < costPartBest)
>>>      {
>>> @@ -1665,7 +1670,7 @@
>>>          }
>>>      }
>>>
>>> -    mergeDist[0] += ((double)bestDist / m_chromaLambda);
>>> +    mergeDist[0] += ((double)bestDist / lambda[1]);
>>>      m_entropyCoder.load(m_rdContexts.temp);
>>>      m_entropyCoder.codeSaoOffset(*lclCtuParam[0], 1);
>>>      m_entropyCoder.codeSaoOffset(*lclCtuParam[1], 2);
>>> diff -r 88aebc166fa8 -r 047095eba0a4 source/encoder/sao.h
>>> --- a/source/encoder/sao.h      Fri Mar 04 16:59:45 2016 +0530
>>> +++ b/source/encoder/sao.h      Tue Mar 08 13:34:04 2016 +0530
>>> @@ -114,10 +114,6 @@
>>>      int         m_refDepth;
>>>      int         m_numNoSao[2];
>>>
>>> -    double      m_lumaLambda;
>>> -    double      m_chromaLambda;
>>> -    /* TODO: No doubles for distortion */
>>> -
>>>      SAO();
>>>
>>>      bool create(x265_param* param, int initCommon);
>>> @@ -126,7 +122,7 @@
>>>
>>>      void allocSaoParam(SAOParam* saoParam) const;
>>>
>>> -    void startSlice(Frame* pic, Entropy& initState, int qp);
>>> +    void startSlice(Frame* pic, Entropy& initState);
>>>      void resetStats();
>>>
>>>      // CTU-based SAO process without slice granularity
>>> @@ -138,8 +134,8 @@
>>>      void calcSaoStatsCu(int addr, int plane);
>>>      void calcSaoStatsCu_BeforeDblk(Frame* pic, int idxX, int idxY);
>>>
>>> -    void saoLumaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist);
>>> -    void saoChromaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist);
>>> +    void saoLumaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist, double* lambda);
>>> +    void saoChromaComponentParamDist(SAOParam* saoParam, int addr,
>>> double* mergeDist, double* lambda);
>>>
>>>      inline int estIterOffset(int typeIdx, double lambda, int offset,
>>> int32_t count, int32_t offsetOrg,
>>>                               int& currentDistortionTableBo, double&
>>> currentRdCostTableBo);
>>>
>>
>>
>> _______________________________________________
>> x265-devel mailing list
>> x265-devel at videolan.org
>> https://mailman.videolan.org/listinfo/x265-devel
>>
>>
>
>
> --
> Deepthi Nandakumar
> Engineering Manager, x265
> Multicoreware, Inc
>