[x265-commits] [x265] rc: fixes for 2 pass + vbv to calculate frameSizePlanned ...

Aarthi at videolan.org Aarthi at videolan.org
Wed Sep 17 10:40:42 CEST 2014


details:   http://hg.videolan.org/x265/rev/7c1aba99f40d
branches:  
changeset: 8071:7c1aba99f40d
user:      Aarthi Thirumalai
date:      Mon Sep 15 10:33:53 2014 +0530
description:
rc: fixes for 2 pass + vbv to calculate frameSizePlanned accurately.
Subject: [x265] search: save a few cycles

details:   http://hg.videolan.org/x265/rev/a1fc4e9bba51
branches:  
changeset: 8072:a1fc4e9bba51
user:      Steve Borho <steve at borho.org>
date:      Mon Sep 15 17:29:40 2014 +0200
description:
search: save a few cycles
Subject: [x265] denoiseDct: test bench code

details:   http://hg.videolan.org/x265/rev/63a78516630c
branches:  
changeset: 8073:63a78516630c
user:      Praveen Tiwari
date:      Tue Sep 16 12:20:30 2014 +0530
description:
denoiseDct: test bench code
Subject: [x265] analysis: intra picture estimation (mode and split decision) information sharing

details:   http://hg.videolan.org/x265/rev/7784ad03d6d4
branches:  
changeset: 8074:7784ad03d6d4
user:      Gopu Govindaswamy <gopu at multicorewareinc.com>
date:      Tue Sep 16 16:50:56 2014 +0530
description:
analysis: intra picture estimation (mode and split decision) information sharing

When --analysis-mode=save, the encoder runs a full encode and dumps the best
split and mode decisions into a file (x265_analysis.dat is the default name if
no file name is provided).
When --analysis-mode=load, the encoder reads the best split and mode decisions
from x265_analysis.dat, bypasses the actual split and mode decisions, and
therefore performs a much faster encode.
Subject: [x265] analysis: nits

details:   http://hg.videolan.org/x265/rev/b276d567d771
branches:  
changeset: 8075:b276d567d771
user:      Steve Borho <steve at borho.org>
date:      Tue Sep 16 13:41:45 2014 +0200
description:
analysis: nits
Subject: [x265] analysis: add CU specific details to encodeCU()

details:   http://hg.videolan.org/x265/rev/06bac60ee4cf
branches:  
changeset: 8076:06bac60ee4cf
user:      Santhoshini Sekar <santhoshini at multicorewareinc.com>
date:      Tue Sep 16 11:53:32 2014 +0530
description:
analysis: add CU specific details to encodeCU()
Subject: [x265] api: do not reuse the analysisData buffer for more than one picture, set it NULL

details:   http://hg.videolan.org/x265/rev/d71d363c0dbb
branches:  
changeset: 8077:d71d363c0dbb
user:      Gopu Govindaswamy <gopu at multicorewareinc.com>
date:      Tue Sep 16 17:16:53 2014 +0530
description:
api: do not reuse the analysisData buffer for more than one picture, set it NULL
Subject: [x265] add fanout validation module to check param compatibility

details:   http://hg.videolan.org/x265/rev/199e8f2e0d54
branches:  
changeset: 8078:199e8f2e0d54
user:      Sagar Kotecha <sagar at multicorewareinc.com>
date:      Tue Sep 16 17:50:06 2014 +0530
description:
add fanout validation module to check param compatibility
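
The save/load split described for changeset 8074 above can be pictured as a
round trip through a file. The following is only a minimal sketch under stated
assumptions: IntraDecisions, saveDecisions and loadDecisions are hypothetical
names, and the byte layout is invented for illustration. It is not the real
x265_analysis.dat format, which this mail does not show.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical container for one picture's intra decisions. The field names
// echo the depth/modes/partSizes buffers in the patch, but the on-disk layout
// used below is an assumption, not the real x265_analysis.dat format.
struct IntraDecisions
{
    std::vector<uint8_t> depth;     // best split depth per partition
    std::vector<uint8_t> modes;     // best luma intra direction per partition
    std::vector<char>    partSizes; // best partition size per partition
};

// "save" side: write the decisions produced by a full analysis pass.
bool saveDecisions(const char* path, const IntraDecisions& d)
{
    // all three buffers must describe the same set of partitions
    assert(d.depth.size() == d.modes.size() && d.depth.size() == d.partSizes.size());

    FILE* f = std::fopen(path, "wb");
    if (!f)
        return false;

    uint32_t n = static_cast<uint32_t>(d.depth.size());
    bool ok = std::fwrite(&n, sizeof(n), 1, f) == 1 &&
              std::fwrite(d.depth.data(), 1, n, f) == n &&
              std::fwrite(d.modes.data(), 1, n, f) == n &&
              std::fwrite(d.partSizes.data(), 1, n, f) == n;
    return std::fclose(f) == 0 && ok;
}

// "load" side: read the decisions back so the search can be skipped.
bool loadDecisions(const char* path, IntraDecisions& d)
{
    FILE* f = std::fopen(path, "rb");
    if (!f)
        return false;

    uint32_t n = 0;
    bool ok = std::fread(&n, sizeof(n), 1, f) == 1;
    if (ok)
    {
        d.depth.resize(n);
        d.modes.resize(n);
        d.partSizes.resize(n);
        ok = std::fread(d.depth.data(), 1, n, f) == n &&
             std::fread(d.modes.data(), 1, n, f) == n &&
             std::fread(d.partSizes.data(), 1, n, f) == n;
    }
    std::fclose(f);
    return ok;
}
```

In this picture, a first run would call saveDecisions() once the full analysis
has finished, and a later run would call loadDecisions() and reuse the stored
depth, mode and partition size values instead of searching for them again.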

diffstat:

 source/Lib/TLibCommon/TComRom.cpp |   12 ++
 source/Lib/TLibCommon/TComRom.h   |    2 +
 source/common/param.cpp           |    2 +-
 source/common/param.h             |    2 +
 source/encoder/analysis.cpp       |  171 +++++++++++++++++++++++++++++++++++--
 source/encoder/analysis.h         |    3 +-
 source/encoder/api.cpp            |    8 +
 source/encoder/encoder.cpp        |   11 +-
 source/encoder/entropy.cpp        |   39 ++++----
 source/encoder/entropy.h          |    2 +-
 source/encoder/frameencoder.cpp   |    2 +
 source/encoder/ratecontrol.cpp    |   19 ++-
 source/encoder/search.cpp         |   65 ++++++++++++-
 source/encoder/search.h           |    1 +
 source/test/mbdstharness.cpp      |   58 ++++++++++++
 source/test/mbdstharness.h        |    8 +
 source/test/testharness.h         |    1 +
 source/x265.cpp                   |  120 +++++++++++++++++++++++++-
 18 files changed, 464 insertions(+), 62 deletions(-)

diffs (truncated from 861 to 300 lines):

diff -r 1de67321275e -r 199e8f2e0d54 source/Lib/TLibCommon/TComRom.cpp
--- a/source/Lib/TLibCommon/TComRom.cpp	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/Lib/TLibCommon/TComRom.cpp	Tue Sep 16 17:50:06 2014 +0530
@@ -505,5 +505,17 @@ const uint8_t g_intraFilterFlags[35] =
     0x38, 
 };
 
+/* Contains how much to increment shared depth buffer for different ctu sizes to get next best depth
+ * here, depth 0 = 64x64, depth 1 = 32x32, depth 2 = 16x16 and depth 3 = 8x8
+ * if ctu = 64, depth buffer size is 256 combination of depth values 0, 1, 2, 3
+ * if ctu = 32, depth buffer size is 64 combination of depth values 1, 2, 3
+ * if ctu = 16, depth buffer size is 16 combination of depth values 2, 3 */
+const uint32_t g_depthInc[3][4] =
+{
+    { 16,  4,  0, 0},
+    { 64, 16,  4, 1},
+    {256, 64, 16, 4}
+};
+
 }
 //! \}
diff -r 1de67321275e -r 199e8f2e0d54 source/Lib/TLibCommon/TComRom.h
--- a/source/Lib/TLibCommon/TComRom.h	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/Lib/TLibCommon/TComRom.h	Tue Sep 16 17:50:06 2014 +0530
@@ -155,6 +155,8 @@ extern const uint8_t x265_exp2_lut[64];
 // Intra tables
 extern const uint8_t g_intraFilterFlags[35];
 
+extern const uint32_t g_depthInc[3][4];
+
 }
 
 #endif  //ifndef X265_TCOMROM_H
diff -r 1de67321275e -r 199e8f2e0d54 source/common/param.cpp
--- a/source/common/param.cpp	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/common/param.cpp	Tue Sep 16 17:50:06 2014 +0530
@@ -1191,7 +1191,7 @@ char *x265_param2string(x265_param *p)
 {
     char *buf, *s;
 
-    buf = s = X265_MALLOC(char, 2000);
+    buf = s = X265_MALLOC(char, MAXPARAMSIZE);
     if (!buf)
         return NULL;
 
diff -r 1de67321275e -r 199e8f2e0d54 source/common/param.h
--- a/source/common/param.h	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/common/param.h	Tue Sep 16 17:50:06 2014 +0530
@@ -38,6 +38,8 @@ bool  parseLambdaFile(x265_param *param)
 
 /* this table is kept internal to avoid confusion, since log level indices start at -1 */
 static const char * const logLevelNames[] = { "none", "error", "warning", "info", "debug", "full", 0 };
+
+#define MAXPARAMSIZE 2000
 }
 
 #endif // ifndef X265_PARAM_H
diff -r 1de67321275e -r 199e8f2e0d54 source/encoder/analysis.cpp
--- a/source/encoder/analysis.cpp	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/encoder/analysis.cpp	Tue Sep 16 17:50:06 2014 +0530
@@ -301,7 +301,6 @@ void Analysis::compressCU(TComDataCU* cu
 {
     if (cu->m_slice->m_pps->bUseDQP)
         m_bEncodeDQP = true;
-    loadCTUData(cu);
 
     // initialize CU data
     m_bestCU[0]->initCU(cu->m_pic, cu->getAddr());
@@ -311,14 +310,25 @@ void Analysis::compressCU(TComDataCU* cu
     uint32_t numPartition = cu->getTotalNumPart();
     if (m_bestCU[0]->m_slice->m_sliceType == I_SLICE)
     {
-        compressIntraCU(m_bestCU[0], m_tempCU[0], false, cu, cu->m_CULocalData);
-        if (m_param->analysisMode == 1)
+        if (m_param->analysisMode == X265_ANALYSIS_LOAD && m_bestCU[0]->m_pic->m_intraData)
         {
-            memcpy(&m_bestCU[0]->m_pic->m_intraData->depth[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getDepth(), sizeof(uint8_t) * cu->getTotalNumPart());
-            memcpy(&m_bestCU[0]->m_pic->m_intraData->modes[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getLumaIntraDir(), sizeof(uint8_t) * cu->getTotalNumPart());
-            memcpy(&m_bestCU[0]->m_pic->m_intraData->partSizes[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getPartitionSize(), sizeof(char) * cu->getTotalNumPart());
-            m_bestCU[0]->m_pic->m_intraData->cuAddr[cu->getAddr()] = cu->getAddr();
-            m_bestCU[0]->m_pic->m_intraData->poc[cu->getAddr()]    = cu->m_pic->m_POC;
+            uint32_t zOrder = 0;
+            compressSharedIntraCTU(m_bestCU[0], m_tempCU[0], false, cu, cu->m_CULocalData, 
+                &m_bestCU[0]->m_pic->m_intraData->depth[cu->getAddr() * cu->m_numPartitions],
+                &m_bestCU[0]->m_pic->m_intraData->partSizes[cu->getAddr() * cu->m_numPartitions],
+                &m_bestCU[0]->m_pic->m_intraData->modes[cu->getAddr() * cu->m_numPartitions], zOrder);
+        }
+        else
+        {
+            compressIntraCU(m_bestCU[0], m_tempCU[0], false, cu, cu->m_CULocalData);
+            if (m_param->analysisMode == X265_ANALYSIS_SAVE && m_bestCU[0]->m_pic->m_intraData)
+            {
+                memcpy(&m_bestCU[0]->m_pic->m_intraData->depth[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getDepth(), sizeof(uint8_t) * cu->getTotalNumPart());
+                memcpy(&m_bestCU[0]->m_pic->m_intraData->modes[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getLumaIntraDir(), sizeof(uint8_t) * cu->getTotalNumPart());
+                memcpy(&m_bestCU[0]->m_pic->m_intraData->partSizes[cu->getAddr() * cu->m_numPartitions], m_bestCU[0]->getPartitionSize(), sizeof(char) * cu->getTotalNumPart());
+                m_bestCU[0]->m_pic->m_intraData->cuAddr[cu->getAddr()] = cu->getAddr();
+                m_bestCU[0]->m_pic->m_intraData->poc[cu->getAddr()]    = cu->m_pic->m_POC;
+            }
         }
         if (m_param->bLogCuStats || m_param->rc.bStatWrite)
         {
@@ -424,9 +434,9 @@ void Analysis::compressIntraCU(TComDataC
     if (cu_unsplit_flag)
     {
         m_quant.setQPforQuant(outTempCU);
-        checkIntra(outBestCU, outTempCU, SIZE_2Nx2N, cu);
+        checkIntra(outBestCU, outTempCU, SIZE_2Nx2N, cu, NULL);
         if (depth == g_maxCUDepth)
-            checkIntra(outBestCU, outTempCU, SIZE_NxN, cu);
+            checkIntra(outBestCU, outTempCU, SIZE_NxN, cu, NULL);
         else
         {
             m_entropyCoder->resetBits();
@@ -533,7 +543,141 @@ void Analysis::compressIntraCU(TComDataC
 #endif
 }
 
-void Analysis::checkIntra(TComDataCU*& outBestCU, TComDataCU*& outTempCU, PartSize partSize, CU *cu)
+void Analysis::compressSharedIntraCTU(TComDataCU*& outBestCU, TComDataCU*& outTempCU, uint32_t depth, TComDataCU* cuPicsym, CU *cu, uint8_t* sharedDepth, char* sharedPartSizes, uint8_t* sharedModes, uint32_t &zOrder)
+{
+    Frame* pic = outBestCU->m_pic;
+
+    // if current depth == shared depth then skip further splitting.
+    bool bSubBranch = true;
+
+    // index to g_depthInc array to increment zOrder offset to next depth
+    int32_t ctuToDepthIndex = g_maxCUDepth - 1;
+
+    if (depth)
+        m_origYuv[0]->copyPartToYuv(m_origYuv[depth], outBestCU->getZorderIdxInCU());
+    else
+        m_origYuv[depth]->copyFromPicYuv(pic->getPicYuvOrg(), outBestCU->getAddr(), outBestCU->getZorderIdxInCU());
+
+    Slice* slice = outTempCU->m_slice;
+    int32_t cu_split_flag = !(cu->flags & CU::LEAF);
+    int32_t cu_unsplit_flag = !(cu->flags & CU::SPLIT_MANDATORY);
+
+    if (cu_unsplit_flag && ((zOrder == outBestCU->getZorderIdxInCU()) && (depth == sharedDepth[zOrder])))
+    {
+        m_quant.setQPforQuant(outTempCU);
+        checkIntra(outBestCU, outTempCU, (PartSize)sharedPartSizes[zOrder], cu, &sharedModes[zOrder]);
+
+        if (!(depth == g_maxCUDepth))
+        {
+            m_entropyCoder->resetBits();
+            m_entropyCoder->codeSplitFlag(outBestCU, 0, depth);
+            outBestCU->m_totalBits += m_entropyCoder->getNumberOfWrittenBits();
+        }
+
+        // set current best CU cost to 0 marking as best CU present in shared CU data
+        outBestCU->m_totalRDCost = 0;
+        bSubBranch = false;
+
+        // increment zOrder offset to point to next best depth in sharedDepth buffer
+        zOrder += g_depthInc[ctuToDepthIndex][sharedDepth[zOrder]];
+    }
+
+    // copy original YUV samples in lossless mode
+    if (outBestCU->isLosslessCoded(0))
+        fillOrigYUVBuffer(outBestCU, m_origYuv[depth]);
+
+    // further split
+    if (cu_split_flag && bSubBranch)
+    {
+        uint32_t    nextDepth     = depth + 1;
+        TComDataCU* subBestPartCU = m_bestCU[nextDepth];
+        TComDataCU* subTempPartCU = m_tempCU[nextDepth];
+        for (uint32_t partUnitIdx = 0; partUnitIdx < 4; partUnitIdx++)
+        {
+            CU *child_cu = cuPicsym->m_CULocalData + cu->childIdx + partUnitIdx;
+
+            if (child_cu->flags & CU::PRESENT)
+            {
+                int32_t qp = outTempCU->getQP(0);
+                subBestPartCU->initSubCU(outTempCU, partUnitIdx, nextDepth, qp); // clear sub partition datas or init.
+                subTempPartCU->initSubCU(outTempCU, partUnitIdx, nextDepth, qp); // clear sub partition datas or init.
+
+                if (partUnitIdx) // initialize RD with previous depth buffer
+                    m_rdEntropyCoders[nextDepth][CI_CURR_BEST].load(m_rdEntropyCoders[nextDepth][CI_NEXT_BEST]);
+                else
+                    m_rdEntropyCoders[nextDepth][CI_CURR_BEST].load(m_rdEntropyCoders[depth][CI_CURR_BEST]);
+
+                // set current best CU cost to 1 marking as non-best CU by default
+                subTempPartCU->m_totalRDCost = 1;
+
+                compressSharedIntraCTU(subBestPartCU, subTempPartCU, nextDepth, cuPicsym, child_cu, sharedDepth, sharedPartSizes, sharedModes, zOrder);
+                outTempCU->copyPartFrom(subBestPartCU, partUnitIdx, nextDepth); // Keep best part data to current temporary data.
+
+                if (!subBestPartCU->m_totalRDCost) // if cost is 0, CU is best CU
+                    outTempCU->m_totalRDCost = 0;  // set outTempCU cost to 0, so later check will use this CU as best CU
+
+                copyYuv2Tmp(subBestPartCU->getTotalNumPart() * partUnitIdx, nextDepth);
+            }
+            else
+            {
+                subBestPartCU->copyToPic(nextDepth);
+                outTempCU->copyPartFrom(subBestPartCU, partUnitIdx, nextDepth);
+
+                // increment zOrder offset to point to next best depth in sharedDepth buffer
+                zOrder += g_depthInc[ctuToDepthIndex][nextDepth];
+            }
+        }
+
+        if (cu->flags & CU::PRESENT)
+        {
+            m_entropyCoder->resetBits();
+            m_entropyCoder->codeSplitFlag(outTempCU, 0, depth);
+            outTempCU->m_totalBits += m_entropyCoder->getNumberOfWrittenBits(); // split bits
+        }
+        if (depth == slice->m_pps->maxCuDQPDepth && slice->m_pps->bUseDQP)
+        {
+            bool hasResidual = false;
+            for (uint32_t blkIdx = 0; blkIdx < outTempCU->getTotalNumPart(); blkIdx++)
+            {
+                if (outTempCU->getCbf(blkIdx, TEXT_LUMA) || outTempCU->getCbf(blkIdx, TEXT_CHROMA_U) ||
+                    outTempCU->getCbf(blkIdx, TEXT_CHROMA_V))
+                {
+                    hasResidual = true;
+                    break;
+                }
+            }
+
+            uint32_t targetPartIdx = 0;
+            if (hasResidual)
+            {
+                bool foundNonZeroCbf = false;
+                outTempCU->setQPSubCUs(outTempCU->getRefQP(targetPartIdx), outTempCU, 0, depth, foundNonZeroCbf);
+                X265_CHECK(foundNonZeroCbf, "expected to find non-zero CBF\n");
+            }
+            else
+                outTempCU->setQPSubParts(outTempCU->getRefQP(targetPartIdx), 0, depth); // set QP to default QP
+        }
+        m_rdEntropyCoders[nextDepth][CI_NEXT_BEST].store(m_rdEntropyCoders[depth][CI_TEMP_BEST]);
+        checkBestMode(outBestCU, outTempCU, depth);
+    }
+    outBestCU->copyToPic(depth);
+    copyYuv2Pic(pic, outBestCU->getAddr(), outBestCU->getZorderIdxInCU(), depth);
+
+#if CHECKED_BUILD || _DEBUG
+    X265_CHECK(outBestCU->getPartitionSize(0) != SIZE_NONE, "no best partition size\n");
+    X265_CHECK(outBestCU->getPredictionMode(0) != MODE_NONE, "no best partition mode\n");
+    if (m_rdCost.m_psyRd)
+    {
+        X265_CHECK(outBestCU->m_totalPsyCost != MAX_INT64, "no best partition cost\n");
+    }
+    else
+    {
+        X265_CHECK(outBestCU->m_totalRDCost != MAX_INT64, "no best partition cost\n");
+    }
+#endif
+}
+
+void Analysis::checkIntra(TComDataCU*& outBestCU, TComDataCU*& outTempCU, PartSize partSize, CU *cu, uint8_t* sharedModes)
 {
     //PPAScopeEvent(CheckRDCostIntra + depth);
     uint32_t depth = g_log2Size[m_param->maxCUSize] - cu->log2CUSize;
@@ -544,7 +688,10 @@ void Analysis::checkIntra(TComDataCU*& o
     uint32_t tuDepthRange[2];
     outTempCU->getQuadtreeTULog2MinSizeInCU(tuDepthRange, 0);
 
-    estIntraPredQT(outTempCU, m_origYuv[depth], m_tmpPredYuv[depth], m_tmpResiYuv[depth], m_tmpRecoYuv[depth], tuDepthRange);
+    if (sharedModes)
+        sharedEstIntraPredQT(outTempCU, m_origYuv[depth], m_tmpPredYuv[depth], m_tmpResiYuv[depth], m_tmpRecoYuv[depth], tuDepthRange, sharedModes);
+    else
+        estIntraPredQT(outTempCU, m_origYuv[depth], m_tmpPredYuv[depth], m_tmpResiYuv[depth], m_tmpRecoYuv[depth], tuDepthRange);
 
     estIntraPredChromaQT(outTempCU, m_origYuv[depth], m_tmpPredYuv[depth], m_tmpResiYuv[depth], m_tmpRecoYuv[depth]);
 
diff -r 1de67321275e -r 199e8f2e0d54 source/encoder/analysis.h
--- a/source/encoder/analysis.h	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/encoder/analysis.h	Tue Sep 16 17:50:06 2014 +0530
@@ -110,7 +110,8 @@ protected:
 
     /* Warning: The interface for these functions will undergo significant changes as a major refactor is under progress */
     void compressIntraCU(TComDataCU*& outBestCU, TComDataCU*& outTempCU, uint32_t depth, TComDataCU* cuPicsym, CU *cu);
-    void checkIntra(TComDataCU*& outBestCU, TComDataCU*& outTempCU, PartSize partSize, CU *cu);
+    void checkIntra(TComDataCU*& outBestCU, TComDataCU*& outTempCU, PartSize partSize, CU *cu, uint8_t* sharedModes);
+    void compressSharedIntraCTU(TComDataCU*& outBestCU, TComDataCU*& outTempCU, uint32_t depth, TComDataCU* cuPicsym, CU *cu, uint8_t* sharedDepth, char* sharedPartSizes, uint8_t* sharedModes, uint32_t &zOrder);
 
     void compressInterCU_rd0_4(TComDataCU*& outBestCU, TComDataCU*& outTempCU, TComDataCU* cu, uint32_t depth, TComDataCU* cuPicsym, CU *cu_t,
                                int bInsidePicture, uint32_t partitionIndex, uint32_t minDepth);
diff -r 1de67321275e -r 199e8f2e0d54 source/encoder/api.cpp
--- a/source/encoder/api.cpp	Mon Sep 15 15:00:13 2014 +0200
+++ b/source/encoder/api.cpp	Tue Sep 16 17:50:06 2014 +0530
@@ -124,6 +124,14 @@ int x265_encoder_encode(x265_encoder *en
     }
     while (numEncoded == 0 && !pic_in && encoder->m_numDelayedPic);
 
+    // do not allow reuse of these buffers for more than one picture. The
+    // encoder now owns these analysisData buffers.
+    if (pic_in)
+    {
+        pic_in->analysisData.intraData = NULL;
+        pic_in->analysisData.interData = NULL;
+    }
+
     if (pp_nal && numEncoded > 0)
     {
         *pp_nal = &encoder->m_nalList.m_nal[0];
diff -r 1de67321275e -r 199e8f2e0d54 source/encoder/encoder.cpp

