[x265-commits] [x265] replace global g_maxCUSize with param->maxCUSize

Fri Jun 23 01:03:03 CEST 2017

details:   http://hg.videolan.org/x265/rev/c1edcf7486a8
branches:  
changeset: 11828:c1edcf7486a8
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Wed Jun 21 11:05:02 2017 +0530
description:
replace global g_maxCUSize with param->maxCUSize
Subject: [x265] add maxLog2CUSize to param and use in place of g_maxLog2CUSize

details:   http://hg.videolan.org/x265/rev/da718982ca7b
branches:  
changeset: 11829:da718982ca7b
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Mon Jun 12 17:21:57 2017 +0530
description:
add maxLog2CUSize to param and use in place of g_maxLog2CUSize
Subject: [x265] add maxCUDepth to param to replace global g_maxCUDepth

details:   http://hg.videolan.org/x265/rev/00e74aa3d57f
branches:  
changeset: 11830:00e74aa3d57f
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Tue May 30 10:45:50 2017 +0530
description:
add maxCUDepth to param to replace global g_maxCUDepth
Subject: [x265] replace g_unitSizeDepth with param member

details:   http://hg.videolan.org/x265/rev/ee4fb3111cf9
branches:  
changeset: 11831:ee4fb3111cf9
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Tue May 30 10:58:25 2017 +0530
description:
replace g_unitSizeDepth with param member
Subject: [x265] use param to replace MACRO NUM_4x4_PARTITIONS

details:   http://hg.videolan.org/x265/rev/ce8c60bf7771
branches:  
changeset: 11832:ce8c60bf7771
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Tue May 30 12:24:38 2017 +0530
description:
use param to replace MACRO NUM_4x4_PARTITIONS
Subject: [x265] replace g_maxSlices with maxSlices of param

details:   http://hg.videolan.org/x265/rev/53e9cd448020
branches:  
changeset: 11833:53e9cd448020
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Mon Jun 12 17:24:55 2017 +0530
description:
replace g_maxSlices with maxSlices of param
Subject: [x265] remove global declarations and initialization function

details:   http://hg.videolan.org/x265/rev/77d58f20a879
branches:  
changeset: 11834:77d58f20a879
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Tue May 30 14:49:37 2017 +0530
description:
remove global declarations and initialization function
Subject: [x265] add param option to specify file read/write of analysis data

details:   http://hg.videolan.org/x265/rev/0d5a7a277aad
branches:  
changeset: 11835:0d5a7a277aad
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Tue Jun 13 14:00:40 2017 +0530
description:
add param option to specify file read/write of analysis data
Subject: [x265] rename options related to analysis-mode, improve docs

details:   http://hg.videolan.org/x265/rev/586d06ad195a
branches:  
changeset: 11836:586d06ad195a
user:      Kavitha Sampath <kavitha at multicorewareinc.com>
date:      Wed Jun 21 10:03:53 2017 +0530
description:
rename options related to analysis-mode, improve docs
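
The first changesets in this series move CTU geometry out of globals (g_maxCUSize, g_maxLog2CUSize, g_unitSizeDepth, and the NUM_4x4_PARTITIONS macro) and into the param structure. A minimal sketch of how those values relate to each other, assuming the smallest unit is 4x4 (log2 size 2), as the comment on the removed macro suggests. The struct and helper names here are hypothetical and are not x265 code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the geometry fields this series adds to x265_param.
struct CtuGeometry
{
    uint32_t maxCUSize;        // CTU width/height in luma samples, e.g. 64
    uint32_t maxLog2CUSize;    // log2(maxCUSize)
    uint32_t unitSizeDepth;    // maxLog2CUSize minus log2 of the 4x4 unit size
    uint32_t num4x4Partitions; // number of 4x4 units in one CTU
};

// Assumption: the smallest unit is 4x4, so its log2 size is 2.
static const uint32_t kLog2UnitSize = 2;

// Derive the dependent fields from the CTU size (a power of two, at least 4).
inline CtuGeometry deriveCtuGeometry(uint32_t maxCUSize)
{
    CtuGeometry g;
    g.maxCUSize = maxCUSize;
    g.maxLog2CUSize = 0;
    while ((1u << g.maxLog2CUSize) < maxCUSize)
        g.maxLog2CUSize++;
    g.unitSizeDepth = g.maxLog2CUSize - kLog2UnitSize;
    // Same expression as the removed NUM_4x4_PARTITIONS macro.
    g.num4x4Partitions = 1u << (g.unitSizeDepth << 1);
    return g;
}
```

For a 64x64 CTU this gives maxLog2CUSize 6, unitSizeDepth 4 and 256 partitions, which matches the `case 6` / `bcast256` pairing visible in CUData::initialize. Keeping the values per-param rather than global means two encoder instances in one process can in principle use different CTU sizes.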

diffstat:

 doc/reST/cli.rst                |   41 ++--
 source/CMakeLists.txt           |    2 +-
 source/common/common.h          |    1 -
 source/common/constants.cpp     |    1 -
 source/common/constants.h       |    2 -
 source/common/cudata.cpp        |   54 +++---
 source/common/cudata.h          |   16 +-
 source/common/frame.cpp         |   23 +-
 source/common/framedata.cpp     |    4 +-
 source/common/param.cpp         |   57 +-----
 source/common/param.h           |    1 -
 source/common/picyuv.cpp        |   40 ++--
 source/common/picyuv.h          |    3 +-
 source/common/slice.cpp         |   12 +-
 source/common/slice.h           |    1 +
 source/encoder/analysis.cpp     |   66 +++---
 source/encoder/api.cpp          |   18 +-
 source/encoder/dpb.cpp          |    6 +-
 source/encoder/encoder.cpp      |  333 +++++++++++++++++++++------------------
 source/encoder/encoder.h        |    2 +-
 source/encoder/entropy.cpp      |   14 +-
 source/encoder/frameencoder.cpp |   38 ++--
 source/encoder/framefilter.cpp  |   26 +-
 source/encoder/framefilter.h    |    2 +-
 source/encoder/motion.cpp       |    3 +-
 source/encoder/motion.h         |    2 +-
 source/encoder/ratecontrol.cpp  |    2 +-
 source/encoder/reference.cpp    |   12 +-
 source/encoder/sao.cpp          |   34 ++--
 source/encoder/search.cpp       |   30 +-
 source/encoder/search.h         |    4 +-
 source/encoder/slicetype.cpp    |    6 +-
 source/x265-extras.cpp          |   24 +-
 source/x265.cpp                 |    4 +-
 source/x265.h                   |   34 +++-
 source/x265cli.h                |   12 +-
 36 files changed, 463 insertions(+), 467 deletions(-)

diffs (truncated from 2849 to 300 lines):

diff -r 80c23559084c -r 586d06ad195a doc/reST/cli.rst
--- a/doc/reST/cli.rst	Wed Jun 21 10:02:45 2017 +0530
+++ b/doc/reST/cli.rst	Wed Jun 21 10:03:53 2017 +0530
@@ -849,33 +849,31 @@ the prediction quad-tree.
 
 Analysis re-use options, to improve performance when encoding the same
 sequence multiple times (presumably at varying bitrates). The encoder
-will not reuse analysis if the resolution and slice type parameters do
-not match.
-
-.. option:: --analysis-mode <string|int>
-
-	Specify whether analysis information of each frame is output by encoder
-	or input for reuse. By reading the analysis data writen by an
-	earlier encode of the same sequence, substantial redundant work may
-	be avoided.
-
-	The following data may be stored and reused:
-	I frames   - split decisions and luma intra directions of all CUs.
-	P/B frames - motion vectors are dumped at each depth for all CUs.
+will not reuse analysis if slice type parameters do not match.
+
+.. option:: --analysis-reuse-mode <string|int>
+
+	This option allows reuse of analysis information from first pass to second pass.
+	:option:`--analysis-reuse-mode save` specifies that encoder outputs analysis information of each frame.
+	:option:`--analysis-reuse-mode load` specifies that encoder reuses analysis information from first pass.
+	There is no benefit using load mode without running encoder in save mode. Analysis data from save mode is
+	written to a file specified by :option:`--analysis-reuse-file`. The amount of analysis data stored/reused
+	is determined by :option:`--analysis-reuse-level`. By reading the analysis data writen by an earlier encode
+	of the same sequence, substantial redundant work may be avoided. Requires cutree, pmode to be off. Default 0.
 
 	**Values:** off(0), save(1): dump analysis data, load(2): read analysis data
 
-.. option:: --analysis-file <filename>
-
-	Specify a filename for analysis data (see :option:`--analysis-mode`)
+.. option:: --analysis-reuse-file <filename>
+
+	Specify a filename for analysis data (see :option:`--analysis-reuse-mode`)
 	If no filename is specified, x265_analysis.dat is used.
 
-.. option:: --refine-level <1..10>
-
-	Amount of information stored/reused in :option:`--analysis-mode` is distributed across levels.
+.. option:: --analysis-reuse-level <1..10>
+
+	Amount of information stored/reused in :option:`--analysis-reuse-mode` is distributed across levels.
 	Higher the value, higher the information stored/reused, faster the encode. Default 5.
 
-	Note that --refine-level must be paired with analysis-mode.
+	Note that --analysis-reuse-level must be paired with analysis-reuse-mode.
 
 	+--------+-----------------------------------------+
 	| Level  | Description                             |
@@ -888,10 +886,11 @@ not match.
 	+--------+-----------------------------------------+
 	| 10     | Level 5 + Full CU analysis-info         |
 	+--------+-----------------------------------------+
+
 .. option:: --scale-factor
 
        Factor by which input video is scaled down for analysis save mode.
-       This option should be coupled with analysis-mode option, --refine-level 10.
+       This option should be coupled with analysis-reuse-mode option, --analysis-reuse-level 10.
        The ctu size of load should be double the size of save. Default 0.
 
 .. option:: --refine-intra
diff -r 80c23559084c -r 586d06ad195a source/CMakeLists.txt
--- a/source/CMakeLists.txt	Wed Jun 21 10:02:45 2017 +0530
+++ b/source/CMakeLists.txt	Wed Jun 21 10:03:53 2017 +0530
@@ -29,7 +29,7 @@ option(NATIVE_BUILD "Target the build CP
 option(STATIC_LINK_CRT "Statically link C runtime for release builds" OFF)
 mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 122)
+set(X265_BUILD 128)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
diff -r 80c23559084c -r 586d06ad195a source/common/common.h
--- a/source/common/common.h	Wed Jun 21 10:02:45 2017 +0530
+++ b/source/common/common.h	Wed Jun 21 10:03:53 2017 +0530
@@ -259,7 +259,6 @@ typedef int16_t  coeff_t;      // transf
 #define LOG2_RASTER_SIZE        (MAX_LOG2_CU_SIZE - LOG2_UNIT_SIZE)
 #define RASTER_SIZE             (1 << LOG2_RASTER_SIZE)
 #define MAX_NUM_PARTITIONS      (RASTER_SIZE * RASTER_SIZE)
-#define NUM_4x4_PARTITIONS      (1U << (g_unitSizeDepth << 1)) // number of 4x4 units in max CU size
 
 #define MIN_PU_SIZE             4
 #define MIN_TU_SIZE             4
diff -r 80c23559084c -r 586d06ad195a source/common/constants.cpp
--- a/source/common/constants.cpp	Wed Jun 21 10:02:45 2017 +0530
+++ b/source/common/constants.cpp	Wed Jun 21 10:03:53 2017 +0530
@@ -161,7 +161,6 @@ const uint16_t x265_chroma_lambda2_offse
     65535
 };
 
-int      g_ctuSizeConfigured = 0;
 uint32_t g_maxLog2CUSize = MAX_LOG2_CU_SIZE;
 uint32_t g_maxCUSize     = MAX_CU_SIZE;
 uint32_t g_unitSizeDepth = NUM_CU_DEPTH;
diff -r 80c23559084c -r 586d06ad195a source/common/constants.h
--- a/source/common/constants.h	Wed Jun 21 10:02:45 2017 +0530
+++ b/source/common/constants.h	Wed Jun 21 10:03:53 2017 +0530
@@ -30,8 +30,6 @@
 namespace X265_NS {
 // private namespace
 
-extern int g_ctuSizeConfigured;
-
 extern double x265_lambda_tab[QP_MAX_MAX + 1];
 extern double x265_lambda2_tab[QP_MAX_MAX + 1];
 extern const uint16_t x265_chroma_lambda2_offset_tab[MAX_CHROMA_LAMBDA_OFFSET + 1];
diff -r 80c23559084c -r 586d06ad195a source/common/cudata.cpp
--- a/source/common/cudata.cpp	Wed Jun 21 10:02:45 2017 +0530
+++ b/source/common/cudata.cpp	Wed Jun 21 10:03:53 2017 +0530
@@ -111,25 +111,23 @@ inline MV scaleMv(MV mv, int scale)
 
 }
 
-cubcast_t CUData::s_partSet[NUM_FULL_DEPTH] = { NULL, NULL, NULL, NULL, NULL };
-uint32_t CUData::s_numPartInCUSize;
-
 CUData::CUData()
 {
     memset(this, 0, sizeof(*this));
 }
 
-void CUData::initialize(const CUDataMemPool& dataPool, uint32_t depth, int csp, int instance)
+void CUData::initialize(const CUDataMemPool& dataPool, uint32_t depth, const x265_param& param, int instance)
 {
+    int csp = param.internalCsp;
     m_chromaFormat  = csp;
     m_hChromaShift  = CHROMA_H_SHIFT(csp);
     m_vChromaShift  = CHROMA_V_SHIFT(csp);
-    m_numPartitions = NUM_4x4_PARTITIONS >> (depth * 2);
+    m_numPartitions = param.num4x4Partitions >> (depth * 2);
 
     if (!s_partSet[0])
     {
-        s_numPartInCUSize = 1 << g_unitSizeDepth;
-        switch (g_maxLog2CUSize)
+        s_numPartInCUSize = 1 << param.unitSizeDepth;
+        switch (param.maxLog2CUSize)
         {
         case 6:
             s_partSet[0] = bcast256;
@@ -221,7 +219,7 @@ void CUData::initialize(const CUDataMemP
 
         m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
 
-        uint32_t cuSize = g_maxCUSize >> depth;
+        uint32_t cuSize = param.maxCUSize >> depth;
         m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (cuSize * cuSize);
         m_trCoeff[1] = m_trCoeff[2] = 0;
         m_transformSkip[1] = m_transformSkip[2] = m_cbf[1] = m_cbf[2] = 0;
@@ -263,7 +261,7 @@ void CUData::initialize(const CUDataMemP
 
         m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
 
-        uint32_t cuSize = g_maxCUSize >> depth;
+        uint32_t cuSize = param.maxCUSize >> depth;
         uint32_t sizeL = cuSize * cuSize;
         uint32_t sizeC = sizeL >> (m_hChromaShift + m_vChromaShift); // block chroma part
         m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (sizeL + sizeC * 2);
@@ -279,17 +277,17 @@ void CUData::initCTU(const Frame& frame,
     m_encData       = frame.m_encData;
     m_slice         = m_encData->m_slice;
     m_cuAddr        = cuAddr;
-    m_cuPelX        = (cuAddr % m_slice->m_sps->numCuInWidth) << g_maxLog2CUSize;
-    m_cuPelY        = (cuAddr / m_slice->m_sps->numCuInWidth) << g_maxLog2CUSize;
+    m_cuPelX        = (cuAddr % m_slice->m_sps->numCuInWidth) << m_slice->m_param->maxLog2CUSize;
+    m_cuPelY        = (cuAddr / m_slice->m_sps->numCuInWidth) << m_slice->m_param->maxLog2CUSize;
     m_absIdxInCTU   = 0;
-    m_numPartitions = NUM_4x4_PARTITIONS;
+    m_numPartitions = m_encData->m_param->num4x4Partitions;
     m_bFirstRowInSlice = (uint8_t)firstRowInSlice;
     m_bLastRowInSlice  = (uint8_t)lastRowInSlice;
     m_bLastCuInSlice   = (uint8_t)lastCuInSlice;
 
     /* sequential memsets */
     m_partSet((uint8_t*)m_qp, (uint8_t)qp);
-    m_partSet(m_log2CUSize,   (uint8_t)g_maxLog2CUSize);
+    m_partSet(m_log2CUSize,   (uint8_t)m_slice->m_param->maxLog2CUSize);
     m_partSet(m_lumaIntraDir, (uint8_t)ALL_IDX);
     m_partSet(m_chromaIntraDir, (uint8_t)ALL_IDX);
     m_partSet(m_tqBypass,     (uint8_t)frame.m_encData->m_param->bLossless);
@@ -391,7 +389,7 @@ void CUData::copyPartFrom(const CUData& 
 
     memcpy(m_distortion + offset, subCU.m_distortion, childGeom.numPartitions * sizeof(sse_t));
 
-    uint32_t tmp = 1 << ((g_maxLog2CUSize - childGeom.depth) * 2);
+    uint32_t tmp = 1 << ((m_slice->m_param->maxLog2CUSize - childGeom.depth) * 2);
     uint32_t tmp2 = subPartIdx * tmp;
     memcpy(m_trCoeff[0] + tmp2, subCU.m_trCoeff[0], sizeof(coeff_t)* tmp);
 
@@ -490,7 +488,7 @@ void CUData::copyToPic(uint32_t depth) c
 
     memcpy(ctu.m_distortion + m_absIdxInCTU, m_distortion, m_numPartitions * sizeof(sse_t));
 
-    uint32_t tmpY = 1 << ((g_maxLog2CUSize - depth) * 2);
+    uint32_t tmpY = 1 << ((m_slice->m_param->maxLog2CUSize - depth) * 2);
     uint32_t tmpY2 = m_absIdxInCTU << (LOG2_UNIT_SIZE * 2);
     memcpy(ctu.m_trCoeff[0] + tmpY2, m_trCoeff[0], sizeof(coeff_t)* tmpY);
 
@@ -569,7 +567,7 @@ void CUData::updatePic(uint32_t depth, i
     m_partCopy(ctu.m_tuDepth + m_absIdxInCTU, m_tuDepth);
     m_partCopy(ctu.m_cbf[0] + m_absIdxInCTU, m_cbf[0]);
 
-    uint32_t tmpY = 1 << ((g_maxLog2CUSize - depth) * 2);
+    uint32_t tmpY = 1 << ((m_slice->m_param->maxLog2CUSize - depth) * 2);
     uint32_t tmpY2 = m_absIdxInCTU << (LOG2_UNIT_SIZE * 2);
     memcpy(ctu.m_trCoeff[0] + tmpY2, m_trCoeff[0], sizeof(coeff_t)* tmpY);
 
@@ -657,7 +655,7 @@ const CUData* CUData::getPUAboveLeft(uin
         return m_cuLeft;
     }
 
-    alPartUnitIdx = NUM_4x4_PARTITIONS - 1;
+    alPartUnitIdx = m_encData->m_param->num4x4Partitions - 1;
     return m_cuAboveLeft;
 }
 
@@ -800,7 +798,7 @@ const CUData* CUData::getPUAboveRightAdi
 /* Get left QpMinCu */
 const CUData* CUData::getQpMinCuLeft(uint32_t& lPartUnitIdx, uint32_t curAbsIdxInCTU) const
 {
-    uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
+    uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
     uint32_t absRorderQpMinCUIdx = g_zscanToRaster[absZorderQpMinCUIdx];
 
     // check for left CTU boundary
@@ -817,7 +815,7 @@ const CUData* CUData::getQpMinCuLeft(uin
 /* Get above QpMinCu */
 const CUData* CUData::getQpMinCuAbove(uint32_t& aPartUnitIdx, uint32_t curAbsIdxInCTU) const
 {
-    uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
+    uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
     uint32_t absRorderQpMinCUIdx = g_zscanToRaster[absZorderQpMinCUIdx];
 
     // check for top CTU boundary
@@ -856,7 +854,7 @@ int CUData::getLastValidPartIdx(int absP
 
 int8_t CUData::getLastCodedQP(uint32_t absPartIdx) const
 {
-    uint32_t quPartIdxMask = 0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2;
+    uint32_t quPartIdxMask = 0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2;
     int lastValidPartIdx = getLastValidPartIdx(absPartIdx & quPartIdxMask);
 
     if (lastValidPartIdx >= 0)
@@ -866,7 +864,7 @@ int8_t CUData::getLastCodedQP(uint32_t a
         if (m_absIdxInCTU)
             return m_encData->getPicCTU(m_cuAddr)->getLastCodedQP(m_absIdxInCTU);
         else if (m_cuAddr > 0 && !(m_slice->m_pps->bEntropyCodingSyncEnabled && !(m_cuAddr % m_slice->m_sps->numCuInWidth)))
-            return m_encData->getPicCTU(m_cuAddr - 1)->getLastCodedQP(NUM_4x4_PARTITIONS);
+            return m_encData->getPicCTU(m_cuAddr - 1)->getLastCodedQP(m_encData->m_param->num4x4Partitions);
         else
             return (int8_t)m_slice->m_sliceQp;
     }
@@ -998,7 +996,7 @@ uint32_t CUData::getCtxSkipFlag(uint32_t
 
 bool CUData::setQPSubCUs(int8_t qp, uint32_t absPartIdx, uint32_t depth)
 {
-    uint32_t curPartNumb = NUM_4x4_PARTITIONS >> (depth << 1);
+    uint32_t curPartNumb = m_encData->m_param->num4x4Partitions >> (depth << 1);
     uint32_t curPartNumQ = curPartNumb >> 2;
 
     if (m_cuDepth[absPartIdx] > depth)
@@ -1624,7 +1622,7 @@ uint32_t CUData::getInterMergeCandidates
                 dir |= (1 << list);
                 candMvField[count][list].mv = colmv;
                 candMvField[count][list].refIdx = refIdx;
-                if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisMode == X265_ANALYSIS_SAVE && m_log2CUSize[0] < 4)
+                if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisReuseMode == X265_ANALYSIS_SAVE && m_log2CUSize[0] < 4)
                 {
                     MV dist(MAX_MV, MAX_MV);
                     candMvField[count][list].mv = dist;
@@ -1789,7 +1787,7 @@ int CUData::getPMV(InterNeighbourMV *nei
             int curRefPOC = m_slice->m_refPOCList[picList][refIdx];
             int curPOC = m_slice->m_poc;
 
-            if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisMode == X265_ANALYSIS_SAVE && (m_log2CUSize[0] < 4))
+            if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisReuseMode == X265_ANALYSIS_SAVE && (m_log2CUSize[0] < 4))
             {
                 MV dist(MAX_MV, MAX_MV);
                 pmv[numMvc++] = amvpCand[num++] = dist;
@@ -1917,10 +1915,10 @@ void CUData::clipMv(MV& outMV) const
     uint32_t offset = 8;
 
     int16_t xmax = (int16_t)((m_slice->m_sps->picWidthInLumaSamples + offset - m_cuPelX - 1) << mvshift);
-    int16_t xmin = -(int16_t)((g_maxCUSize + offset + m_cuPelX - 1) << mvshift);
+    int16_t xmin = -(int16_t)((m_encData->m_param->maxCUSize + offset + m_cuPelX - 1) << mvshift);
 
     int16_t ymax = (int16_t)((m_slice->m_sps->picHeightInLumaSamples + offset - m_cuPelY - 1) << mvshift);
-    int16_t ymin = -(int16_t)((g_maxCUSize + offset + m_cuPelY - 1) << mvshift);
+    int16_t ymin = -(int16_t)((m_encData->m_param->maxCUSize + offset + m_cuPelY - 1) << mvshift);
 
     outMV.x = X265_MIN(xmax, X265_MAX(xmin, outMV.x));
     outMV.y = X265_MIN(ymax, X265_MAX(ymin, outMV.y));
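
The cudata.cpp hunks above replace the globals in two recurring computations: the CTU's top-left pixel position from its raster address, and the number of 4x4 partitions at a given quad-tree depth. A small sketch of both, under the assumption that the relevant param fields are passed in explicitly. GeometryParam, PelPos and the two helpers are hypothetical names for illustration, not part of x265:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the per-encoder parameters used below.
struct GeometryParam
{
    uint32_t maxLog2CUSize;    // e.g. 6 for a 64x64 CTU
    uint32_t num4x4Partitions; // e.g. 256 for a 64x64 CTU
};

struct PelPos
{
    uint32_t x;
    uint32_t y;
};

// Top-left luma sample of a CTU, given its raster-order address
// and the number of CTUs per picture row.
inline PelPos ctuPelPos(uint32_t cuAddr, uint32_t numCuInWidth, const GeometryParam& p)
{
    PelPos pos;
    pos.x = (cuAddr % numCuInWidth) << p.maxLog2CUSize;
    pos.y = (cuAddr / numCuInWidth) << p.maxLog2CUSize;
    return pos;
}

// Number of 4x4 partitions covered by one CU at the given depth:
// each quad-tree split divides the count by four.
inline uint32_t partitionsAtDepth(uint32_t depth, const GeometryParam& p)
{
    return p.num4x4Partitions >> (depth * 2);
}
```

With 30 CTUs per row (a 1920-wide picture and 64x64 CTUs), address 31 is the second CTU of the second row, so it maps to pixel (64, 64).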