[x265-commits] [x265] predict: whitespace nits
Deepthi Nandakumar
deepthi at multicorewareinc.com
Sun Aug 3 19:13:23 CEST 2014
details: http://hg.videolan.org/x265/rev/3db5fda6abf0
branches:
changeset: 7665:3db5fda6abf0
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Fri Aug 01 16:31:20 2014 +0530
description:
predict: whitespace nits
Subject: [x265] cleanup: move m_predYuv and m_predTempYuv from predict to TEncSearch
details: http://hg.videolan.org/x265/rev/a74b24444ae8
branches:
changeset: 7666:a74b24444ae8
user: Santhoshini Sekar <santhoshini at multicorewareinc.com>
date: Fri Aug 01 15:04:36 2014 +0530
description:
cleanup: move m_predYuv and m_predTempYuv from predict to TEncSearch
Subject: [x265] rc: enable abr reset in the first pass of two pass encode.
details: http://hg.videolan.org/x265/rev/a9a7f0933ecc
branches:
changeset: 7667:a9a7f0933ecc
user: Aarthi Thirumalai
date: Fri Aug 01 18:45:57 2014 +0530
description:
rc: enable abr reset in the first pass of two pass encode.
This was observed to improve second-pass results with the ultrafast preset for some videos.
Subject: [x265] dpb: remove redundant call to getNalUnitType(), output will not change
details: http://hg.videolan.org/x265/rev/fb24f965eade
branches:
changeset: 7668:fb24f965eade
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 12:12:43 2014 -0500
description:
dpb: remove redundant call to getNalUnitType(), output will not change
Subject: [x265] dpb: getNalUnitType() cannot return NAL_UNIT_CODED_SLICE_IDR_N_LP
details: http://hg.videolan.org/x265/rev/b911b02737c8
branches:
changeset: 7669:b911b02737c8
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 12:13:25 2014 -0500
description:
dpb: getNalUnitType() cannot return NAL_UNIT_CODED_SLICE_IDR_N_LP
Subject: [x265] dpb: style nits
details: http://hg.videolan.org/x265/rev/5d1bd6097113
branches:
changeset: 7670:5d1bd6097113
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 12:17:11 2014 -0500
description:
dpb: style nits
Subject: [x265] dpb: remove checks for slice types we do not emit
details: http://hg.videolan.org/x265/rev/963b8e7b1dff
branches:
changeset: 7671:963b8e7b1dff
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 12:26:34 2014 -0500
description:
dpb: remove checks for slice types we do not emit
Subject: [x265] dpb: cleanup decodingRefreshMarking()
details: http://hg.videolan.org/x265/rev/6b1753638790
branches:
changeset: 7672:6b1753638790
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 12:28:59 2014 -0500
description:
dpb: cleanup decodingRefreshMarking()
Subject: [x265] quant: apply scale factor in just one place
details: http://hg.videolan.org/x265/rev/2a7315a37d67
branches:
changeset: 7673:2a7315a37d67
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 13:13:15 2014 -0500
description:
quant: apply scale factor in just one place
Subject: [x265] quant: delay err3, err4 calculation until/if necessary
details: http://hg.videolan.org/x265/rev/244ba5fa80d4
branches:
changeset: 7674:244ba5fa80d4
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 13:15:30 2014 -0500
description:
quant: delay err3, err4 calculation until/if necessary
Subject: [x265] quant: hoist some calculations out of the loop
details: http://hg.videolan.org/x265/rev/32b4aa0eb4fb
branches:
changeset: 7675:32b4aa0eb4fb
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 13:19:18 2014 -0500
description:
quant: hoist some calculations out of the loop
Subject: [x265] quant: simplify minAbsLevel
details: http://hg.videolan.org/x265/rev/db62272d284c
branches:
changeset: 7676:db62272d284c
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 13:28:34 2014 -0500
description:
quant: simplify minAbsLevel
Subject: [x265] quant: convert getCodedLevel() into a macro, remove m_transformShift hack
details: http://hg.videolan.org/x265/rev/ae8c153ee91d
branches:
changeset: 7677:ae8c153ee91d
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 14:22:16 2014 -0500
description:
quant: convert getCodedLevel() into a macro, remove m_transformShift hack
Subject: [x265] quant: m_lambda2 no longer needs to be a member variable
details: http://hg.videolan.org/x265/rev/287d37822825
branches:
changeset: 7678:287d37822825
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 15:36:34 2014 -0500
description:
quant: m_lambda2 no longer needs to be a member variable
it is only used in rdoQuant() and can be declared on the stack
Subject: [x265] quant: make IEP_RATE an anonymous enum, it doesn't need storage
details: http://hg.videolan.org/x265/rev/be69e059808a
branches:
changeset: 7679:be69e059808a
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 15:37:06 2014 -0500
description:
quant: make IEP_RATE an anonymous enum, it doesn't need storage
Subject: [x265] quant: support scaling lists in psy-rdoq
details: http://hg.videolan.org/x265/rev/8767ddb686af
branches:
changeset: 7680:8767ddb686af
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 15:44:35 2014 -0500
description:
quant: support scaling lists in psy-rdoq
Subject: [x265] quant: rename costCoeff0 to costUncoded, add docs
details: http://hg.videolan.org/x265/rev/1c9a6a976e5d
branches:
changeset: 7681:1c9a6a976e5d
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 17:00:43 2014 -0500
description:
quant: rename costCoeff0 to costUncoded, add docs
Subject: [x265] quant: clarify last-nz optimization loop
details: http://hg.videolan.org/x265/rev/11a3a69d3e29
branches:
changeset: 7682:11a3a69d3e29
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 17:37:44 2014 -0500
description:
quant: clarify last-nz optimization loop
Subject: [x265] quant: correct rounding factor for unquant
details: http://hg.videolan.org/x265/rev/253ad3eafaa2
branches:
changeset: 7683:253ad3eafaa2
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 18:07:21 2014 -0500
description:
quant: correct rounding factor for unquant
Subject: [x265] quant: blockUncodedCost -> totalUncodedCost, improve comments
details: http://hg.videolan.org/x265/rev/3b8853b12d9c
branches:
changeset: 7684:3b8853b12d9c
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 18:08:04 2014 -0500
description:
quant: blockUncodedCost -> totalUncodedCost, improve comments
Subject: [x265] quant: remove redundant level initialization
details: http://hg.videolan.org/x265/rev/d341acd13af2
branches:
changeset: 7685:d341acd13af2
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 18:09:07 2014 -0500
description:
quant: remove redundant level initialization
Subject: [x265] quant: improve comments for trailing zero coeff
details: http://hg.videolan.org/x265/rev/f14d233107d4
branches:
changeset: 7686:f14d233107d4
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 18:13:40 2014 -0500
description:
quant: improve comments for trailing zero coeff
Subject: [x265] quant: more readability nits - no output changes
details: http://hg.videolan.org/x265/rev/ed49f875ab20
branches:
changeset: 7687:ed49f875ab20
user: Steve Borho <steve at borho.org>
date: Fri Aug 01 18:28:08 2014 -0500
description:
quant: more readability nits - no output changes
Subject: [x265] quant: re-order rdoq logic so only one RDO_CODED_LEVEL() call is required
details: http://hg.videolan.org/x265/rev/30f1f1d739db
branches:
changeset: 7688:30f1f1d739db
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 08:58:46 2014 -0500
description:
quant: re-order rdoq logic so only one RDO_CODED_LEVEL() call is required
Subject: [x265] quant: RDO_CODED_LEVEL macro can now be inlined for easier debugging
details: http://hg.videolan.org/x265/rev/9bb93a267300
branches:
changeset: 7689:9bb93a267300
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 09:09:14 2014 -0500
description:
quant: RDO_CODED_LEVEL macro can now be inlined for easier debugging
Subject: [x265] quant: rename sigCost to codedSigBits, comment nit
details: http://hg.videolan.org/x265/rev/a28d5ae1b52a
branches:
changeset: 7690:a28d5ae1b52a
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 09:09:22 2014 -0500
description:
quant: rename sigCost to codedSigBits, comment nit
Subject: [x265] quant: levelDouble -> levelScaled
details: http://hg.videolan.org/x265/rev/28c35f8e4f43
branches:
changeset: 7691:28c35f8e4f43
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 09:17:50 2014 -0500
description:
quant: levelDouble -> levelScaled
This always confused the heck out of me. The level was not doubled, it was not
a double, and it wasn't squared. It was just the level scaled by the quant
scale factor.
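
To make the naming concrete, here is a minimal sketch of the arithmetic implied by the comments in quant.cpp, where levelScaled is abs(coef) * quantCoef and the error scale is (1 << scaleBits) / (quantCoef * quantCoef). The helper name uncodedCost and the sample values are hypothetical and not part of x265. In the encoder the scale factor is read from a per-position errScale table rather than computed inline.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not an x265 function): the cost of leaving one
// coefficient uncoded, following the comments in rdoQuant().
//   coef      : transform coefficient
//   quantCoef : quantization scale applied to that coefficient
//   scaleBits : fixed-point precision carried by the error scale
double uncodedCost(int coef, int quantCoef, int scaleBits)
{
    // The level scaled by the quant scale factor: not doubled, not a double.
    int levelScaled = std::abs(coef) * quantCoef;

    // Error scale: divides quantCoef back out and adds scaleBits of precision.
    double scaleFactor = (double)(1 << scaleBits) / ((double)quantCoef * quantCoef);

    // Squared scaled level times the error scale; widen before squaring.
    return (double)((uint64_t)levelScaled * levelScaled) * scaleFactor;
}
```

With the assumed values coef = 3, quantCoef = 16384 and scaleBits = 5, the quantCoef terms cancel and the result is 3 * 3 * (1 << 5) = 288, which is the reduction described in the comment that this series removes.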
Subject: [x265] quant: consistent comment style, improve comments
details: http://hg.videolan.org/x265/rev/b12ac8919761
branches:
changeset: 7692:b12ac8919761
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 10:14:59 2014 -0500
description:
quant: consistent comment style, improve comments
Subject: [x265] quant: change lastCG into a bool, use isOne flag to avoid abs() calls
details: http://hg.videolan.org/x265/rev/69beab744475
branches:
changeset: 7693:69beab744475
user: Steve Borho <steve at borho.org>
date: Sat Aug 02 10:41:30 2014 -0500
description:
quant: change lastCG into a bool, use isOne flag to avoid abs() calls
Subject: [x265] update header and support Intel IACA marker
details: http://hg.videolan.org/x265/rev/e6184896aa7b
branches:
changeset: 7694:e6184896aa7b
user: Min Chen <chenm003 at 163.com>
date: Fri Aug 01 17:56:11 2014 -0700
description:
update header and support Intel IACA marker
Subject: [x265] asm: cvt16to32_cnt[4x4] for TSkip
details: http://hg.videolan.org/x265/rev/6f502ab94357
branches:
changeset: 7695:6f502ab94357
user: Min Chen <chenm003 at 163.com>
date: Fri Aug 01 17:56:27 2014 -0700
description:
asm: cvt16to32_cnt[4x4] for TSkip
Subject: [x265] asm: cvt16to32_cnt[8x8] for TSkip
details: http://hg.videolan.org/x265/rev/49bab9bdf2a3
branches:
changeset: 7696:49bab9bdf2a3
user: Min Chen <chenm003 at 163.com>
date: Fri Aug 01 17:56:37 2014 -0700
description:
asm: cvt16to32_cnt[8x8] for TSkip
diffstat:
source/Lib/TLibEncoder/TEncSearch.cpp | 13 +-
source/Lib/TLibEncoder/TEncSearch.h | 1 +
source/common/dct.cpp | 21 +
source/common/primitives.h | 2 +
source/common/quant.cpp | 368 +++++++++++++++------------------
source/common/quant.h | 13 +-
source/common/slice.h | 7 +-
source/common/x86/asm-primitives.cpp | 6 +
source/common/x86/blockcopy8.asm | 232 +++++++++++++++++++++-
source/common/x86/blockcopy8.h | 8 +
source/common/x86/const-a.asm | 1 +
source/common/x86/x86inc.asm | 12 +
source/encoder/dpb.cpp | 54 +---
source/encoder/predict.cpp | 6 -
source/encoder/predict.h | 8 +-
source/encoder/ratecontrol.cpp | 4 +-
source/test/pixelharness.cpp | 44 ++++
source/test/pixelharness.h | 1 +
18 files changed, 527 insertions(+), 274 deletions(-)
diffs (truncated from 1450 to 300 lines):
diff -r e85b0aaa64e4 -r 49bab9bdf2a3 source/Lib/TLibEncoder/TEncSearch.cpp
--- a/source/Lib/TLibEncoder/TEncSearch.cpp Thu Jul 31 11:08:02 2014 +0530
+++ b/source/Lib/TLibEncoder/TEncSearch.cpp Fri Aug 01 17:56:37 2014 -0700
@@ -77,6 +77,7 @@ TEncSearch::~TEncSearch()
X265_FREE(m_qtTempTrIdx);
X265_FREE(m_qtTempCbf[0]);
X265_FREE(m_qtTempTransformSkipFlag[0]);
+ m_predTempYuv.destroy();
delete[] m_qtTempShortYuv;
}
@@ -92,6 +93,7 @@ bool TEncSearch::initSearch(Encoder& top
m_numLayers = top.m_quadtreeTULog2MaxSize - 2 + 1;
initTempBuff(m_param->internalCsp);
+ ok &= m_predTempYuv.create(MAX_CU_SIZE, MAX_CU_SIZE, m_param->internalCsp);
m_me.setSearchMethod(m_param->searchMethod);
m_me.setSubpelRefine(m_param->subpelRefine);
@@ -107,7 +109,7 @@ bool TEncSearch::initSearch(Encoder& top
m_qtTempCoeff[0][i] = X265_MALLOC(coeff_t, sizeL + sizeC * 2);
m_qtTempCoeff[1][i] = m_qtTempCoeff[0][i] + sizeL;
m_qtTempCoeff[2][i] = m_qtTempCoeff[0][i] + sizeL + sizeC;
- m_qtTempShortYuv[i].create(MAX_CU_SIZE, MAX_CU_SIZE, m_param->internalCsp);
+ ok &= m_qtTempShortYuv[i].create(MAX_CU_SIZE, MAX_CU_SIZE, m_param->internalCsp);
}
const uint32_t numPartitions = 1 << (g_maxCUDepth << 1);
@@ -1894,6 +1896,7 @@ bool TEncSearch::predInterSearch(TComDat
int numPredDir = cu->m_slice->isInterP() ? 1 : 2;
uint32_t lastMode = 0;
int totalmebits = 0;
+ TComYuv m_predYuv[2];
const int* numRefIdx = cu->m_slice->m_numRefIdx;
@@ -1901,6 +1904,9 @@ bool TEncSearch::predInterSearch(TComDat
memset(&merge, 0, sizeof(merge));
+ m_predYuv[0].create(MAX_CU_SIZE, MAX_CU_SIZE, m_param->internalCsp);
+ m_predYuv[1].create(MAX_CU_SIZE, MAX_CU_SIZE, m_param->internalCsp);
+
for (int partIdx = 0; partIdx < numPart; partIdx++)
{
uint32_t partAddr;
@@ -1936,7 +1942,7 @@ bool TEncSearch::predInterSearch(TComDat
cu->getCUMvField(REF_PIC_LIST_1)->setAllMvField(merge.mvField[1], partSize, partAddr, 0, partIdx);
totalmebits += merge.bits;
- prepMotionCompensation(cu, partIdx);
+ prepMotionCompensation(cu, partIdx);
motionCompensation(cu, predYuv, REF_PIC_LIST_X, true, bChroma);
continue;
}
@@ -2159,6 +2165,9 @@ bool TEncSearch::predInterSearch(TComDat
motionCompensation(cu, predYuv, REF_PIC_LIST_X, true, bChroma);
}
+ m_predYuv[0].destroy();
+ m_predYuv[1].destroy();
+
x265_emms();
cu->m_totalBits = totalmebits;
return true;
diff -r e85b0aaa64e4 -r 49bab9bdf2a3 source/Lib/TLibEncoder/TEncSearch.h
--- a/source/Lib/TLibEncoder/TEncSearch.h Thu Jul 31 11:08:02 2014 +0530
+++ b/source/Lib/TLibEncoder/TEncSearch.h Fri Aug 01 17:56:37 2014 -0700
@@ -106,6 +106,7 @@ public:
MotionReference (*m_mref)[MAX_NUM_REF + 1];
ShortYuv* m_qtTempShortYuv;
+ TComYuv m_predTempYuv;
coeff_t* m_qtTempCoeff[3][NUM_LAYERS];
uint8_t* m_qtTempTrIdx;
diff -r e85b0aaa64e4 -r 49bab9bdf2a3 source/common/dct.cpp
--- a/source/common/dct.cpp Thu Jul 31 11:08:02 2014 +0530
+++ b/source/common/dct.cpp Fri Aug 01 17:56:37 2014 -0700
@@ -830,6 +830,22 @@ int count_nonzero_c(const int32_t *quan
return count;
}
+
+template<int trSize>
+uint32_t conv16to32_count(coeff_t* coeff, int16_t* residual, intptr_t stride)
+{
+ uint32_t numSig = 0;
+ for (int k = 0; k < trSize; k++)
+ {
+ for (int j = 0; j < trSize; j++)
+ {
+ coeff[k * trSize + j] = ((int16_t)residual[k * stride + j]);
+ numSig += (residual[k * stride + j] != 0);
+ }
+ }
+
+ return numSig;
+}
} // closing - anonymous file-static namespace
namespace x265 {
@@ -852,5 +868,10 @@ void Setup_C_DCTPrimitives(EncoderPrimit
p.idct[IDCT_16x16] = idct16_c;
p.idct[IDCT_32x32] = idct32_c;
p.count_nonzero = count_nonzero_c;
+
+ p.cvt16to32_cnt[BLOCK_4x4] = conv16to32_count<4>;
+ p.cvt16to32_cnt[BLOCK_8x8] = conv16to32_count<8>;
+ p.cvt16to32_cnt[BLOCK_16x16] = conv16to32_count<16>;
+ p.cvt16to32_cnt[BLOCK_32x32] = conv16to32_count<32>;
}
}
diff -r e85b0aaa64e4 -r 49bab9bdf2a3 source/common/primitives.h
--- a/source/common/primitives.h Thu Jul 31 11:08:02 2014 +0530
+++ b/source/common/primitives.h Fri Aug 01 17:56:37 2014 -0700
@@ -150,6 +150,7 @@ typedef void (*intra_allangs_t)(pixel *d
typedef void (*cvt16to32_shl_t)(int32_t *dst, int16_t *src, intptr_t, int, int);
typedef void (*cvt32to16_shr_t)(int16_t *dst, int32_t *src, intptr_t, int, int);
+typedef uint32_t (*cvt16to32_cnt_t)(coeff_t* coeff, int16_t* residual, intptr_t stride);
typedef void (*dct_t)(int16_t *src, int32_t *dst, intptr_t stride);
typedef void (*idct_t)(int32_t *src, int16_t *dst, intptr_t stride);
@@ -218,6 +219,7 @@ struct EncoderPrimitives
blockcpy_ps_t blockcpy_ps; // block copy pixel from short
cvt16to32_shl_t cvt16to32_shl;
cvt32to16_shr_t cvt32to16_shr;
+ cvt16to32_cnt_t cvt16to32_cnt[NUM_SQUARE_BLOCKS - 1];
copy_pp_t luma_copy_pp[NUM_LUMA_PARTITIONS];
copy_sp_t luma_copy_sp[NUM_LUMA_PARTITIONS];
diff -r e85b0aaa64e4 -r 49bab9bdf2a3 source/common/quant.cpp
--- a/source/common/quant.cpp Thu Jul 31 11:08:02 2014 +0530
+++ b/source/common/quant.cpp Fri Aug 01 17:56:37 2014 -0700
@@ -219,8 +219,7 @@ void Quant::setQPforQuant(int qpy, TextT
uint32_t Quant::signBitHidingHDQ(coeff_t* qCoef, coeff_t* coef, int32_t* deltaU, uint32_t numSig, const TUEntropyCodingParameters &codingParameters)
{
const uint32_t log2TrSizeCG = codingParameters.log2TrSizeCG;
-
- int lastCG = 1;
+ bool lastCG = true;
for (int subSet = (1 << log2TrSizeCG * 2) - 1; subSet >= 0; subSet--)
{
@@ -253,7 +252,7 @@ uint32_t Quant::signBitHidingHDQ(coeff_t
{
int minCostInc = MAX_INT, minPos = -1, finalChange = 0, curCost = MAX_INT, curChange = 0;
- for (n = (lastCG == 1 ? lastNZPosInCG : SCAN_SET_SIZE - 1); n >= 0; --n)
+ for (n = (lastCG ? lastNZPosInCG : SCAN_SET_SIZE - 1); n >= 0; --n)
{
uint32_t blkPos = codingParameters.scan[n + subPos];
if (qCoef[blkPos])
@@ -317,7 +316,7 @@ uint32_t Quant::signBitHidingHDQ(coeff_t
}
}
- lastCG = 0;
+ lastCG = false;
}
return numSig;
@@ -365,17 +364,8 @@ uint32_t Quant::transformNxN(TComDataCU*
int trSize = 1 << log2TrSize;
if (cu->getCUTransquantBypass(absPartIdx))
{
- uint32_t numSig = 0;
- for (int k = 0; k < trSize; k++)
- {
- for (int j = 0; j < trSize; j++)
- {
- coeff[k * trSize + j] = ((int16_t)residual[k * stride + j]);
- numSig += (residual[k * stride + j] != 0);
- }
- }
-
- return numSig;
+ X265_CHECK(log2TrSize >= 2 && log2TrSize <= 5, "Block size mistake!\n");
+ return primitives.cvt16to32_cnt[log2TrSize - 2](coeff, residual, stride);
}
X265_CHECK((cu->m_slice->m_sps->quadtreeTULog2MaxSize >= log2TrSize), "transform size too large\n");
@@ -502,7 +492,6 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
uint32_t trSize = 1 << log2TrSize;
int transformShift = MAX_TR_DYNAMIC_RANGE - X265_DEPTH - log2TrSize; // Represents scaling through forward transform
int scalingListType = (cu->isIntra(absPartIdx) ? 0 : 3) + ttype;
- m_transformShift = transformShift;
X265_CHECK(scalingListType < 6, "scaling list type out of range\n");
@@ -521,25 +510,31 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
return 0;
x265_emms();
- selectLambda(ttype);
+ /* unquant constants for psy-rdoq */
+ int32_t *unquantScale = m_scalingList->m_dequantCoef[log2TrSize - 2][scalingListType][rem];
+ int unquantShift = QUANT_IQUANT_SHIFT - QUANT_SHIFT - transformShift;
+ int unquantRound = 1 << (unquantShift - 1);
+ int scaleBits = SCALE_BITS - 2 * transformShift;
+
+ double lambda2 = m_lambdas[ttype];
double *errScale = m_scalingList->m_errScale[log2TrSize - 2][scalingListType][rem];
bool bIsLuma = ttype == TEXT_LUMA;
bool usePsy = m_psyRdoqScale && bIsLuma;
- double blockUncodedCost = 0;
- double costCoeff[32 * 32];
- double costSig[32 * 32];
- double costCoeff0[32 * 32];
+ double totalUncodedCost = 0;
+ double costCoeff[32 * 32]; /* d*d + lambda * bits */
+ double costUncoded[32 * 32]; /* d*d + lambda * 0 */
+ double costSig[32 * 32]; /* lambda * bits */
- int rateIncUp[32 * 32];
- int rateIncDown[32 * 32];
- int sigRateDelta[32 * 32];
+ int rateIncUp[32 * 32]; /* signal overhead of increasing level */
+ int rateIncDown[32 * 32]; /* signal overhead of decreasing level */
+ int sigRateDelta[32 * 32]; /* signal difference between zero and non-zero */
int deltaU[32 * 32];
- const uint32_t cgSize = (1 << MLS_CG_SIZE); // 4x4 coef = 16
- double costCoeffGroupSig[MLS_GRP_NUM]; // 32x32 has 64 4x4 coding groups
+ double costCoeffGroupSig[MLS_GRP_NUM]; /* lambda * bits of group coding cost */
uint64_t sigCoeffGroupFlag64 = 0;
+
uint32_t ctxSet = 0;
int c1 = 1;
int c2 = 0;
@@ -549,6 +544,7 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
uint32_t c1Idx = 0;
uint32_t c2Idx = 0;
int cgLastScanPos = -1;
+ const uint32_t cgSize = (1 << MLS_CG_SIZE); /* 4x4 num coef = 16 */
TUEntropyCodingParameters codingParameters;
cu->getTUEntropyCodingParameters(codingParameters, absPartIdx, log2TrSize, bIsLuma);
@@ -557,6 +553,7 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
uint32_t scanPos;
coeffGroupRDStats rdStats;
+ /* iterate over coding groups in reverse scan order */
for (int cgScanPos = cgNum - 1; cgScanPos >= 0; cgScanPos--)
{
const uint32_t cgBlkPos = codingParameters.scanCG[cgScanPos];
@@ -567,24 +564,24 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
const int patternSigCtx = calcPatternSigCtx(sigCoeffGroupFlag64, cgPosX, cgPosY, codingParameters.log2TrSizeCG);
+ /* iterate over coefficients in each group in reverse scan order */
for (int scanPosinCG = cgSize - 1; scanPosinCG >= 0; scanPosinCG--)
{
scanPos = (cgScanPos << MLS_CG_SIZE) + scanPosinCG;
uint32_t blkPos = codingParameters.scan[scanPos];
- double scaleFactor = errScale[blkPos];
- int levelDouble = scaledCoeff[blkPos]; /* abs(coef) * quantCoef */
+ double scaleFactor = errScale[blkPos]; /* (1 << scaleBits) / (quantCoef * quantCoef) */
+ int levelScaled = scaledCoeff[blkPos]; /* abs(coef) * quantCoef */
uint32_t maxAbsLevel = abs(dstCoeff[blkPos]); /* abs(coef) */
- /* initial cost of each coefficient. This works out to be:
- * abs(coef) * quantCoef * abs(coef) * quantCoef * (scalingBits / (quantCoef * quantCoef))
- * which reduces to abs(coef) * abs(coef) * scalingBits, which should be reduced
- * even further to abs(coef) * abs(coef) << scalingBits in the future */
- costCoeff0[scanPos] = ((uint64_t)levelDouble * levelDouble) * scaleFactor;
+ /* RDOQ measures distortion as the scaled level squared times a
+ * scale factor which tries to remove the quantCoef back out, but
+ * adds scaleBits to account for IEP_RATE which is 32k (1 << SCALE_BITS) */
- /* running total of initial coeff L2 cost without accounting for lambda */
- blockUncodedCost += costCoeff0[scanPos];
+ /* cost of not coding this coefficient (no signal bits) */
+ costUncoded[scanPos] = ((uint64_t)levelScaled * levelScaled) * scaleFactor;
+ totalUncodedCost += costUncoded[scanPos];
- if (maxAbsLevel > 0 && lastScanPos < 0)
+ if (maxAbsLevel && lastScanPos < 0)
{
/* remember the first non-zero coef found in this reverse scan as the last pos */
lastScanPos = scanPos;
@@ -592,7 +589,15 @@ uint32_t Quant::rdoQuant(TComDataCU* cu,
cgLastScanPos = cgScanPos;
}
- if (lastScanPos >= 0)
+ if (lastScanPos < 0)
+ {
+ /* No non-zero coefficient yet found, but this does not mean
+ * there is no uncoded-cost for this coefficient. Pre-
+ * quantization the coefficient may have been non-zero */
+ costCoeff[scanPos] = 0;
+ baseCost += costUncoded[scanPos];
+ }
+ else
{
const uint32_t c1c2Idx = ((c1Idx - 8) >> (sizeof(int) * CHAR_BIT - 1)) + (((-(int)c2Idx) >> (sizeof(int) * CHAR_BIT - 1)) + 1) * 2;
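
The transquant-bypass change in this diff replaces an inline loop with a lookup into a table of per-size function pointers. The following is a self-contained sketch of that pattern, assuming coeff_t is a 32-bit integer (as the cvt16to32 name suggests). The standalone cvt16to32_cnt array and the copyAndCount wrapper are hypothetical stand-ins for the EncoderPrimitives member and the call site in Quant::transformNxN().

```cpp
#include <cstdint>

typedef int32_t coeff_t; // assumption: 32-bit coefficients

typedef uint32_t (*cvt16to32_cnt_t)(coeff_t* coeff, int16_t* residual, intptr_t stride);

// C reference: widen a strided 16-bit residual block into a packed
// coefficient block and count the non-zero entries on the way.
template<int trSize>
uint32_t conv16to32_count(coeff_t* coeff, int16_t* residual, intptr_t stride)
{
    uint32_t numSig = 0;
    for (int k = 0; k < trSize; k++)
    {
        for (int j = 0; j < trSize; j++)
        {
            coeff[k * trSize + j] = residual[k * stride + j];
            numSig += (residual[k * stride + j] != 0);
        }
    }
    return numSig;
}

// Hypothetical stand-in for the primitives table: index = log2TrSize - 2,
// so entries 0..3 cover 4x4, 8x8, 16x16 and 32x32.
static const cvt16to32_cnt_t cvt16to32_cnt[4] =
{
    conv16to32_count<4>, conv16to32_count<8>,
    conv16to32_count<16>, conv16to32_count<32>
};

// Hypothetical wrapper mirroring the call site; the caller must keep
// log2TrSize within [2, 5], which the diff guards with X265_CHECK.
uint32_t copyAndCount(coeff_t* coeff, int16_t* residual, intptr_t stride, int log2TrSize)
{
    return cvt16to32_cnt[log2TrSize - 2](coeff, residual, stride);
}
```

Indexing by block size lets an assembly version replace a single entry, as the 4x4 and 8x8 commits in this series do, while the remaining sizes keep using the C reference.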