[x265-commits] [x265] rc: fix bugs in using boundary condition for cu while enc...
Aarthi at videolan.org
Sat Sep 20 17:04:48 CEST 2014
details: http://hg.videolan.org/x265/rev/4e17a5c3ed64
branches:
changeset: 8080:4e17a5c3ed64
user: Aarthi Thirumalai
date: Wed Sep 17 17:19:52 2014 +0530
description:
rc: fix bugs in using boundary condition for cu while encoding each frame.
fixes the binary mismatch in 2 pass completely.
Subject: [x265] copy_cnt_4 avx2 asm code: nit, same speedup by sse version
details: http://hg.videolan.org/x265/rev/123db3a255a7
branches:
changeset: 8081:123db3a255a7
user: Praveen Tiwari
date: Thu Sep 11 17:33:44 2014 +0530
description:
copy_cnt_4 avx2 asm code: nit, same speedup by sse version
Subject: [x265] denoiseDct unit test code: fixed bound value problem
details: http://hg.videolan.org/x265/rev/b162185198fe
branches:
changeset: 8082:b162185198fe
user: Praveen Tiwari
date: Wed Sep 17 16:33:52 2014 +0530
description:
denoiseDct unit test code: fixed bound value problem
Subject: [x265] denoiseDct asm code: nit faulty code, need a new SSE version
details: http://hg.videolan.org/x265/rev/55a50a362def
branches:
changeset: 8083:55a50a362def
user: Praveen Tiwari
date: Wed Sep 17 16:45:04 2014 +0530
description:
denoiseDct asm code: nit faulty code, need a new SSE version
Subject: [x265] denoiseDct: nit unused asm function declarations
details: http://hg.videolan.org/x265/rev/54ad38a84a69
branches:
changeset: 8084:54ad38a84a69
user: Praveen Tiwari
date: Wed Sep 17 16:52:15 2014 +0530
description:
denoiseDct: nit unused asm function declarations
Subject: [x265] rc: improvements for cbr
details: http://hg.videolan.org/x265/rev/25dde1ffab66
branches:
changeset: 8085:25dde1ffab66
user: Aarthi Thirumalai
date: Thu Sep 18 18:02:36 2014 +0530
description:
rc: improvements for cbr
Subject: [x265] asm: avx2 assembly code for idct16x16
details: http://hg.videolan.org/x265/rev/7e82d0abf6fb
branches:
changeset: 8086:7e82d0abf6fb
user: Murugan Vairavel <murugan at multicorewareinc.com>
date: Thu Sep 18 14:27:44 2014 +0530
description:
asm: avx2 assembly code for idct16x16
Subject: [x265] denoise_dct asm code: SSE version
details: http://hg.videolan.org/x265/rev/d6759701fdd7
branches:
changeset: 8087:d6759701fdd7
user: Praveen Tiwari
date: Thu Sep 18 15:11:26 2014 +0530
description:
denoise_dct asm code: SSE version
Subject: [x265] denoise_dct: avx2 asm code
details: http://hg.videolan.org/x265/rev/e83cc4a15dc9
branches:
changeset: 8088:e83cc4a15dc9
user: Praveen Tiwari
date: Thu Sep 18 15:30:18 2014 +0530
description:
denoise_dct: avx2 asm code
Subject: [x265] copy_cnt_16: avx2 asm code, improved 514.32 cycles -> 313.66 cycles
details: http://hg.videolan.org/x265/rev/9b672a7b3ea9
branches:
changeset: 8089:9b672a7b3ea9
user: Praveen Tiwari
date: Thu Sep 18 16:37:55 2014 +0530
description:
copy_cnt_16: avx2 asm code, improved 514.32 cycles -> 313.66 cycles
Subject: [x265] copy_cnt_32: avx2 asm code, improved 1521.17 cycles -> 934.46 cycles
details: http://hg.videolan.org/x265/rev/6908388bf26f
branches:
changeset: 8090:6908388bf26f
user: Praveen Tiwari
date: Thu Sep 18 17:02:07 2014 +0530
description:
copy_cnt_32: avx2 asm code, improved 1521.17 cycles -> 934.46 cycles
Subject: [x265] denoiseDct: align performance data while reporting speedup
details: http://hg.videolan.org/x265/rev/4680ab4f92b8
branches:
changeset: 8091:4680ab4f92b8
user: Praveen Tiwari
date: Thu Sep 18 18:16:25 2014 +0530
description:
denoiseDct: align performance data while reporting speedup
Subject: [x265] testbench: allocate test harnesses on heap, for better valgrind coverage
details: http://hg.videolan.org/x265/rev/fa2f1aa1456e
branches:
changeset: 8092:fa2f1aa1456e
user: Steve Borho <steve at borho.org>
date: Wed Sep 17 20:10:25 2014 +0200
description:
testbench: allocate test harnesses on heap, for better valgrind coverage
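A minimal sketch of the idea, assuming the motivation is that Valgrind's Memcheck tracks the bounds of heap blocks but does not bounds-check stack or static arrays. `HarnessSketch` and `runHarnessOnHeap` are hypothetical names for illustration, not the x265 test harness classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a test harness that owns fixed-size buffers.
struct HarnessSketch
{
    enum { BUFFER_SIZE = 64 };
    unsigned char buffer[BUFFER_SIZE];

    // Fill the buffer and return the last value written.
    int run()
    {
        for (int i = 0; i < BUFFER_SIZE; i++)
            buffer[i] = static_cast<unsigned char>(i);
        return buffer[BUFFER_SIZE - 1];
    }
};

// Allocate the harness with new so its member buffers live inside a heap
// block; an instance declared on the stack would not be bounds-checked.
inline int runHarnessOnHeap()
{
    HarnessSketch* harness = new HarnessSketch;
    int result = harness->run();
    delete harness;
    return result;
}
```

Calling `runHarnessOnHeap()` under Valgrind would let an out-of-range write to `buffer` show up as an invalid write, provided it lands outside the allocated block.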
Subject: [x265] inline simple functions
details: http://hg.videolan.org/x265/rev/c07038ca0e07
branches:
changeset: 8093:c07038ca0e07
user: Satoshi Nakagawa <nakagawa424 at oki.com>
date: Fri Sep 19 09:47:06 2014 +0900
description:
inline simple functions
Subject: [x265] primitives: intra_pred[4][35] => intra_pred[35][4] (avoid *35)
details: http://hg.videolan.org/x265/rev/da61cf406f16
branches:
changeset: 8094:da61cf406f16
user: Satoshi Nakagawa <nakagawa424 at oki.com>
date: Fri Sep 19 16:35:15 2014 +0900
description:
primitives: intra_pred[4][35] => intra_pred[35][4] (avoid *35)
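A minimal sketch of the address arithmetic behind "avoid *35", assuming the table is a plain two-dimensional array. With the mode as the outer index the row stride is 4 entries, so the element offset can be formed with a shift instead of a multiply by 35. The names `NUM_MODES`, `NUM_SIZES`, `offsetSizeMajor` and `offsetModeMajor` are hypothetical and model only the indexing, not the x265 primitives table.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants mirroring the two dimensions in the commit subject.
const int NUM_MODES = 35; // intra prediction modes
const int NUM_SIZES = 4;  // block sizes 4x4, 8x8, 16x16, 32x32

// Old layout, table[size][mode]: the row stride is 35 entries.
inline std::size_t offsetSizeMajor(int size, int mode)
{
    return static_cast<std::size_t>(size) * NUM_MODES + mode;
}

// New layout, table[mode][size]: the row stride is 4 entries,
// so the multiply reduces to a left shift by 2.
inline std::size_t offsetModeMajor(int mode, int size)
{
    return (static_cast<std::size_t>(mode) << 2) + size;
}
```

Both layouts address the same 140 entries; only the stride used when the size is known and the mode varies at run time changes.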
Subject: [x265] search: cleanup and remove redundant variable in checkintra
details: http://hg.videolan.org/x265/rev/7c1e793722f9
branches:
changeset: 8095:7c1e793722f9
user: Gopu Govindaswamy <gopu at multicorewareinc.com>
date: Thu Sep 18 10:50:04 2014 +0530
description:
search: cleanup and remove redundant variable in checkintra
Subject: [x265] search: remove redundant local variables in encodeResAndCalcRdSkipCU
details: http://hg.videolan.org/x265/rev/5c067b643591
branches:
changeset: 8096:5c067b643591
user: Gopu Govindaswamy <gopu at multicorewareinc.com>
date: Thu Sep 18 15:01:00 2014 +0530
description:
search: remove redundant local variables in encodeResAndCalcRdSkipCU
Subject: [x265] search: simplify and remove redundant variables in getBestIntraModeChroma
details: http://hg.videolan.org/x265/rev/9b9986cc084b
branches:
changeset: 8097:9b9986cc084b
user: Gopu Govindaswamy <gopu at multicorewareinc.com>
date: Thu Sep 18 17:27:15 2014 +0530
description:
search: simplify and remove redundant variables in getBestIntraModeChroma
Subject: [x265] param: do not allow VBV without WPP
details: http://hg.videolan.org/x265/rev/c8f53398f8ce
branches:
changeset: 8098:c8f53398f8ce
user: Steve Borho <steve at borho.org>
date: Sat Sep 20 15:41:08 2014 +0100
description:
param: do not allow VBV without WPP
VBV row restarts cannot function correctly without WPP (per-row CABAC starts)
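A minimal sketch of the condition this change rejects, assuming a non-zero VBV buffer size means VBV is enabled. `ParamSketch`, `RateControlSketch` and `vbvWithoutWpp` are hypothetical names; only the field names `bEnableWavefront` and `rc.vbvBufferSize` come from the patch to param.cpp.

```cpp
#include <cassert>

// Hypothetical, reduced stand-ins for the encoder parameter structures.
struct RateControlSketch
{
    int vbvBufferSize; // 0 means VBV is not in use (assumption)
};

struct ParamSketch
{
    int bEnableWavefront; // non-zero when WPP is enabled
    RateControlSketch rc;
};

// Returns true for the invalid combination: VBV requested while WPP is off.
inline bool vbvWithoutWpp(const ParamSketch& p)
{
    return !p.bEnableWavefront && p.rc.vbvBufferSize;
}
```

A parameter check built on this predicate would fail only when a VBV buffer is configured and wave-front parallelism is disabled; every other combination passes.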
diffstat:
source/Lib/TLibCommon/TComDataCU.cpp | 36 --
source/Lib/TLibCommon/TComDataCU.h | 16 +-
source/Lib/TLibCommon/TComPattern.cpp | 4 +-
source/Lib/TLibCommon/TComPicYuv.cpp | 4 -
source/Lib/TLibCommon/TComPicYuv.h | 2 +-
source/Lib/TLibCommon/TComRom.cpp | 2 +-
source/Lib/TLibCommon/TComRom.h | 2 +-
source/common/deblock.cpp | 8 +-
source/common/frame.cpp | 3 -
source/common/frame.h | 2 +-
source/common/intrapred.cpp | 24 +-
source/common/param.cpp | 1 +
source/common/primitives.h | 7 +-
source/common/x86/asm-primitives.cpp | 73 ++--
source/common/x86/blockcopy8.asm | 180 +++++--------
source/common/x86/const-a.asm | 3 +-
source/common/x86/dct8.asm | 451 +++++++++++++++++++++++++--------
source/common/x86/dct8.h | 6 +-
source/common/x86/intrapred.h | 8 +-
source/encoder/analysis.cpp | 151 +++++-----
source/encoder/analysis.h | 2 -
source/encoder/encoder.cpp | 4 -
source/encoder/encoder.h | 2 +-
source/encoder/entropy.cpp | 10 -
source/encoder/entropy.h | 6 +-
source/encoder/motion.h | 2 +-
source/encoder/predict.cpp | 13 +-
source/encoder/predict.h | 2 +-
source/encoder/ratecontrol.cpp | 6 +-
source/encoder/search.cpp | 78 ++---
source/encoder/slicetype.cpp | 4 +-
source/test/intrapredharness.cpp | 32 +-
source/test/intrapredharness.h | 2 +-
source/test/mbdstharness.cpp | 3 +-
source/test/testbench.cpp | 22 +-
35 files changed, 645 insertions(+), 526 deletions(-)
diffs (truncated from 2347 to 300 lines):
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComDataCU.cpp
--- a/source/Lib/TLibCommon/TComDataCU.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComDataCU.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -88,9 +88,6 @@ TComDataCU::TComDataCU()
m_DataCUMemPool.m_tqBypassYuvMemBlock = NULL;
}
-TComDataCU::~TComDataCU()
-{}
-
bool TComDataCU::initialize(uint32_t numPartition, uint32_t sizeL, uint32_t sizeC, uint32_t numBlocks, bool isLossless)
{
@@ -1086,15 +1083,6 @@ char TComDataCU::getLastCodedQP(uint32_t
}
}
-/** Check whether the CU is coded in lossless coding mode
- * \param absPartIdx
- * \returns true if the CU is coded in lossless coding mode; false if otherwise
- */
-bool TComDataCU::isLosslessCoded(uint32_t absPartIdx)
-{
- return m_slice->m_pps->bTransquantBypassEnabled && getCUTransquantBypass(absPartIdx);
-}
-
/** Get allowed chroma intra modes
*\param absPartIdx
*\param uiModeList pointer to chroma intra modes array
@@ -1224,11 +1212,6 @@ uint32_t TComDataCU::getCtxSkipFlag(uint
return ctx;
}
-uint32_t TComDataCU::getCtxInterDir(uint32_t absPartIdx)
-{
- return getDepth(absPartIdx);
-}
-
void TComDataCU::clearCbf(uint32_t absPartIdx, uint32_t depth)
{
uint32_t curPartNum = m_pic->getNumPartInCU() >> (depth << 1);
@@ -2111,11 +2094,6 @@ int TComDataCU::fillMvpCand(uint32_t par
return numMvc;
}
-bool TComDataCU::isBipredRestriction()
-{
- return getLog2CUSize(0) == 3 && getPartitionSize(0) != SIZE_2Nx2N;
-}
-
void TComDataCU::clipMv(MV& outMV)
{
int mvshift = 2;
@@ -2130,15 +2108,6 @@ void TComDataCU::clipMv(MV& outMV)
outMV.y = X265_MIN(ymax, X265_MAX(ymin, (int)outMV.y));
}
-/** Test whether the current block is skipped
- * \param partIdx Block index
- * \returns Flag indicating whether the block is skipped
- */
-bool TComDataCU::isSkipped(uint32_t partIdx)
-{
- return getSkipFlag(partIdx);
-}
-
// ====================================================================================================================
// Protected member functions
// ====================================================================================================================
@@ -2438,9 +2407,4 @@ void TComDataCU::getTUEntropyCodingParam
result.firstSignificanceMapContext = bIsLuma ? 21 : 12;
}
-uint32_t TComDataCU::getSCUAddr()
-{
- return (m_cuAddr << g_maxFullDepth * 2) + m_absIdxInLCU;
-}
-
//! \}
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComDataCU.h
--- a/source/Lib/TLibCommon/TComDataCU.h Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComDataCU.h Sat Sep 20 15:41:08 2014 +0100
@@ -248,7 +248,7 @@ protected:
public:
TComDataCU();
- virtual ~TComDataCU();
+ ~TComDataCU() {}
uint32_t m_psyEnergy;
uint64_t m_totalPsyCost;
@@ -290,7 +290,8 @@ public:
uint32_t& getZorderIdxInCU() { return m_absIdxInLCU; }
- uint32_t getSCUAddr();
+ uint32_t getSCUAddr() const { return (m_cuAddr << g_maxFullDepth * 2) + m_absIdxInLCU; }
+
uint32_t getCUPelX() { return m_cuPelX; }
@@ -344,7 +345,7 @@ public:
char getLastCodedQP(uint32_t absPartIdx);
void setQPSubCUs(int qp, TComDataCU* cu, uint32_t absPartIdx, uint32_t depth, bool &foundNonZeroCbf);
- bool isLosslessCoded(uint32_t absPartIdx);
+ bool isLosslessCoded(uint32_t idx) const { return m_cuTransquantBypass[idx] && m_slice->m_pps->bTransquantBypassEnabled; }
uint8_t* getTransformIdx() { return m_trIdx; }
@@ -488,10 +489,9 @@ public:
// member functions for modes
// -------------------------------------------------------------------------------------------------------------------
- bool isIntra(uint32_t partIdx) { return m_predModes[partIdx] == MODE_INTRA; }
-
- bool isSkipped(uint32_t partIdx); ///< SKIP (no residual)
- bool isBipredRestriction();
+ bool isIntra(uint32_t partIdx) const { return m_predModes[partIdx] == MODE_INTRA; }
+ bool isSkipped(uint32_t idx) const { return m_skipFlag[idx]; }
+ bool isBipredRestriction() const { return m_log2CUSize[0] == 3 && m_partSizes[0] != SIZE_2Nx2N; }
// -------------------------------------------------------------------------------------------------------------------
// member functions for symbol prediction (most probable / mode conversion)
@@ -506,7 +506,7 @@ public:
uint32_t getCtxSplitFlag(uint32_t absPartIdx, uint32_t depth);
uint32_t getCtxSkipFlag(uint32_t absPartIdx);
- uint32_t getCtxInterDir(uint32_t absPartIdx);
+ uint32_t getCtxInterDir(uint32_t idx) const { return m_depth[idx]; }
// -------------------------------------------------------------------------------------------------------------------
// member functions for RD cost storage
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComPattern.cpp
--- a/source/Lib/TLibCommon/TComPattern.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComPattern.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -68,9 +68,9 @@ void TComPattern::initAdiPattern(TComDat
fillReferenceSamples(roiOrigin, picStride, adiTemp, intraNeighbors);
- bool bUseFilteredPredictions = (dirMode == ALL_IDX || (g_intraFilterFlags[dirMode] & tuSize));
+ bool bUseFilteredPredictions = (dirMode == ALL_IDX ? (8 | 16 | 32) & tuSize : g_intraFilterFlags[dirMode] & tuSize);
- if (bUseFilteredPredictions && 8 <= tuSize && tuSize <= 32)
+ if (bUseFilteredPredictions)
{
// generate filtered intra prediction samples
// left and left above border + above and above right border + top left corner = length of 3. filter buffer
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComPicYuv.cpp
--- a/source/Lib/TLibCommon/TComPicYuv.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComPicYuv.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -60,10 +60,6 @@ TComPicYuv::TComPicYuv()
m_buOffsetC = NULL;
}
-TComPicYuv::~TComPicYuv()
-{
-}
-
bool TComPicYuv::create(int picWidth, int picHeight, int picCsp, uint32_t maxCUSize, uint32_t maxFullDepth)
{
m_picWidth = picWidth;
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComPicYuv.h
--- a/source/Lib/TLibCommon/TComPicYuv.h Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComPicYuv.h Sat Sep 20 15:41:08 2014 +0100
@@ -94,7 +94,7 @@ public:
int m_numCuInHeight;
TComPicYuv();
- virtual ~TComPicYuv();
+ ~TComPicYuv() {}
// ------------------------------------------------------------------------------------------------
// Memory management
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComRom.cpp
--- a/source/Lib/TLibCommon/TComRom.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComRom.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -497,7 +497,7 @@ const uint8_t x265_exp2_lut[64] =
};
/* g_intraFilterFlags[dir] & trSize */
-const uint8_t g_intraFilterFlags[35] =
+const uint8_t g_intraFilterFlags[NUM_INTRA_MODE] =
{
0x38, 0x00,
0x38, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x20, 0x00, 0x20, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
diff -r 86686bd153db -r c8f53398f8ce source/Lib/TLibCommon/TComRom.h
--- a/source/Lib/TLibCommon/TComRom.h Wed Sep 17 12:52:38 2014 +0200
+++ b/source/Lib/TLibCommon/TComRom.h Sat Sep 20 15:41:08 2014 +0100
@@ -153,7 +153,7 @@ extern const uint8_t g_lpsTable[64][4];
extern const uint8_t x265_exp2_lut[64];
// Intra tables
-extern const uint8_t g_intraFilterFlags[35];
+extern const uint8_t g_intraFilterFlags[NUM_INTRA_MODE];
extern const uint32_t g_depthInc[3][4];
diff -r 86686bd153db -r c8f53398f8ce source/common/deblock.cpp
--- a/source/common/deblock.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/deblock.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -525,8 +525,8 @@ void Deblock::edgeFilterLuma(TComDataCU*
if (cu->m_slice->m_pps->bTransquantBypassEnabled)
{
// check if each of PUs is lossless coded
- partPNoFilter = cuP->isLosslessCoded(partP);
- partQNoFilter = cuQ->isLosslessCoded(partQ);
+ partPNoFilter = cuP->getCUTransquantBypass(partP);
+ partQNoFilter = cuQ->getCUTransquantBypass(partQ);
}
if (d < beta)
@@ -623,8 +623,8 @@ void Deblock::edgeFilterChroma(TComDataC
if (cu->m_slice->m_pps->bTransquantBypassEnabled)
{
// check if each of PUs is lossless coded
- partPNoFilter = cuP->isLosslessCoded(partP);
- partQNoFilter = cuQ->isLosslessCoded(partQ);
+ partPNoFilter = cuP->getCUTransquantBypass(partP);
+ partQNoFilter = cuQ->getCUTransquantBypass(partQ);
}
for (uint32_t chromaIdx = 0; chromaIdx < 2; chromaIdx++)
diff -r 86686bd153db -r c8f53398f8ce source/common/frame.cpp
--- a/source/common/frame.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/frame.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -55,9 +55,6 @@ Frame::Frame()
m_interData = NULL;
}
-Frame::~Frame()
-{}
-
bool Frame::create(x265_param *param, Window& display, Window& conformance)
{
m_conformanceWindow = conformance;
diff -r 86686bd153db -r c8f53398f8ce source/common/frame.h
--- a/source/common/frame.h Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/frame.h Sat Sep 20 15:41:08 2014 +0100
@@ -87,7 +87,7 @@ public:
x265_inter_data* m_interData; // inter analysis information
Frame();
- virtual ~Frame();
+ ~Frame() {}
bool create(x265_param *param, Window& display, Window& conformance);
bool allocPicSym(x265_param *param);
diff -r 86686bd153db -r c8f53398f8ce source/common/intrapred.cpp
--- a/source/common/intrapred.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/intrapred.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -281,22 +281,22 @@ namespace x265 {
void Setup_C_IPredPrimitives(EncoderPrimitives& p)
{
- p.intra_pred[BLOCK_4x4][0] = planar_pred_c<2>;
- p.intra_pred[BLOCK_8x8][0] = planar_pred_c<3>;
- p.intra_pred[BLOCK_16x16][0] = planar_pred_c<4>;
- p.intra_pred[BLOCK_32x32][0] = planar_pred_c<5>;
+ p.intra_pred[0][BLOCK_4x4] = planar_pred_c<2>;
+ p.intra_pred[0][BLOCK_8x8] = planar_pred_c<3>;
+ p.intra_pred[0][BLOCK_16x16] = planar_pred_c<4>;
+ p.intra_pred[0][BLOCK_32x32] = planar_pred_c<5>;
// Intra Prediction DC
- p.intra_pred[BLOCK_4x4][1] = intra_pred_dc_c<4>;
- p.intra_pred[BLOCK_8x8][1] = intra_pred_dc_c<8>;
- p.intra_pred[BLOCK_16x16][1] = intra_pred_dc_c<16>;
- p.intra_pred[BLOCK_32x32][1] = intra_pred_dc_c<32>;
+ p.intra_pred[1][BLOCK_4x4] = intra_pred_dc_c<4>;
+ p.intra_pred[1][BLOCK_8x8] = intra_pred_dc_c<8>;
+ p.intra_pred[1][BLOCK_16x16] = intra_pred_dc_c<16>;
+ p.intra_pred[1][BLOCK_32x32] = intra_pred_dc_c<32>;
for (int i = 2; i < NUM_INTRA_MODE; i++)
{
- p.intra_pred[BLOCK_4x4][i] = intra_pred_ang_c<4>;
- p.intra_pred[BLOCK_8x8][i] = intra_pred_ang_c<8>;
- p.intra_pred[BLOCK_16x16][i] = intra_pred_ang_c<16>;
- p.intra_pred[BLOCK_32x32][i] = intra_pred_ang_c<32>;
+ p.intra_pred[i][BLOCK_4x4] = intra_pred_ang_c<4>;
+ p.intra_pred[i][BLOCK_8x8] = intra_pred_ang_c<8>;
+ p.intra_pred[i][BLOCK_16x16] = intra_pred_ang_c<16>;
+ p.intra_pred[i][BLOCK_32x32] = intra_pred_ang_c<32>;
}
p.intra_pred_allangs[BLOCK_4x4] = all_angs_pred_c<2>;
diff -r 86686bd153db -r c8f53398f8ce source/common/param.cpp
--- a/source/common/param.cpp Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/param.cpp Sat Sep 20 15:41:08 2014 +0100
@@ -963,6 +963,7 @@ int x265_check_params(x265_param *param)
CHECK(param->psyRd < 0 || 2.0 < param->psyRd, "Psy-rd strength must be between 0 and 2.0");
CHECK(param->psyRdoq < 0 || 10.0 < param->psyRdoq, "Psy-rdoq strength must be between 0 and 10.0");
CHECK(param->bEnableWavefront < 0, "WaveFrontSynchro cannot be negative");
+ CHECK(!param->bEnableWavefront && param->rc.vbvBufferSize, "VBV requires wave-front parallelism (--wpp)");
CHECK((param->vui.aspectRatioIdc < 0
|| param->vui.aspectRatioIdc > 16)
&& param->vui.aspectRatioIdc != X265_EXTENDED_SAR,
diff -r 86686bd153db -r c8f53398f8ce source/common/primitives.h
--- a/source/common/primitives.h Wed Sep 17 12:52:38 2014 +0200
+++ b/source/common/primitives.h Sat Sep 20 15:41:08 2014 +0100