<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jan 29, 2020 at 12:20 PM <<a href="mailto:srikanth.kurapati@multicorewareinc.com">srikanth.kurapati@multicorewareinc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"># HG changeset patch<br>
# User Srikanth Kurapati<br>
# Date 1580280547 -19800<br>
# Wed Jan 29 12:19:07 2020 +0530<br>
# Node ID e9c8c0089bddc9e9e47774b5fda1f4dff1fb45e4<br>
# Parent fdbd4e4a2aff93bfc14b10efcd9e681a7ebae311<br>
Edge Aware Quad Tree Establishment.<br>
<br>
This patch does the following:<br>
1. Terminates recursion using edge information.<br>
2. Adds modes for "--rskip". Modes 0,1 for current usage and 2,3 for edge based<br>
rskips for RD levels 0 to 6.<br></blockquote><div>[KS] Since RD levels only span 0 to 6, do we need to mention the range here?</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
3. Adds option "edge-threshold" to decide recursion skip using CU edge density.<br>
4. Re uses edge information when already available in encoder.<br>
<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd doc/reST/cli.rst<br>
--- a/doc/reST/cli.rst Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/doc/reST/cli.rst Wed Jan 29 12:19:07 2020 +0530<br>
@@ -842,15 +842,31 @@<br>
Measure 2Nx2N merge candidates first; if no residual is found, <br>
additional modes at that depth are not analysed. Default disabled<br>
<br>
-.. option:: --rskip, --no-rskip<br>
-<br>
- This option determines early exit from CU depth recursion. When a skip CU is<br>
- found, additional heuristics (depending on rd-level) are used to decide whether<br>
- to terminate recursion. In rdlevels 5 and 6, comparison with inter2Nx2N is used, <br>
- while at rdlevels 4 and neighbour costs are used to skip recursion.<br>
- Provides minimal quality degradation at good performance gains when enabled. <br>
-<br>
- Default: enabled, disabled for :option:`--tune grain`<br>
+.. option:: --rskip <0|1|2|3><br>
+<br>
+ This option determines early exit from CU depth recursion in modes 1, 2 and 3. When a skip CU is<br>
+ found, additional heuristics (depending on RD level and rskip mode) are used to decide whether<br>
+ to terminate recursion. The following table summarizes the behavior.<br>
+ <br>
+ +----------+------------+----------------------------------------------------------------+<br>
+ | RD Level | Rskip Mode | Skip Recursion Heuristic |<br>
+ +==========+============+================================================================+<br>
+ | 0 - 4 | 1 | Neighbour costs. |<br>
+ +----------+------------+----------------------------------------------------------------+<br>
+ | 5 - 6 | 1 | Comparison with inter2Nx2N. |<br>
+ +----------+------------+----------------------------------------------------------------+<br>
+ | 0 - 6 | 2 | CU edge denstiy. |<br>
+ +----------+------------+----------------------------------------------------------------+<br>
+ | 0 - 6 | 3 | CU edge denstiy with forceful skip for lower levels of CTU. |<br>
+ +----------+------------+----------------------------------------------------------------+<br>
+ <br>
+ Provides minimal quality degradation at good performance gains for non-zero modes.<br>
+ :option:`--r-skip mode 0` means disabled. Default: 1, disabled when :option:`--tune grain` is used.<br></blockquote><div>[KS] The CLI option is --rskip, not --r-skip.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+<br>
+.. option:: --edge-threshold <0..100><br>
+<br>
+ Denotes the minimum expected edge-density percentage within the CU, below which the recursion is skipped.<br>
+ Default: 5, requires :option:`--rskip mode 2|3` to be enabled.<br>
<br>
.. option:: --splitrd-skip, --no-splitrd-skip<br>
<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/CMakeLists.txt<br>
--- a/source/CMakeLists.txt Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/CMakeLists.txt Wed Jan 29 12:19:07 2020 +0530<br>
@@ -29,7 +29,7 @@<br>
option(STATIC_LINK_CRT "Statically link C runtime for release builds" OFF)<br>
mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)<br>
# X265_BUILD must be incremented each time the public API is changed<br>
-set(X265_BUILD 188)<br>
+set(X265_BUILD 189)<br>
configure_file("${PROJECT_SOURCE_DIR}/<a href="http://x265.def.in" rel="noreferrer" target="_blank">x265.def.in</a>"<br>
"${PROJECT_BINARY_DIR}/x265.def")<br>
configure_file("${PROJECT_SOURCE_DIR}/<a href="http://x265_config.h.in" rel="noreferrer" target="_blank">x265_config.h.in</a>"<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/common.h<br>
--- a/source/common/common.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/common.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -129,6 +129,7 @@<br>
typedef uint64_t sum2_t;<br>
typedef uint64_t pixel4;<br>
typedef int64_t ssum2_t;<br>
+#define SHIFT_TO_BITPLANE 9<br>
#define HISTOGRAM_BINS 1024<br>
#define SHIFT 1<br>
#else<br>
@@ -137,6 +138,7 @@<br>
typedef uint32_t sum2_t;<br>
typedef uint32_t pixel4;<br>
typedef int32_t ssum2_t; // Signed sum<br>
+#define SHIFT_TO_BITPLANE 7<br>
#define HISTOGRAM_BINS 256<br>
#define SHIFT 0<br>
#endif // if HIGH_BIT_DEPTH<br>
@@ -272,6 +274,9 @@<br>
#define MAX_TR_SIZE (1 << MAX_LOG2_TR_SIZE)<br>
#define MAX_TS_SIZE (1 << MAX_LOG2_TS_SIZE)<br>
<br>
+#define RDCOST_BASED_RSKIP 1<br>
+#define EDGE_BASED_RSKIP 2<br>
+<br>
#define COEF_REMAIN_BIN_REDUCTION 3 // indicates the level at which the VLC<br>
// transitions from Golomb-Rice to TU+EG(k)<br>
<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/frame.cpp<br>
--- a/source/common/frame.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/frame.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -61,6 +61,8 @@<br>
m_edgePic = NULL;<br>
m_gaussianPic = NULL;<br>
m_thetaPic = NULL;<br>
+ m_edgeBitPlane = NULL;<br>
+ m_edgeBitPic = NULL;<br>
}<br>
<br>
bool Frame::create(x265_param *param, float* quantOffsets)<br>
@@ -115,6 +117,19 @@<br>
m_thetaPic = X265_MALLOC(pixel, m_stride * (maxHeight + (m_lumaMarginY * 2)));<br>
}<br>
<br>
+ if (param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
+ uint32_t numCuInWidth = (param->sourceWidth + param->maxCUSize - 1) / param->maxCUSize;<br>
+ uint32_t numCuInHeight = (param->sourceHeight + param->maxCUSize - 1) / param->maxCUSize;<br>
+ uint32_t lumaMarginX = param->maxCUSize + 32;<br>
+ uint32_t lumaMarginY = param->maxCUSize + 16;<br>
+ uint32_t stride = (numCuInWidth * param->maxCUSize) + (lumaMarginX << 1);<br>
+ uint32_t maxHeight = numCuInHeight * param->maxCUSize;<br>
+ m_bitPlaneSize = stride * (maxHeight + (lumaMarginY * 2));<br>
+ CHECKED_MALLOC_ZERO(m_edgeBitPlane, pixel, m_bitPlaneSize);<br>
+ m_edgeBitPic = m_edgeBitPlane + lumaMarginY * stride + lumaMarginX;<br>
+ }<br>
+<br></blockquote><div>[KS] Here the whole plane, margins included, is zero-initialised with CHECKED_MALLOC_ZERO instead of copying the last row/column values into the padding. IIRC, copying the last row/column values has some significance for interpolation/MC. Can you verify that zero padding is the right thing to do here?</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
if (m_fencPic->create(param, !!m_param->bCopyPicToFrame) && m_lowres.create(param, m_fencPic, param->rc.qgSize))<br>
{<br>
X265_CHECK((m_reconColCount == NULL), "m_reconColCount was initialized");<br>
@@ -267,4 +282,10 @@<br>
X265_FREE(m_gaussianPic);<br>
X265_FREE(m_thetaPic);<br>
}<br>
+<br>
+ if (m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
+ X265_FREE_ZERO(m_edgeBitPlane);<br>
+ m_edgeBitPic = NULL;<br>
+ }<br>
}<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/frame.h<br>
--- a/source/common/frame.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/frame.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -137,6 +137,9 @@<br>
pixel* m_gaussianPic;<br>
pixel* m_thetaPic;<br>
<br>
+ pixel* m_edgeBitPlane;<br>
+ pixel* m_edgeBitPic;<br></blockquote><div>[KS] Do we need two pointers? m_edgeBitPlane is used only for allocation and freeing; everywhere else only m_edgeBitPic is used.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+ uint32_t m_bitPlaneSize;<br></blockquote><div>[KS] m_bitPlaneSize could be a local variable. What is the reason for making it a member of Frame?</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Frame();<br>
<br>
bool create(x265_param *param, float* quantOffsets);<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/param.cpp<br>
--- a/source/common/param.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/param.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -199,6 +199,7 @@<br>
param->bEnableWeightedBiPred = 0;<br>
param->bEnableEarlySkip = 1;<br>
param->bEnableRecursionSkip = 1;<br>
+ param->edgeThreshold = 0.05f;<br>
param->bEnableAMP = 0;<br>
param->bEnableRectInter = 0;<br>
param->rdLevel = 3;<br>
@@ -702,8 +703,9 @@<br>
OPT("ref") p->maxNumReferences = atoi(value);<br>
OPT("fast-intra") p->bEnableFastIntra = atobool(value);<br>
OPT("early-skip") p->bEnableEarlySkip = atobool(value);<br>
- OPT("rskip") p->bEnableRecursionSkip = atobool(value);<br>
- OPT("me")p->searchMethod = parseName(value, x265_motion_est_names, bError);<br>
+ OPT("rskip") p->bEnableRecursionSkip = atoi(value);<br>
+ OPT("edge-threshold") p->edgeThreshold = atoi(value)/100.0f;<br>
+ OPT("me") p->searchMethod = parseName(value, x265_motion_est_names, bError);<br>
OPT("subme") p->subpelRefine = atoi(value);<br>
OPT("merange") p->searchRange = atoi(value);<br>
OPT("rect") p->bEnableRectInter = atobool(value);<br>
@@ -919,7 +921,7 @@<br>
OPT("max-merge") p->maxNumMergeCand = (uint32_t)atoi(value);<br>
OPT("temporal-mvp") p->bEnableTemporalMvp = atobool(value);<br>
OPT("early-skip") p->bEnableEarlySkip = atobool(value);<br>
- OPT("rskip") p->bEnableRecursionSkip = atobool(value);<br>
+ OPT("rskip") p->bEnableRecursionSkip = atoi(value);<br>
OPT("rdpenalty") p->rdPenalty = atoi(value);<br>
OPT("tskip") p->bEnableTransformSkip = atobool(value);<br>
OPT("no-tskip-fast") p->bEnableTSkipFast = atobool(value);<br>
@@ -1221,6 +1223,7 @@<br>
}<br>
}<br>
OPT("hist-threshold") p->edgeTransitionThreshold = atof(value);<br>
+ OPT("edge-threshold") p->edgeThreshold = atoi(value)/100.0f;<br>
OPT("lookahead-threads") p->lookaheadThreads = atoi(value);<br>
OPT("opt-cu-delta-qp") p->bOptCUDeltaQP = atobool(value);<br>
OPT("multi-pass-opt-analysis") p->analysisMultiPassRefine = atobool(value);<br>
@@ -1596,9 +1599,16 @@<br>
CHECK(param->rdLevel < 1 || param->rdLevel > 6,<br>
"RD Level is out of range");<br>
CHECK(param->rdoqLevel < 0 || param->rdoqLevel > 2,<br>
- "RDOQ Level is out of range");<br>
+ "RDOQ Level is out of range");<br>
CHECK(param->dynamicRd < 0 || param->dynamicRd > x265_ADAPT_RD_STRENGTH,<br>
- "Dynamic RD strength must be between 0 and 4");<br>
+ "Dynamic RD strength must be between 0 and 4");<br>
+ CHECK(param->bEnableRecursionSkip > 3 || param->bEnableRecursionSkip < 0,<br>
+ "Invalid Recursion skip mode. Valid modes 0,1,2,3");<br>
+ if (param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
+ CHECK(param->edgeThreshold < 0.0f || param->edgeThreshold > 1.0f,<br>
+ "Minimum edge density percentage for a CU should be an integer between 0 to 100");<br>
+ }<br>
CHECK(param->bframes && param->bframes >= param->lookaheadDepth && !param->rc.bStatRead,<br>
"Lookahead depth must be greater than the max consecutive bframe count");<br>
CHECK(param->bframes < 0,<br>
@@ -1908,7 +1918,9 @@<br>
TOOLVAL(param->psyRdoq, "psy-rdoq=%.2lf");<br>
TOOLOPT(param->bEnableRdRefine, "rd-refine");<br>
TOOLOPT(param->bEnableEarlySkip, "early-skip");<br>
- TOOLOPT(param->bEnableRecursionSkip, "rskip");<br>
+ TOOLVAL(param->bEnableRecursionSkip, "rskip mode=%d");<br>
+ if (param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ TOOLVAL(param->edgeThreshold, "rskip-threshold=%.2f");<br>
TOOLOPT(param->bEnableSplitRdSkip, "splitrd-skip");<br>
TOOLVAL(param->noiseReductionIntra, "nr-intra=%d");<br>
TOOLVAL(param->noiseReductionInter, "nr-inter=%d");<br>
@@ -2067,6 +2079,9 @@<br>
s += sprintf(s, " selective-sao=%d", p->selectiveSAO);<br>
BOOL(p->bEnableEarlySkip, "early-skip");<br>
BOOL(p->bEnableRecursionSkip, "rskip");<br>
+ if (p->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ s += sprintf(s, " edge-threshold=%f", p->edgeThreshold);<br>
+<br>
BOOL(p->bEnableFastIntra, "fast-intra");<br>
BOOL(p->bEnableTSkipFast, "tskip-fast");<br>
BOOL(p->bCULossless, "cu-lossless");<br>
@@ -2374,6 +2389,7 @@<br>
dst->rdLevel = src->rdLevel;<br>
dst->bEnableEarlySkip = src->bEnableEarlySkip;<br>
dst->bEnableRecursionSkip = src->bEnableRecursionSkip;<br>
+ dst->edgeThreshold = src->edgeThreshold;<br>
dst->bEnableFastIntra = src->bEnableFastIntra;<br>
dst->bEnableTSkipFast = src->bEnableTSkipFast;<br>
dst->bCULossless = src->bCULossless;<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/pixel.cpp<br>
--- a/source/common/pixel.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/pixel.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -876,6 +876,18 @@<br>
}<br>
}<br>
<br>
+static void planecopy_pp_shr_c(const pixel* src, intptr_t srcStride, pixel* dst, intptr_t dstStride, int width, int height, int shift)<br>
+{<br>
+ for (int r = 0; r < height; r++)<br>
+ {<br>
+ for (int c = 0; c < width; c++)<br>
+ dst[c] = (pixel)((src[c] >> shift));<br>
+<br>
+ dst += dstStride;<br>
+ src += srcStride;<br>
+ }<br>
+}<br>
+<br>
static void planecopy_sp_shl_c(const uint16_t* src, intptr_t srcStride, pixel* dst, intptr_t dstStride, int width, int height, int shift, uint16_t mask)<br>
{<br>
for (int r = 0; r < height; r++)<br>
@@ -1316,6 +1328,7 @@<br>
p.planecopy_cp = planecopy_cp_c;<br>
p.planecopy_sp = planecopy_sp_c;<br>
p.planecopy_sp_shl = planecopy_sp_shl_c;<br>
+ p.planecopy_pp_shr = planecopy_pp_shr_c;<br>
#if HIGH_BIT_DEPTH<br>
p.planeClipAndMax = planeClipAndMax_c;<br>
#endif<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/common/primitives.h<br>
--- a/source/common/primitives.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/common/primitives.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -204,6 +204,7 @@<br>
typedef void (*sign_t)(int8_t *dst, const pixel *src1, const pixel *src2, const int endX);<br>
typedef void (*planecopy_cp_t) (const uint8_t* src, intptr_t srcStride, pixel* dst, intptr_t dstStride, int width, int height, int shift);<br>
typedef void (*planecopy_sp_t) (const uint16_t* src, intptr_t srcStride, pixel* dst, intptr_t dstStride, int width, int height, int shift, uint16_t mask);<br>
+typedef void (*planecopy_pp_t) (const pixel* src, intptr_t srcStride, pixel* dst, intptr_t dstStride, int width, int height, int shift);<br>
typedef pixel (*planeClipAndMax_t)(pixel *src, intptr_t stride, int width, int height, uint64_t *outsum, const pixel minPix, const pixel maxPix);<br>
<br>
typedef void (*cutree_propagate_cost) (int* dst, const uint16_t* propagateIn, const int32_t* intraCosts, const uint16_t* interCosts, const int32_t* invQscales, const double* fpsFactor, int len);<br>
@@ -358,6 +359,7 @@<br>
planecopy_cp_t planecopy_cp;<br>
planecopy_sp_t planecopy_sp;<br>
planecopy_sp_t planecopy_sp_shl;<br>
+ planecopy_pp_t planecopy_pp_shr;<br>
planeClipAndMax_t planeClipAndMax;<br>
<br>
weightp_sp_t weight_sp;<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/analysis.cpp<br>
--- a/source/encoder/analysis.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/analysis.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -1317,12 +1317,21 @@<br>
if (md.bestMode && m_param->bEnableRecursionSkip && !bCtuInfoCheck && !(m_param->bAnalysisType == AVC_INFO && m_param->analysisLoadReuseLevel == 7 && (m_modeFlag[0] || m_modeFlag[1])))<br>
{<br>
skipRecursion = md.bestMode->cu.isSkipped(0);<br>
- if (mightSplit && depth >= minDepth && !skipRecursion)<br>
+ if (mightSplit && !skipRecursion)<br>
{<br>
- if (depth)<br>
- skipRecursion = recursionDepthCheck(parentCTU, cuGeom, *md.bestMode);<br>
- if (m_bHD && !skipRecursion && m_param->rdLevel == 2 && md.fencYuv.m_size != MAX_CU_SIZE)<br>
+ if (depth >= minDepth && m_param->bEnableRecursionSkip == RDCOST_BASED_RSKIP)<br>
+ {<br>
+ if (depth)<br>
+ skipRecursion = recursionDepthCheck(parentCTU, cuGeom, *md.bestMode);<br>
+ if (m_bHD && !skipRecursion && m_param->rdLevel == 2 && md.fencYuv.m_size != MAX_CU_SIZE)<br>
+ skipRecursion = complexityCheckCU(*md.bestMode);<br>
+ }<br>
+ else if (cuGeom.log2CUSize >= MAX_LOG2_CU_SIZE - 1 && m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
skipRecursion = complexityCheckCU(*md.bestMode);<br>
+ }<br>
+ else if (m_param->bEnableRecursionSkip > EDGE_BASED_RSKIP)<br>
+ skipRecursion = true;</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
}<br>
}<br>
if (m_param->bAnalysisType == AVC_INFO && md.bestMode && cuGeom.numPartitions <= 16 && m_param->analysisLoadReuseLevel == 7)<br>
@@ -2015,8 +2024,12 @@<br>
checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);<br>
checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);<br>
<br>
- if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)<br>
+ if (m_param->bEnableRecursionSkip == RDCOST_BASED_RSKIP && depth && m_modeDepth[depth - 1].bestMode)<br>
skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);<br>
+ else if (cuGeom.log2CUSize >= MAX_LOG2_CU_SIZE - 1 && m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ skipRecursion = md.bestMode && complexityCheckCU(*md.bestMode);<br>
+ else if (m_param->bEnableRecursionSkip > EDGE_BASED_RSKIP)<br>
+ skipRecursion = true;<br>
}<br>
if (m_param->bAnalysisType == AVC_INFO && md.bestMode && cuGeom.numPartitions <= 16 && m_param->analysisLoadReuseLevel == 7)<br>
skipRecursion = true;<br>
@@ -3525,26 +3538,47 @@<br>
<br>
bool Analysis::complexityCheckCU(const Mode& bestMode)<br>
{<br>
- uint32_t mean = 0;<br>
- uint32_t homo = 0;<br>
- uint32_t cuSize = bestMode.fencYuv->m_size;<br>
- for (uint32_t y = 0; y < cuSize; y++) {<br>
- for (uint32_t x = 0; x < cuSize; x++) {<br>
- mean += (bestMode.fencYuv->m_buf[0][y * cuSize + x]);<br>
+ if (m_param->bEnableRecursionSkip == RDCOST_BASED_RSKIP)<br></blockquote><div>[KS] bEnableRecursionSkip == RDCOST/EDGE_BASED_RSKIP is checked twice unnecessarily (it is already checked once before the complexityCheckCU call). Can you optimize this code?</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+ {<br>
+ uint32_t mean = 0;<br>
+ uint32_t homo = 0;<br>
+ uint32_t cuSize = bestMode.fencYuv->m_size;<br>
+ for (uint32_t y = 0; y < cuSize; y++) {<br>
+ for (uint32_t x = 0; x < cuSize; x++) {<br>
+ mean += (bestMode.fencYuv->m_buf[0][y * cuSize + x]);<br>
+ }<br>
}<br>
+ mean = mean / (cuSize * cuSize);<br>
+ for (uint32_t y = 0; y < cuSize; y++) {<br>
+ for (uint32_t x = 0; x < cuSize; x++) {<br>
+ homo += abs(int(bestMode.fencYuv->m_buf[0][y * cuSize + x] - mean));<br>
+ }<br>
+ }<br>
+ homo = homo / (cuSize * cuSize);<br>
+<br>
+ if (homo < (.1 * mean))<br>
+ return true;<br>
+<br>
+ return false;<br>
}<br>
- mean = mean / (cuSize * cuSize);<br>
- for (uint32_t y = 0 ; y < cuSize; y++){<br>
- for (uint32_t x = 0 ; x < cuSize; x++){<br>
- homo += abs(int(bestMode.fencYuv->m_buf[0][y * cuSize + x] - mean));<br>
- }<br>
+ else if (m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br></blockquote><div>[KS] Same here.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+ {<br>
+ int blockType = bestMode.cu.m_log2CUSize[0] - 2;<br>
+ int shift = bestMode.cu.m_log2CUSize[0] * 2;<br>
+ intptr_t stride = m_frame->m_fencPic->m_stride;<br>
+ intptr_t blockOffsetLuma = bestMode.cu.m_cuPelX + bestMode.cu.m_cuPelY * stride;<br>
+ uint64_t sum_ss = <a href="http://primitives.cu" rel="noreferrer" target="_blank">primitives.cu</a>[blockType].var(m_frame->m_edgeBitPic + blockOffsetLuma, stride);<br>
+ uint32_t sum = (uint32_t)sum_ss;<br>
+ uint32_t ss = (uint32_t)(sum_ss >> 32);<br>
+ uint32_t pixelCount = 1 << shift;<br>
+ double cuEdgeVariance = (ss - ((double)sum * sum / pixelCount)) / pixelCount;<br>
+ if (cuEdgeVariance > (double)m_param->edgeThreshold)<br>
+ return false;<br>
+ else<br>
+ return true;<br>
+ }<br></blockquote><div>[KS] Earlier, when I asked about combining edgeRecursion with complexityCheck, you said that homogeneity and variance are two different metrics that do not serve the same purpose, and so you did not combine them. I would like to understand the reasoning behind combining them in this patch.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
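</blockquote><div>For reference in the homogeneity vs. variance discussion, here is a minimal standalone sketch of the arithmetic in the edge branch above. It assumes the bit plane holds only 0/1 values (as computeEdge produces when whitePixel is 1) and that the var primitive packs the pixel sum in the low 32 bits and the sum of squares in the high 32 bits, as the unpacking in the patch implies. packedVar and edgeVariance are hypothetical helper names, not x265 functions. Under these assumptions the result equals p * (1 - p), where p is the fraction of edge pixels in the CU, so the value compared against edgeThreshold is a variance rather than the edge density itself.</div>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for primitives.cu[blockType].var(): returns the pixel
// sum in the low 32 bits and the sum of squares in the high 32 bits.
static uint64_t packedVar(const std::vector<uint8_t>& block)
{
    uint32_t sum = 0, ss = 0;
    for (uint8_t v : block)
    {
        sum += v;
        ss += static_cast<uint32_t>(v) * v;
    }
    return sum + (static_cast<uint64_t>(ss) << 32);
}

// Mirrors the arithmetic of the edge branch in complexityCheckCU().
// The block must be non-empty.
static double edgeVariance(const std::vector<uint8_t>& block)
{
    uint64_t sum_ss = packedVar(block);
    uint32_t sum = static_cast<uint32_t>(sum_ss);
    uint32_t ss = static_cast<uint32_t>(sum_ss >> 32);
    double pixelCount = static_cast<double>(block.size());
    return (ss - (static_cast<double>(sum) * sum / pixelCount)) / pixelCount;
}
```

<div>With this formula, a CU with no edge pixels and a CU made up entirely of edge pixels both give a variance of 0, so both would fall below any positive threshold and skip recursion.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">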
- homo = homo / (cuSize * cuSize);<br>
-<br>
- if (homo < (.1 * mean))<br>
- return true;<br>
-<br>
- return false;<br>
+ else<br>
+ return false;<br></blockquote><div>[KS] When does the encoder hit this final "else"?</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
}<br>
<br>
uint32_t Analysis::calculateCUVariance(const CUData& ctu, const CUGeom& cuGeom)<br>
@@ -3570,7 +3604,6 @@<br>
cnt++;<br>
}<br>
}<br>
- <br>
return cuVariance / cnt;<br>
}<br>
<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/analysis.h<br>
--- a/source/encoder/analysis.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/analysis.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -52,7 +52,7 @@<br>
splitRefs = 0;<br>
mvCost[0] = 0; // L0<br>
mvCost[1] = 0; // L1<br>
- sa8dCost = 0;<br>
+ sa8dCost = 0;<br>
}<br>
};<br>
<br>
@@ -120,7 +120,6 @@<br>
<br>
Mode& compressCTU(CUData& ctu, Frame& frame, const CUGeom& cuGeom, const Entropy& initialContext);<br>
int32_t loadTUDepth(CUGeom cuGeom, CUData parentCTU);<br>
-<br>
protected:<br>
/* Analysis data for save/load mode, writes/reads data based on absPartIdx */<br>
x265_analysis_inter_data* m_reuseInterDataCTU;<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/encoder.cpp<br>
--- a/source/encoder/encoder.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/encoder.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -1343,9 +1343,9 @@<br>
int32_t numBytes = m_param->sourceBitDepth > 8 ? 2 : 1;<br>
memset(m_edgePic, 0, bufSize * numBytes);<br>
<br>
- if (!computeEdge(m_edgePic, src, NULL, pic->width, pic->height, pic->width, false))<br>
- {<br>
- x265_log(m_param, X265_LOG_ERROR, "Failed edge computation!");<br>
+ if (!computeEdge(m_edgePic, src, NULL, pic->width, pic->height, pic->width, false, 1))<br>
+ {<br>
+ x265_log(m_param, X265_LOG_ERROR, "Failed to compute edge!");<br>
return false;<br>
}<br>
<br>
@@ -1660,6 +1660,12 @@<br>
}<br>
}<br>
}<br>
+ if (m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP && m_param->bHistBasedSceneCut)<br>
+ {<br>
+ pixel* src = m_edgePic;<br>
+ primitives.planecopy_pp_shr(src, inFrame->m_fencPic->m_picWidth, inFrame->m_edgeBitPic, inFrame->m_fencPic->m_stride,<br>
+ inFrame->m_fencPic->m_picWidth, inFrame->m_fencPic->m_picHeight, 0);<br>
+ }<br>
}<br>
else<br>
{<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/frameencoder.cpp<br>
--- a/source/encoder/frameencoder.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/frameencoder.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -130,7 +130,7 @@<br>
{<br>
rowSum += sliceGroupSizeAccu;<br>
m_sliceBaseRow[++sidx] = i;<br>
- } <br>
+ }<br>
}<br>
X265_CHECK(sidx < m_param->maxSlices, "sliceID check failed!");<br>
m_sliceBaseRow[0] = 0;<br>
@@ -268,6 +268,19 @@<br>
curFrame->m_encData->m_jobProvider = this;<br>
curFrame->m_encData->m_slice->m_mref = m_mref;<br>
<br>
+ if (!m_param->bHistBasedSceneCut && m_param->rc.aqMode != X265_AQ_EDGE && m_param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
+ int height = curFrame->m_fencPic->m_picHeight;<br>
+ int width = curFrame->m_fencPic->m_picWidth;<br>
+ intptr_t stride = curFrame->m_fencPic->m_stride;<br>
+<br>
+ if (!computeEdge(curFrame->m_edgeBitPic, curFrame->m_fencPic->m_picOrg[0], NULL, stride, height, width, false, 1))<br>
+ {<br>
+ x265_log(m_param, X265_LOG_ERROR, " Failed to compute edge !");<br>
+ return false;<br>
+ }<br>
+ }<br>
+<br>
if (!m_cuGeoms)<br>
{<br>
if (!initializeGeoms())<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/slicetype.cpp<br>
--- a/source/encoder/slicetype.cpp Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/slicetype.cpp Wed Jan 29 12:19:07 2020 +0530<br>
@@ -87,7 +87,7 @@<br>
<br>
namespace X265_NS {<br>
<br>
-bool computeEdge(pixel *edgePic, pixel *refPic, pixel *edgeTheta, intptr_t stride, int height, int width, bool bcalcTheta)<br>
+bool computeEdge(pixel* edgePic, pixel* refPic, pixel* edgeTheta, intptr_t stride, int height, int width, bool bcalcTheta, pixel whitePixel)<br>
{<br>
intptr_t rowOne = 0, rowTwo = 0, rowThree = 0, colOne = 0, colTwo = 0, colThree = 0;<br>
intptr_t middle = 0, topLeft = 0, topRight = 0, bottomLeft = 0, bottomRight = 0;<br>
@@ -141,7 +141,7 @@<br>
theta = 180 + theta;<br>
edgeTheta[middle] = (pixel)theta;<br>
}<br>
- edgePic[middle] = (pixel)(gradientMagnitude >= edgeThreshold ? edgeThreshold : blackPixel);<br>
+ edgePic[middle] = (pixel)(gradientMagnitude >= EDGE_THRESHOLD ? whitePixel : blackPixel);<br>
}<br>
}<br>
return true;<br>
@@ -519,6 +519,13 @@<br>
if (param->rc.aqMode == X265_AQ_EDGE)<br>
edgeFilter(curFrame, param);<br>
<br>
+ if (param->rc.aqMode == X265_AQ_EDGE && !param->bHistBasedSceneCut && param->bEnableRecursionSkip >= EDGE_BASED_RSKIP)<br>
+ {<br>
+ pixel* src = curFrame->m_edgePic + curFrame->m_fencPic->m_lumaMarginY * curFrame->m_fencPic->m_stride + curFrame->m_fencPic->m_lumaMarginX;<br>
+ primitives.planecopy_pp_shr(src, curFrame->m_fencPic->m_stride, curFrame->m_edgeBitPic,<br>
+ curFrame->m_fencPic->m_stride, curFrame->m_fencPic->m_picWidth, curFrame->m_fencPic->m_picHeight, SHIFT_TO_BITPLANE);<br>
+ }<br>
+<br>
if (param->rc.aqMode == X265_AQ_AUTO_VARIANCE || param->rc.aqMode == X265_AQ_AUTO_VARIANCE_BIASED || param->rc.aqMode == X265_AQ_EDGE)<br>
{<br>
double bit_depth_correction = 1.f / (1 << (2 * (X265_DEPTH - 8)));<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/encoder/slicetype.h<br>
--- a/source/encoder/slicetype.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/encoder/slicetype.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -44,9 +44,9 @@<br>
#define EDGE_INCLINATION 45<br>
<br>
#if HIGH_BIT_DEPTH<br>
-#define edgeThreshold 1023.0<br>
+#define EDGE_THRESHOLD 1023.0<br>
#else<br>
-#define edgeThreshold 255.0<br>
+#define EDGE_THRESHOLD 255.0<br>
#endif<br>
#define PI 3.14159265<br>
<br>
@@ -101,7 +101,7 @@<br>
protected:<br>
<br>
uint32_t acEnergyCu(Frame* curFrame, uint32_t blockX, uint32_t blockY, int csp, uint32_t qgSize);<br>
- uint32_t edgeDensityCu(Frame*curFrame, uint32_t &avgAngle, uint32_t blockX, uint32_t blockY, uint32_t qgSize);<br>
+ uint32_t edgeDensityCu(Frame* curFrame, uint32_t &avgAngle, uint32_t blockX, uint32_t blockY, uint32_t qgSize);<br>
uint32_t lumaSumCu(Frame* curFrame, uint32_t blockX, uint32_t blockY, uint32_t qgSize);<br>
uint32_t weightCostLuma(Lowres& fenc, Lowres& ref, WeightParam& wp);<br>
bool allocWeightedRef(Lowres& fenc);<br>
@@ -265,7 +265,6 @@<br>
CostEstimateGroup& operator=(const CostEstimateGroup&);<br>
};<br>
<br>
-bool computeEdge(pixel *edgePic, pixel *refPic, pixel *edgeTheta, intptr_t stride, int height, int width, bool bcalcTheta);<br>
-<br>
+bool computeEdge(pixel* edgePic, pixel* refPic, pixel* edgeTheta, intptr_t stride, int height, int width, bool bcalcTheta, pixel whitePixel = EDGE_THRESHOLD);<br>
}<br>
#endif // ifndef X265_SLICETYPE_H<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/test/regression-tests.txt<br>
--- a/source/test/regression-tests.txt Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/test/regression-tests.txt Wed Jan 29 12:19:07 2020 +0530<br>
@@ -162,7 +162,11 @@<br>
sintel_trailer_2k_1920x1080_24.yuv, --preset medium --hist-scenecut --hist-threshold 0.02 --frame-dup --dup-threshold 60 --hrd --bitrate 10000 --vbv-bufsize 15000 --vbv-maxrate 12000<br>
sintel_trailer_2k_1920x1080_24.yuv, --preset medium --hist-scenecut --hist-threshold 0.02<br>
sintel_trailer_2k_1920x1080_24.yuv, --preset ultrafast --hist-scenecut --hist-threshold 0.02<br>
-<br>
+crowd_run_1080p50.yuv, --preset faster --ctu 32 --rskip 2 --edge-threshold 5<br>
+crowd_run_1080p50.yuv, --preset fast --ctu 64 --rskip 2 --edge-threshold 5 --aq-mode 4<br>
+crowd_run_1080p50.yuv, --preset slow --ctu 32 --rskip 2 --edge-threshold 5 --hist-scenecut --hist-threshold 0.1<br>
+crowd_run_1080p50.yuv, --preset slower --ctu 16 --rskip 2 --edge-threshold 5 --hist-scenecut --hist-threshold 0.1 --aq-mode 4<br>
+ <br>
# Main12 intraCost overflow bug test<br>
720p50_parkrun_ter.y4m,--preset medium<br>
<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/x265.h<br>
--- a/source/x265.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/x265.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -1255,9 +1255,9 @@<br>
* skip blocks. Default is disabled */<br>
int bEnableEarlySkip;<br>
<br>
- /* Enable early CU size decisions to avoid recursing to higher depths. <br>
+ /* Enable early CU size decisions to avoid recursing to higher depths.<br>
* Default is enabled */<br>
- int bEnableRecursionSkip;<br>
+ int bEnableRecursionSkip;<br>
<br>
/* Use a faster search method to find the best intra mode. Default is 0 */<br>
int bEnableFastIntra;<br>
@@ -1857,7 +1857,7 @@<br>
double edgeTransitionThreshold;<br>
<br>
/* Enables histogram based scenecut detection algorithm to detect scenecuts. Default disabled */<br>
- int bHistBasedSceneCut;<br>
+ int bHistBasedSceneCut;<br>
<br>
/* Enable HME search ranges for L0, L1 and L2 respectively. */<br>
int hmeRange[3];<br>
@@ -1874,7 +1874,7 @@<br>
* analysis information stored in analysis-save. Higher the refine level higher<br>
* the information stored. Default is 5 */<br>
int analysisSaveReuseLevel;<br>
- <br>
+<br>
/* A value between 1 and 10 (both inclusive) determines the level of<br>
* analysis information reused in analysis-load. Higher the refine level higher<br>
* the information reused. Default is 5 */<br>
@@ -1901,6 +1901,9 @@<br>
* info is available from the corresponding analysis-save. */<br>
<br>
int confWinBottomOffset;<br>
+<br>
+ /* Edge variance threshold for quad tree establishment. */<br>
+ float edgeThreshold;<br>
} x265_param;<br>
<br>
/* x265_param_alloc:<br>
diff -r fdbd4e4a2aff -r e9c8c0089bdd source/x265cli.h<br>
--- a/source/x265cli.h Sat Jan 25 18:08:03 2020 +0530<br>
+++ b/source/x265cli.h Wed Jan 29 12:19:07 2020 +0530<br>
@@ -105,8 +105,8 @@<br>
{ "amp", no_argument, NULL, 0 },<br>
{ "no-early-skip", no_argument, NULL, 0 },<br>
{ "early-skip", no_argument, NULL, 0 },<br>
- { "no-rskip", no_argument, NULL, 0 },<br>
- { "rskip", no_argument, NULL, 0 },<br>
+ { "rskip", required_argument, NULL, 0 },<br>
+ { "edge-threshold", required_argument, NULL, 0 },<br>
{ "no-fast-cbf", no_argument, NULL, 0 },<br>
{ "fast-cbf", no_argument, NULL, 0 },<br>
{ "no-tskip", no_argument, NULL, 0 },<br>
@@ -457,7 +457,9 @@<br>
H0(" --[no-]ssim-rd Enable ssim rate distortion optimization, 0 to disable. Default %s\n", OPT(param->bSsimRd));<br>
H0(" --[no-]rd-refine Enable QP based RD refinement for rd levels 5 and 6. Default %s\n", OPT(param->bEnableRdRefine));<br>
H0(" --[no-]early-skip Enable early SKIP detection. Default %s\n", OPT(param->bEnableEarlySkip));<br>
- H0(" --[no-]rskip Enable early exit from recursion. Default %s\n", OPT(param->bEnableRecursionSkip));<br>
+ H0(" --rskip <mode> Set mode for early exit from recursion. Mode 1: exit using rdcost. Mode 2: exit using edge density. Mode 3: exit using edge density with forceful skip for small sized CU's."<br>
+ " Mode 0: disabled. Default %s\n", OPT(param->bEnableRecursionSkip));<br>
+ H1(" --edge-threshold Threshold in terms of percentage for minimum edge density in CUs to terminate the recursion depth. Applicable only for rskip modes 2 and 3. Default %s\n", OPT(param->edgeThreshold));<br>
H1(" --[no-]tskip-fast Enable fast intra transform skipping. Default %s\n", OPT(param->bEnableTSkipFast));<br>
H1(" --[no-]splitrd-skip Enable skipping split RD analysis when sum of split CU rdCost larger than one split CU rdCost for Intra CU. Default %s\n", OPT(param->bEnableSplitRdSkip));<br>
H1(" --nr-intra <integer> An integer value in range of 0 to 2000, which denotes strength of noise reduction in intra CUs. Default 0\n");<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><span style="color:rgb(0,0,0)">Regards,<br>Kavitha</span></div></div></div></div></div>