<div dir="ltr">Using "else qp = m_qp[0][0]" will not work, since you want the QP from the last allowed depth. E.g., at an 8x8 CU, you want the QP from the top-level 16x16 CU, not m_qp[0][0], which is the CTU-averaged QP.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 6, 2015 at 10:05 AM, Deepthi Nandakumar <span dir="ltr"><<a href="mailto:deepthi@multicorewareinc.com" target="_blank">deepthi@multicorewareinc.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>The initial version of this patch from Gopu had dqp-depth. I tried explaining in English how to set it w.r.t. max/minCU size, then decided qgSize is easier to understand. <br></div>I'm not religious about it; we can always change it back. <br><br></div>About partIdx, I meant to check with you/Ashok. Some combination of cuGeom.absPartIdx and depth should be sufficient, but it wasn't working out. Let me take another crack at it; even qp can be avoided in that case. <br><br></div>For some reason, quant QP and search QP are configured separately (Quant::setQPforQuant and Search::setQP).<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 6, 2015 at 12:18 AM, Steve Borho <span dir="ltr"><<a href="mailto:steve@borho.org" target="_blank">steve@borho.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On 04/05, <a href="mailto:deepthi@multicorewareinc.com" target="_blank">deepthi@multicorewareinc.com</a> wrote:<br>
> # HG changeset patch<br>
> # User Deepthi Nandakumar <<a href="mailto:deepthi@multicorewareinc.com" target="_blank">deepthi@multicorewareinc.com</a>><br>
> # Date 1427100822 -19800<br>
> #      Mon Mar 23 14:23:42 2015 +0530<br>
> # Node ID d6e059bd8a9cd0cb9aad7444b1a141a59ac01193<br>
> # Parent 335c728bbd62018e1e3ed03a4df0514c213e9a4e<br>
> aq: implementation of fine-grained adaptive quantization<br>
><br>
> Currently adaptive quantization adjusts the QP values on 64x64 pixel CodingTree<br>
> units (CTUs) across a video frame. The new param option --qg-size will<br>
> enable QP to be adjusted to individual quantization groups (QGs) of size 64/32/16<br>
><br>
> diff -r 335c728bbd62 -r d6e059bd8a9c doc/reST/cli.rst<br>
> --- a/doc/reST/cli.rst    Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/doc/reST/cli.rst    Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -1111,6 +1111,13 @@<br>
><br>
>        **Range of values:** 0.0 to 3.0<br>
><br>
> +.. option:: --qg-size <64|32|16><br>
> +      Enable adaptive quantization for sub-CTUs. This parameter specifies<br>
> +      the minimum CU size at which QP can be adjusted, ie. Quantization Group<br>
> +      size. Allowed range of values are 64, 32, 16 provided this falls within<br>
> +      the inclusive range [maxCUSize, minCUSize]. Experimental.<br>
> +      Default: same as maxCUSize<br>
<br>
</span>I can't decide if this should be quant group size or quant group depth - pros and<br>
cons both ways<br>
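The two parameterizations carry the same information once the CTU size is known. A minimal C++ sketch of the conversion, assuming power-of-two sizes and the 16-pixel lower bound from the option text; the helper names qgSizeToDepth and qgDepthToSize are hypothetical and not part of x265:<br>

```cpp
#include <cassert>
#include <cstdint>

// Smallest quantization group size allowed by the option text above.
const uint32_t kMinQGSize = 16;

// Integer log2 for power-of-two sizes (hypothetical helper).
int log2Size(uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    int n = 0;
    while (size > 1)
    {
        size >>= 1;
        ++n;
    }
    return n;
}

// Returns the quantization group depth for qgSize, or -1 when qgSize is
// not a power of two or lies outside [max(minCUSize, 16), maxCUSize].
int qgSizeToDepth(uint32_t maxCUSize, uint32_t minCUSize, uint32_t qgSize)
{
    if (qgSize == 0 || (qgSize & (qgSize - 1)) != 0)
        return -1;
    if (qgSize > maxCUSize || qgSize < minCUSize || qgSize < kMinQGSize)
        return -1;
    return log2Size(maxCUSize) - log2Size(qgSize);
}

// Inverse mapping: depth 0 is the CTU itself, each level halves the size.
uint32_t qgDepthToSize(uint32_t maxCUSize, int depth)
{
    assert(depth >= 0);
    return maxCUSize >> depth;
}
```

With a 64x64 CTU, sizes 64/32/16 map to depths 0/1/2. With a 32x32 CTU, size 16 maps to depth 1, which illustrates why a depth-based default would not need to change with the CTU size.<br>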
<span><br>
>Â .. option:: --cutree, --no-cutree<br>
><br>
>Â Â Â Â Enable the use of lookahead's lowres motion vector fields to<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/common/cudata.cpp<br>
> --- a/source/common/cudata.cpp    Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/common/cudata.cpp    Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -298,7 +298,7 @@<br>
>Â }<br>
><br>
>Â // initialize Sub partition<br>
> -void CUData::initSubCU(const CUData& ctu, const CUGeom& cuGeom)<br>
> +void CUData::initSubCU(const CUData& ctu, const CUGeom& cuGeom, int qp)<br>
>Â {<br>
>Â Â Â m_absIdxInCTUÂ Â = cuGeom.absPartIdx;<br>
>   m_encData    = ctu.m_encData;<br>
> @@ -312,8 +312,8 @@<br>
>   m_cuAboveRight = ctu.m_cuAboveRight;<br>
>      X265_CHECK(m_numPartitions == cuGeom.numPartitions, "initSubCU() size mismatch\n");<br>
><br>
> -    /* sequential memsets */<br>
> -    m_partSet((uint8_t*)m_qp, (uint8_t)ctu.m_qp[0]);<br>
> +    m_partSet((uint8_t*)m_qp, (uint8_t)qp);<br>
<br>
</span>longer term, this could probably be simplified. if all CU modes are<br>
evaluated at the same QP, there's no point in setting this value in each<br>
sub-CU. we could derive the CTU's final m_qp[] based on the depth at<br>
each coded CU at the end of analysis; and avoid all these memsets<br>
<span><br>
>   m_partSet(m_log2CUSize,  (uint8_t)cuGeom.log2CUSize);<br>
>Â Â Â m_partSet(m_lumaIntraDir, (uint8_t)DC_IDX);<br>
>   m_partSet(m_tqBypass,   (uint8_t)m_encData->m_param->bLossless);<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/common/cudata.h<br>
> --- a/source/common/cudata.h Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/common/cudata.h Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -182,7 +182,7 @@<br>
>Â Â Â static void calcCTUGeoms(uint32_t ctuWidth, uint32_t ctuHeight, uint32_t maxCUSize, uint32_t minCUSize, CUGeom cuDataArray[CUGeom::MAX_GEOMS]);<br>
><br>
>   void   initCTU(const Frame& frame, uint32_t cuAddr, int qp);<br>
> -  void   initSubCU(const CUData& ctu, const CUGeom& cuGeom);<br>
> +  void   initSubCU(const CUData& ctu, const CUGeom& cuGeom, int qp);<br>
>   void   initLosslessCU(const CUData& cu, const CUGeom& cuGeom);<br>
><br>
>   void   copyPartFrom(const CUData& cu, const CUGeom& childGeom, uint32_t subPartIdx);<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/common/param.cpp<br>
> --- a/source/common/param.cpp Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/common/param.cpp Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -209,6 +209,7 @@<br>
>      param->rc.zones = NULL;<br>
>      param->rc.bEnableSlowFirstPass = 0;<br>
>      param->rc.bStrictCbr = 0;<br>
> +    param->rc.QGSize = 64; /* Same as maxCUSize */<br>
<br>
</span>if this was quantGroupDepth we could configure it as 1 and not care<br>
about preset or CTU size<br>
<div><div><br>
>Â Â Â /* Video Usability Information (VUI) */<br>
>Â Â Â param->vui.aspectRatioIdc = 0;<br>
> @@ -263,6 +264,7 @@<br>
>Â Â Â Â Â Â Â param->rc.aqStrength = 0.0;<br>
>Â Â Â Â Â Â Â param->rc.aqMode = X265_AQ_NONE;<br>
>Â Â Â Â Â Â Â param->rc.cuTree = 0;<br>
> +Â Â Â Â Â Â param->rc.QGSize = 32;<br>
>Â Â Â Â Â Â Â param->bEnableFastIntra = 1;<br>
>Â Â Â Â Â }<br>
>Â Â Â Â Â else if (!strcmp(preset, "superfast"))<br>
> @@ -279,6 +281,7 @@<br>
>Â Â Â Â Â Â Â param->rc.aqStrength = 0.0;<br>
>Â Â Â Â Â Â Â param->rc.aqMode = X265_AQ_NONE;<br>
>Â Â Â Â Â Â Â param->rc.cuTree = 0;<br>
> +Â Â Â Â Â Â param->rc.QGSize = 32;<br>
>Â Â Â Â Â Â Â param->bEnableSAO = 0;<br>
>Â Â Â Â Â Â Â param->bEnableFastIntra = 1;<br>
>Â Â Â Â Â }<br>
> @@ -292,6 +295,7 @@<br>
>Â Â Â Â Â Â Â param->rdLevel = 2;<br>
>Â Â Â Â Â Â Â param->maxNumReferences = 1;<br>
>Â Â Â Â Â Â Â param->rc.cuTree = 0;<br>
> +Â Â Â Â Â Â param->rc.QGSize = 32;<br>
>Â Â Â Â Â Â Â param->bEnableFastIntra = 1;<br>
>Â Â Â Â Â }<br>
>Â Â Â Â Â else if (!strcmp(preset, "faster"))<br>
> @@ -843,6 +847,7 @@<br>
>Â Â Â OPT2("pools", "numa-pools") p->numaPools = strdup(value);<br>
>Â Â Â OPT("lambda-file") p->rc.lambdaFileName = strdup(value);<br>
>Â Â Â OPT("analysis-file") p->analysisFileName = strdup(value);<br>
> +Â Â OPT("qg-size") p->rc.QGSize = atoi(value);<br>
>Â Â Â else<br>
>Â Â Â Â Â return X265_PARAM_BAD_NAME;<br>
>Â #undef OPT<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/encoder/analysis.cpp<br>
> --- a/source/encoder/analysis.cpp   Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/encoder/analysis.cpp   Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -75,6 +75,8 @@<br>
>      m_reuseInterDataCTU = NULL;<br>
>      m_reuseRef = NULL;<br>
>      m_reuseBestMergeCand = NULL;<br>
> +    for (int i = 0; i < NUM_CU_DEPTH; i++)<br>
> +        m_qp[i] = NULL;<br>
>  }<br>
><br>
>Â bool Analysis::create(ThreadLocalData *tld)<br>
> @@ -101,6 +103,7 @@<br>
>              ok &= md.pred[j].reconYuv.create(cuSize, csp);<br>
>              md.pred[j].fencYuv = &md.fencYuv;<br>
>          }<br>
> +        m_qp[depth] = X265_MALLOC(int, 1i64 << (depth << 1));<br>
<br>
</div></div>this should be a checked malloc<br>
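A minimal sketch of what a checked allocation of the per-depth QP tables could look like, written in plain C++ rather than with the encoder's own allocation macros. The QpTables type and the kNumCUDepth value are assumptions made for illustration. The table at depth d holds 4^d entries, and the count is built from size_t(1) because the i64 literal suffix is specific to Microsoft compilers:<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Assumed number of CU depths (64x64 down to 8x8).
const int kNumCUDepth = 4;

// Number of CUs at a given depth of the quadtree: 4^depth.
size_t entriesAtDepth(int depth)
{
    assert(depth >= 0 && depth < kNumCUDepth);
    return (size_t)1 << (2 * depth);
}

// Hypothetical holder for one QP table per depth.
struct QpTables
{
    int* qp[kNumCUDepth];

    QpTables()
    {
        for (int d = 0; d < kNumCUDepth; d++)
            qp[d] = NULL;
    }

    // Allocates every table; on any failure releases what was
    // allocated so far and reports the error to the caller.
    bool create()
    {
        for (int d = 0; d < kNumCUDepth; d++)
        {
            qp[d] = (int*)malloc(entriesAtDepth(d) * sizeof(int));
            if (!qp[d])
            {
                destroy();
                return false;
            }
        }
        return true;
    }

    // Safe to call repeatedly; free(NULL) is a no-op.
    void destroy()
    {
        for (int d = 0; d < kNumCUDepth; d++)
        {
            free(qp[d]);
            qp[d] = NULL;
        }
    }
};
```

The return value of create() would then be folded into the caller's overall success flag instead of being ignored.<br>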
<span><br>
>Â Â Â }<br>
><br>
>Â Â Â return ok;<br>
> @@ -118,6 +121,7 @@<br>
>Â Â Â Â Â Â Â m_modeDepth[i].pred[j].predYuv.destroy();<br>
>Â Â Â Â Â Â Â m_modeDepth[i].pred[j].reconYuv.destroy();<br>
>Â Â Â Â Â }<br>
> +Â Â Â Â X265_FREE(m_qp[i]);<br>
>Â Â Â }<br>
>Â }<br>
><br>
> @@ -132,6 +136,34 @@<br>
>Â Â Â Â Â Â Â m_modeDepth[i].pred[j].invalidate();<br>
>Â #endif<br>
>Â Â Â invalidateContexts(0);<br>
> +    if (m_slice->m_pps->bUseDQP)<br>
> +    {<br>
> +        CUGeom *curCUGeom = (CUGeom *)&cuGeom;<br>
> +        CUGeom *parentGeom = (CUGeom *)&cuGeom;<br>
<br>
</span>these should probably be kept const<br>
<span><br>
> +<br>
> +        m_qp[0][0] = calculateQpforCuSize(ctu, *curCUGeom);<br>
> +        curCUGeom = curCUGeom + curCUGeom->childOffset;<br>
> +        parentGeom = curCUGeom;<br>
> +        if (m_slice->m_pps->maxCuDQPDepth >= 1)<br>
> +        {<br>
> +            for (int i = 0; i < 4; i++)<br>
> +            {<br>
> +                m_qp[1][i] = calculateQpforCuSize(ctu, *(parentGeom + i));<br>
> +                if (m_slice->m_pps->maxCuDQPDepth == 2)<br>
> +                {<br>
> +                    curCUGeom = parentGeom + i + (parentGeom + i)->childOffset;<br>
> +                    for (int j = 0; j < 4; j++)<br>
> +                        m_qp[2][i * 4 + j] = calculateQpforCuSize(ctu, *(curCUGeom + j));<br>
> +                }<br>
> +            }<br>
> +        }<br>
> +        this->setQP(*m_slice, m_qp[0][0]);<br>
> +        m_qp[0][0] = x265_clip3(QP_MIN, QP_MAX_SPEC, m_qp[0][0]);<br>
> +        ctu.setQPSubParts((int8_t)m_qp[0][0], 0, 0);<br>
<br>
</span>So all the QPs at every potential sub-CU are known at the start of CTU<br>
compression. Ok.<br>
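A simplified model of that up-front pass, as a sketch only: it assumes a 64x64 CTU whose adaptive-quantization result is already available as one non-negative QP per 16x16 block, and it uses a rounded average where the patch calls calculateQpforCuSize. The CtuQp type and both function names are hypothetical:<br>

```cpp
#include <cassert>

// 16x16 blocks per side of a 64x64 CTU (assumed sizes).
const int kBlocks = 4;

// QPs for every potential quantization group, indexed in z-order.
struct CtuQp
{
    int d0;      // depth 0: whole CTU
    int d1[4];   // depth 1: four 32x32 CUs
    int d2[16];  // depth 2: sixteen 16x16 CUs
};

// Rounded average over an n x n window of blocks (non-negative QPs).
int averageQp(const int blockQp[kBlocks][kBlocks], int x0, int y0, int n)
{
    int sum = 0;
    for (int y = 0; y < n; y++)
        for (int x = 0; x < n; x++)
            sum += blockQp[y0 + y][x0 + x];
    int count = n * n;
    return (sum + count / 2) / count;
}

// Fills the tables for depths 0..maxDqpDepth before any mode decision runs.
// Entries deeper than maxDqpDepth are left untouched.
void computeCtuQps(const int blockQp[kBlocks][kBlocks], int maxDqpDepth, CtuQp& out)
{
    assert(maxDqpDepth >= 0 && maxDqpDepth <= 2);
    out.d0 = averageQp(blockQp, 0, 0, kBlocks);
    if (maxDqpDepth < 1)
        return;
    for (int i = 0; i < 4; i++)
    {
        // z-order: bit 0 selects the column, bit 1 selects the row
        int qx = (i & 1) * 2;
        int qy = (i >> 1) * 2;
        out.d1[i] = averageQp(blockQp, qx, qy, 2);
        if (maxDqpDepth < 2)
            continue;
        for (int j = 0; j < 4; j++)
            out.d2[i * 4 + j] = blockQp[qy + (j >> 1)][qx + (j & 1)];
    }
}
```

The i * 4 + j index mirrors the layout the patch uses for m_qp[2], so a CU at depth 2 can be looked up without touching the geometry table again.<br>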
<div><div><br>
> +    }<br>
> +    else<br>
> +        m_qp[0][0] = m_slice->m_sliceQp;<br>
> +<br>
>Â Â Â m_quant.setQPforQuant(ctu);<br>
>Â Â Â m_rqt[0].cur.load(initialContext);<br>
>Â Â Â m_modeDepth[0].fencYuv.copyFromPicYuv(*m_frame->m_fencPic, ctu.m_cuAddr, 0);<br>
> @@ -155,7 +187,7 @@<br>
>Â Â Â uint32_t zOrder = 0;<br>
>Â Â Â if (m_slice->m_sliceType == I_SLICE)<br>
>Â Â Â {<br>
> -Â Â Â Â compressIntraCU(ctu, cuGeom, zOrder);<br>
> +Â Â Â Â compressIntraCU(ctu, cuGeom, zOrder, m_qp[0][0], 0);<br>
>Â Â Â Â Â if (m_param->analysisMode == X265_ANALYSIS_SAVE && m_frame->m_analysisData.intraData)<br>
>Â Â Â Â Â {<br>
>Â Â Â Â Â Â Â CUData *bestCU = &m_modeDepth[0].bestMode->cu;<br>
> @@ -173,18 +205,18 @@<br>
>Â Â Â Â Â Â Â * they are available for intra predictions */<br>
>Â Â Â Â Â Â Â m_modeDepth[0].fencYuv.copyToPicYuv(*m_frame->m_reconPic, ctu.m_cuAddr, 0);<br>
><br>
> -Â Â Â Â Â Â compressInterCU_rd0_4(ctu, cuGeom);<br>
> +Â Â Â Â Â Â compressInterCU_rd0_4(ctu, cuGeom, m_qp[0][0], 0);<br>
><br>
>Â Â Â Â Â Â Â /* generate residual for entire CTU at once and copy to reconPic */<br>
>Â Â Â Â Â Â Â encodeResidue(ctu, cuGeom);<br>
>Â Â Â Â Â }<br>
>Â Â Â Â Â else if (m_param->bDistributeModeAnalysis && m_param->rdLevel >= 2)<br>
> -Â Â Â Â Â Â compressInterCU_dist(ctu, cuGeom);<br>
> +Â Â Â Â Â Â compressInterCU_dist(ctu, cuGeom, m_qp[0][0], 0);<br>
>Â Â Â Â Â else if (m_param->rdLevel <= 4)<br>
> -Â Â Â Â Â Â compressInterCU_rd0_4(ctu, cuGeom);<br>
> +Â Â Â Â Â Â compressInterCU_rd0_4(ctu, cuGeom, m_qp[0][0], 0);<br>
>Â Â Â Â Â else<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â compressInterCU_rd5_6(ctu, cuGeom, zOrder);<br>
> +Â Â Â Â Â Â compressInterCU_rd5_6(ctu, cuGeom, zOrder, m_qp[0][0], 0);<br>
>Â Â Â Â Â Â Â if (m_param->analysisMode == X265_ANALYSIS_SAVE && m_frame->m_analysisData.interData)<br>
>Â Â Â Â Â Â Â {<br>
>Â Â Â Â Â Â Â Â Â CUData *bestCU = &m_modeDepth[0].bestMode->cu;<br>
> @@ -223,7 +255,7 @@<br>
>Â Â Â }<br>
>Â }<br>
><br>
> -void Analysis::compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t& zOrder)<br>
> +void Analysis::compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t& zOrder, int32_t qp, uint32_t partIdx)<br>
>Â {<br>
>Â Â Â uint32_t depth = cuGeom.depth;<br>
>Â Â Â ModeDepth& md = m_modeDepth[depth];<br>
> @@ -232,6 +264,13 @@<br>
>Â Â Â bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);<br>
>Â Â Â bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);<br>
><br>
> +    if (m_slice->m_pps->bUseDQP && depth && depth <= m_slice->m_pps->maxCuDQPDepth)<br>
> +    {<br>
> +        qp = m_qp[depth][partIdx];<br>
> +        this->setQP(*m_slice, qp);<br>
<br>
</div></div>if we configured quant QP here, we should be able to remove it<br>
everywhere else, yes?<br>
<span><br>
> +        qp = x265_clip3(QP_MIN, QP_MAX_SPEC, qp);<br>
> +    }<br>
<br>
</span>not sure I see the point of passing in qp here when all you really need<br>
is: else qp = m_qp[0][0];<br>
<br>
Also, isn't partIdx derivable from cuGeom? it would be best if we didn't<br>
add yet another indexing scheme. I still think the zOrder argument is<br>
probably unnecessary.<br>
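One way the index could be derived, as a sketch under stated assumptions: absPartIdx is taken to be a z-order index counted in 4x4 units inside a 64x64 CTU (256 partitions), so a CU's position among the CUs of its own depth is a right shift of absPartIdx. Clamping the depth to the deepest quantization group depth also gives smaller CUs the QP of their enclosing group. All names below are hypothetical:<br>

```cpp
#include <cassert>
#include <cstdint>

// Assumed: log2(64 / 4), i.e. a 64x64 CTU addressed in 4x4 units.
const uint32_t kUnitSizeDepth = 4;

// Index of a CU among all CUs of the same depth, in z-order.
uint32_t partIdxAtDepth(uint32_t absPartIdx, uint32_t depth)
{
    assert(depth <= kUnitSizeDepth);
    return absPartIdx >> (2 * (kUnitSizeDepth - depth));
}

// Where to read the QP for a CU: the depth is clamped to the deepest
// quantization group depth, so an 8x8 CU reads its 16x16 group's QP.
struct QpPos
{
    uint32_t depth;
    uint32_t index;
};

QpPos qpLookupPos(uint32_t absPartIdx, uint32_t depth, uint32_t maxDqpDepth)
{
    QpPos pos;
    pos.depth = depth < maxDqpDepth ? depth : maxDqpDepth;
    pos.index = partIdxAtDepth(absPartIdx, pos.depth);
    return pos;
}

// Cross-check against the recursive bookkeeping used in the patch
// (partIdx * 4 + subPartIdx) for every CU down to maxDepth.
bool matchesRecursion(uint32_t absPartIdx, uint32_t depth, uint32_t partIdx, uint32_t maxDepth)
{
    if (partIdxAtDepth(absPartIdx, depth) != partIdx)
        return false;
    if (depth == maxDepth)
        return true;
    uint32_t childParts = 1u << (2 * (kUnitSizeDepth - depth - 1));
    for (uint32_t sub = 0; sub < 4; sub++)
    {
        if (!matchesRecursion(absPartIdx + sub * childParts, depth + 1, partIdx * 4 + sub, maxDepth))
            return false;
    }
    return true;
}
```

Under these assumptions the shift reproduces the recursive index exactly, so neither partIdx nor qp would need to be threaded through the compress functions.<br>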
<div><div><br>
> +<br>
>Â Â Â if (m_param->analysisMode == X265_ANALYSIS_LOAD)<br>
>Â Â Â {<br>
>     uint8_t* reuseDepth = &m_reuseIntraDataCTU->depth[parentCTU.m_cuAddr * parentCTU.m_numPartitions];<br>
> @@ -241,11 +280,10 @@<br>
><br>
>Â Â Â Â Â if (mightNotSplit && depth == reuseDepth[zOrder] && zOrder == cuGeom.absPartIdx)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â m_quant.setQPforQuant(parentCTU);<br>
> -<br>
>Â Â Â Â Â Â Â PartSize size = (PartSize)reusePartSizes[zOrder];<br>
>Â Â Â Â Â Â Â Mode& mode = size == SIZE_2Nx2N ? md.pred[PRED_INTRA] : md.pred[PRED_INTRA_NxN];<br>
> -Â Â Â Â Â Â mode.cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â mode.cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +            m_quant.setQPforQuant(mode.cu);<br>
>Â Â Â Â Â Â Â checkIntra(mode, cuGeom, size, &reuseModes[zOrder], &reuseChromaModes[zOrder]);<br>
>Â Â Â Â Â Â Â checkBestMode(mode, depth);<br>
><br>
> @@ -262,15 +300,14 @@<br>
>Â Â Â }<br>
>Â Â Â else if (mightNotSplit)<br>
>Â Â Â {<br>
> -Â Â Â Â m_quant.setQPforQuant(parentCTU);<br>
> -<br>
> -Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +Â Â Â Â m_quant.setQPforQuant(md.pred[PRED_INTRA].cu);<br>
>Â Â Â Â Â checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N, NULL, NULL);<br>
>Â Â Â Â Â checkBestMode(md.pred[PRED_INTRA], depth);<br>
><br>
>Â Â Â Â Â if (cuGeom.log2CUSize == 3 && m_slice->m_sps->quadtreeTULog2MinSize < 3)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â checkIntra(md.pred[PRED_INTRA_NxN], cuGeom, SIZE_NxN, NULL, NULL);<br>
>Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_INTRA_NxN], depth);<br>
>Â Â Â Â Â }<br>
> @@ -287,7 +324,7 @@<br>
>Â Â Â Â Â Mode* splitPred = &md.pred[PRED_SPLIT];<br>
>Â Â Â Â Â splitPred->initCosts();<br>
>Â Â Â Â Â CUData* splitCU = &splitPred->cu;<br>
> -Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom, qp);<br>
><br>
>Â Â Â Â Â uint32_t nextDepth = depth + 1;<br>
>Â Â Â Â Â ModeDepth& nd = m_modeDepth[nextDepth];<br>
> @@ -301,7 +338,7 @@<br>
>Â Â Â Â Â Â Â {<br>
>Â Â Â Â Â Â Â Â Â m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);<br>
>Â Â Â Â Â Â Â Â Â m_rqt[nextDepth].cur.load(*nextContext);<br>
> -Â Â Â Â Â Â Â Â compressIntraCU(parentCTU, childGeom, zOrder);<br>
> +Â Â Â Â Â Â Â Â compressIntraCU(parentCTU, childGeom, zOrder, qp, partIdx * 4 + subPartIdx);<br>
><br>
>Â Â Â Â Â Â Â Â Â // Save best CU and pred data for this sub CU<br>
>Â Â Â Â Â Â Â Â Â splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);<br>
> @@ -490,7 +527,7 @@<br>
>Â Â Â while (task >= 0);<br>
>Â }<br>
><br>
> -void Analysis::compressInterCU_dist(const CUData& parentCTU, const CUGeom& cuGeom)<br>
> +void Analysis::compressInterCU_dist(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp, uint32_t partIdx)<br>
>Â {<br>
>Â Â Â uint32_t depth = cuGeom.depth;<br>
>Â Â Â uint32_t cuAddr = parentCTU.m_cuAddr;<br>
> @@ -503,6 +540,13 @@<br>
><br>
>Â Â Â X265_CHECK(m_param->rdLevel >= 2, "compressInterCU_dist does not support RD 0 or 1\n");<br>
><br>
> +Â Â if (m_slice->m_pps->bUseDQP && depth && depth <= m_slice->m_pps->maxCuDQPDepth)<br>
> +Â Â {<br>
> +Â Â Â Â qp = m_qp[depth][partIdx];<br>
> +Â Â Â Â this->setQP(*m_slice, qp);<br>
> +Â Â Â Â qp = x265_clip3(QP_MIN, QP_MAX_SPEC, qp);<br>
> +Â Â }<br>
> +<br>
>Â Â Â if (mightNotSplit && depth >= minDepth)<br>
>Â Â Â {<br>
>Â Â Â Â Â int bTryAmp = m_slice->m_sps->maxAMPDepth > depth && (cuGeom.log2CUSize < 6 || m_param->rdLevel > 4);<br>
> @@ -511,28 +555,28 @@<br>
>Â Â Â Â Â PMODE pmode(*this, cuGeom);<br>
><br>
>Â Â Â Â Â /* Initialize all prediction CUs based on parentCTU */<br>
> -Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);<br>
> -Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â if (bTryIntra)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â if (cuGeom.log2CUSize == 3 && m_slice->m_sps->quadtreeTULog2MinSize < 3 && m_param->rdLevel >= 5)<br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â pmode.modes[pmode.m_jobTotal++] = PRED_INTRA;<br>
>Â Â Â Â Â }<br>
> -Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_2Nx2N;<br>
> -Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_2Nx2N;<br>
> +Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â if (m_param->bEnableRectInter)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_2NxN;<br>
> -Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_Nx2N;<br>
> +Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_2NxN;<br>
> +Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_Nx2N;<br>
>Â Â Â Â Â }<br>
>Â Â Â Â Â if (bTryAmp)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_2NxnU;<br>
> -Â Â Â Â Â Â md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_2NxnD;<br>
> -Â Â Â Â Â Â md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_nLx2N;<br>
> -Â Â Â Â Â Â md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom); pmode.modes[pmode.m_jobTotal++] = PRED_nRx2N;<br>
> +Â Â Â Â Â Â md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_2NxnU;<br>
> +Â Â Â Â Â Â md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_2NxnD;<br>
> +Â Â Â Â Â Â md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_nLx2N;<br>
> +Â Â Â Â Â Â md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp); pmode.modes[pmode.m_jobTotal++] = PRED_nRx2N;<br>
>Â Â Â Â Â }<br>
><br>
>Â Â Â Â Â pmode.tryBondPeers(*m_frame->m_encData->m_jobProvider, pmode.m_jobTotal);<br>
> @@ -662,7 +706,7 @@<br>
><br>
>Â Â Â Â Â if (md.bestMode->rdCost == MAX_INT64 && !bTryIntra)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â checkIntraInInter(md.pred[PRED_INTRA], cuGeom);<br>
>Â Â Â Â Â Â Â encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);<br>
>Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_INTRA], depth);<br>
> @@ -688,7 +732,7 @@<br>
>Â Â Â Â Â Mode* splitPred = &md.pred[PRED_SPLIT];<br>
>Â Â Â Â Â splitPred->initCosts();<br>
>Â Â Â Â Â CUData* splitCU = &splitPred->cu;<br>
> -Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom, qp);<br>
><br>
>Â Â Â Â Â uint32_t nextDepth = depth + 1;<br>
>Â Â Â Â Â ModeDepth& nd = m_modeDepth[nextDepth];<br>
> @@ -702,7 +746,7 @@<br>
>Â Â Â Â Â Â Â {<br>
>Â Â Â Â Â Â Â Â Â m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);<br>
>Â Â Â Â Â Â Â Â Â m_rqt[nextDepth].cur.load(*nextContext);<br>
> -Â Â Â Â Â Â Â Â compressInterCU_dist(parentCTU, childGeom);<br>
> +Â Â Â Â Â Â Â Â compressInterCU_dist(parentCTU, childGeom, qp, partIdx * 4 + subPartIdx);<br>
><br>
>Â Â Â Â Â Â Â Â Â // Save best CU and pred data for this sub CU<br>
>Â Â Â Â Â Â Â Â Â splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);<br>
> @@ -741,7 +785,7 @@<br>
>Â Â Â Â Â md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, cuAddr, cuGeom.absPartIdx);<br>
>Â }<br>
><br>
> -void Analysis::compressInterCU_rd0_4(const CUData& parentCTU, const CUGeom& cuGeom)<br>
> +void Analysis::compressInterCU_rd0_4(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp, uint32_t partIdx)<br>
>Â {<br>
>Â Â Â uint32_t depth = cuGeom.depth;<br>
>Â Â Â uint32_t cuAddr = parentCTU.m_cuAddr;<br>
> @@ -752,13 +796,20 @@<br>
>Â Â Â bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);<br>
>Â Â Â uint32_t minDepth = topSkipMinDepth(parentCTU, cuGeom);<br>
><br>
> +Â Â if (m_slice->m_pps->bUseDQP && depth && depth <= m_slice->m_pps->maxCuDQPDepth)<br>
> +Â Â {<br>
> +Â Â Â Â qp = m_qp[depth][partIdx];<br>
> +Â Â Â Â this->setQP(*m_slice, qp);<br>
> +Â Â Â Â qp = x265_clip3(QP_MIN, QP_MAX_SPEC, qp);<br>
> +Â Â }<br>
> +<br>
>Â Â Â if (mightNotSplit && depth >= minDepth)<br>
>Â Â Â {<br>
>Â Â Â Â Â bool bTryIntra = m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames;<br>
><br>
>Â Â Â Â Â /* Compute Merge Cost */<br>
> -Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);<br>
> -Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);<br>
><br>
>Â Â Â Â Â bool earlyskip = false;<br>
> @@ -767,24 +818,24 @@<br>
><br>
>Â Â Â Â Â if (!earlyskip)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N);<br>
><br>
>Â Â Â Â Â Â Â if (m_slice->m_sliceType == B_SLICE)<br>
>Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);<br>
>Â Â Â Â Â Â Â }<br>
><br>
>Â Â Â Â Â Â Â Mode *bestInter = &md.pred[PRED_2Nx2N];<br>
>Â Â Â Â Â Â Â if (m_param->bEnableRectInter)<br>
>Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N);<br>
>Â Â Â Â Â Â Â Â Â if (md.pred[PRED_Nx2N].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_Nx2N];<br>
><br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN);<br>
>Â Â Â Â Â Â Â Â Â if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_2NxN];<br>
> @@ -806,24 +857,24 @@<br>
><br>
>Â Â Â Â Â Â Â Â Â if (bHor)<br>
>Â Â Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU);<br>
>Â Â Â Â Â Â Â Â Â Â Â if (md.pred[PRED_2NxnU].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_2NxnU];<br>
><br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD);<br>
>Â Â Â Â Â Â Â Â Â Â Â if (md.pred[PRED_2NxnD].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_2NxnD];<br>
>Â Â Â Â Â Â Â Â Â }<br>
>Â Â Â Â Â Â Â Â Â if (bVer)<br>
>Â Â Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N);<br>
>Â Â Â Â Â Â Â Â Â Â Â if (md.pred[PRED_nLx2N].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_nLx2N];<br>
><br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N);<br>
>Â Â Â Â Â Â Â Â Â Â Â if (md.pred[PRED_nRx2N].sa8dCost < bestInter->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â Â Â bestInter = &md.pred[PRED_nRx2N];<br>
> @@ -855,7 +906,7 @@<br>
>Â Â Â Â Â Â Â Â Â if ((bTryIntra && md.bestMode->cu.getQtRootCbf(0)) ||<br>
>Â Â Â Â Â Â Â Â Â Â Â md.bestMode->sa8dCost == MAX_INT64)<br>
>Â Â Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkIntraInInter(md.pred[PRED_INTRA], cuGeom);<br>
>Â Â Â Â Â Â Â Â Â Â Â encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_INTRA], depth);<br>
> @@ -873,7 +924,7 @@<br>
><br>
>Â Â Â Â Â Â Â Â Â if (bTryIntra || md.bestMode->sa8dCost == MAX_INT64)<br>
>Â Â Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â Â Â md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â Â Â checkIntraInInter(md.pred[PRED_INTRA], cuGeom);<br>
>Â Â Â Â Â Â Â Â Â Â Â if (md.pred[PRED_INTRA].sa8dCost < md.bestMode->sa8dCost)<br>
>Â Â Â Â Â Â Â Â Â Â Â Â Â md.bestMode = &md.pred[PRED_INTRA];<br>
> @@ -960,7 +1011,7 @@<br>
>Â Â Â Â Â Mode* splitPred = &md.pred[PRED_SPLIT];<br>
>Â Â Â Â Â splitPred->initCosts();<br>
>Â Â Â Â Â CUData* splitCU = &splitPred->cu;<br>
> -Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â splitCU->initSubCU(parentCTU, cuGeom, qp);<br>
><br>
>Â Â Â Â Â uint32_t nextDepth = depth + 1;<br>
>Â Â Â Â Â ModeDepth& nd = m_modeDepth[nextDepth];<br>
> @@ -974,7 +1025,7 @@<br>
>Â Â Â Â Â Â Â {<br>
>Â Â Â Â Â Â Â Â Â m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);<br>
>Â Â Â Â Â Â Â Â Â m_rqt[nextDepth].cur.load(*nextContext);<br>
> -Â Â Â Â Â Â Â Â compressInterCU_rd0_4(parentCTU, childGeom);<br>
> +Â Â Â Â Â Â Â Â compressInterCU_rd0_4(parentCTU, childGeom, qp, partIdx * 4 + subPartIdx);<br>
><br>
>Â Â Â Â Â Â Â Â Â // Save best CU and pred data for this sub CU<br>
>Â Â Â Â Â Â Â Â Â splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);<br>
> @@ -1025,7 +1076,7 @@<br>
>Â Â Â Â Â md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, cuAddr, cuGeom.absPartIdx);<br>
>Â }<br>
><br>
> -void Analysis::compressInterCU_rd5_6(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder)<br>
> +void Analysis::compressInterCU_rd5_6(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder, int32_t qp, uint32_t partIdx)<br>
>Â {<br>
>Â Â Â uint32_t depth = cuGeom.depth;<br>
>Â Â Â ModeDepth& md = m_modeDepth[depth];<br>
> @@ -1034,14 +1085,21 @@<br>
>Â Â Â bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);<br>
>Â Â Â bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);<br>
><br>
> +Â Â if (m_slice->m_pps->bUseDQP && depth && depth <= m_slice->m_pps->maxCuDQPDepth)<br>
> +Â Â {<br>
> +Â Â Â Â qp = m_qp[depth][partIdx];<br>
> +Â Â Â Â this->setQP(*m_slice, qp);<br>
> +Â Â Â Â qp = x265_clip3(QP_MIN, QP_MAX_SPEC, qp);<br>
> +Â Â }<br>
> +<br>
>Â Â Â if (m_param->analysisMode == X265_ANALYSIS_LOAD)<br>
>Â Â Â {<br>
>     uint8_t* reuseDepth = &m_reuseInterDataCTU->depth[parentCTU.m_cuAddr * parentCTU.m_numPartitions];<br>
>     uint8_t* reuseModes = &m_reuseInterDataCTU->modes[parentCTU.m_cuAddr * parentCTU.m_numPartitions];<br>
>Â Â Â Â Â if (mightNotSplit && depth == reuseDepth[zOrder] && zOrder == cuGeom.absPartIdx && reuseModes[zOrder] == MODE_SKIP)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);<br>
> -Â Â Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +Â Â Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom, true);<br>
><br>
>Â Â Â Â Â Â Â if (m_bTryLossless)<br>
> @@ -1060,20 +1118,20 @@<br>
><br>
>Â Â Â if (mightNotSplit)<br>
>Â Â Â {<br>
> -Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);<br>
> -Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);<br>
> +Â Â Â Â md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom, false);<br>
>Â Â Â Â Â bool earlySkip = m_param->bEnableEarlySkip && md.bestMode && !md.bestMode->cu.getQtRootCbf(0);<br>
><br>
>Â Â Â Â Â if (!earlySkip)<br>
>Â Â Â Â Â {<br>
> -Â Â Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, false);<br>
>Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);<br>
><br>
>Â Â Â Â Â Â Â if (m_slice->m_sliceType == B_SLICE)<br>
>Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);<br>
>Â Â Â Â Â Â Â Â Â if (md.pred[PRED_BIDIR].sa8dCost < MAX_INT64)<br>
>Â Â Â Â Â Â Â Â Â {<br>
> @@ -1084,11 +1142,11 @@<br>
><br>
>Â Â Â Â Â Â Â if (m_param->bEnableRectInter)<br>
>Â Â Â Â Â Â Â {<br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkInter_rd5_6(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N, false);<br>
>Â Â Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_Nx2N], cuGeom.depth);<br>
><br>
> -Â Â Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom);<br>
> +Â Â Â Â Â Â Â Â md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>Â Â Â Â Â Â Â Â Â checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, false);<br>
>Â Â Â Â Â Â Â Â Â checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);<br>
>Â Â Â Â Â Â Â }<br>
> @@ -1111,21 +1169,21 @@<br>
><br>
>                  if (bHor)<br>
>                  {<br>
> -                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom);<br>
> +                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                      checkInter_rd5_6(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU, bMergeOnly);<br>
>                      checkBestMode(md.pred[PRED_2NxnU], cuGeom.depth);<br>
><br>
> -                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom);<br>
> +                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                      checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, bMergeOnly);<br>
>                      checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);<br>
>                  }<br>
>                  if (bVer)<br>
>                  {<br>
> -                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                      checkInter_rd5_6(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N, bMergeOnly);<br>
>                      checkBestMode(md.pred[PRED_nLx2N], cuGeom.depth);<br>
><br>
> -                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom);<br>
> +                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                      checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, bMergeOnly);<br>
>                      checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);<br>
>                  }<br>
> @@ -1133,13 +1191,13 @@<br>
><br>
>              if (m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames)<br>
>              {<br>
> -                md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);<br>
> +                md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                  checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N, NULL, NULL);<br>
>                  checkBestMode(md.pred[PRED_INTRA], depth);<br>
><br>
>                  if (cuGeom.log2CUSize == 3 && m_slice->m_sps->quadtreeTULog2MinSize < 3)<br>
>                  {<br>
> -                    md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom);<br>
> +                    md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);<br>
>                      checkIntra(md.pred[PRED_INTRA_NxN], cuGeom, SIZE_NxN, NULL, NULL);<br>
>                      checkBestMode(md.pred[PRED_INTRA_NxN], depth);<br>
>                  }<br>
> @@ -1159,7 +1217,7 @@<br>
>          Mode* splitPred = &md.pred[PRED_SPLIT];<br>
>          splitPred->initCosts();<br>
>          CUData* splitCU = &splitPred->cu;<br>
> -        splitCU->initSubCU(parentCTU, cuGeom);<br>
> +        splitCU->initSubCU(parentCTU, cuGeom, qp);<br>
><br>
>          uint32_t nextDepth = depth + 1;<br>
>          ModeDepth& nd = m_modeDepth[nextDepth];<br>
> @@ -1173,7 +1231,7 @@<br>
>              {<br>
>                  m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);<br>
>                  m_rqt[nextDepth].cur.load(*nextContext);<br>
> -                compressInterCU_rd5_6(parentCTU, childGeom, zOrder);<br>
> +                compressInterCU_rd5_6(parentCTU, childGeom, zOrder, qp, partIdx * 4 + subPartIdx);<br>
><br>
>                  // Save best CU and pred data for this sub CU<br>
>                  splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);<br>
> @@ -1913,7 +1971,7 @@<br>
>      return false;<br>
>  }<br>
><br>
> -int Analysis::calculateQpforCuSize(CUData& ctu, const CUGeom& cuGeom)<br>
> +int Analysis::calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom)<br>
>  {<br>
>      uint32_t ctuAddr = ctu.m_cuAddr;<br>
>      FrameData& curEncData = *m_frame->m_encData;<br>
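To illustrate the recursion in the analysis.cpp changes above, here is a minimal, self-contained sketch of a quadtree walk in which the QP is looked up per block down to the quantization group depth and inherited from the parent below it. Nothing here is x265 code: blockQp is a hypothetical stand-in for the per-block AQ lookup, Visit and walk are illustration-only names, and the only thing taken from the patch is the partIdx * 4 + subPartIdx child index.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One visited CU: its depth, its index within that depth, and the QP it used.
struct Visit { uint32_t depth; uint32_t partIdx; int qp; };

// Hypothetical stand-in for a per-block AQ lookup (not an x265 function).
static int blockQp(uint32_t depth, uint32_t partIdx)
{
    return 30 + 2 * (int)depth + (int)(partIdx % 4);
}

// Walk the quadtree. At or above maxDqpDepth each CU gets its own QP;
// deeper CUs reuse the QP of the last CU that was allowed to set one.
static void walk(uint32_t depth, uint32_t partIdx, int parentQp,
                 uint32_t maxDqpDepth, uint32_t maxDepth, std::vector<Visit>& out)
{
    int qp = depth <= maxDqpDepth ? blockQp(depth, partIdx) : parentQp;
    out.push_back({depth, partIdx, qp});
    if (depth == maxDepth)
        return;
    for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
        walk(depth + 1, partIdx * 4 + subPartIdx, qp, maxDqpDepth, maxDepth, out);
}
```

With maxDqpDepth = 1 and maxDepth = 2, every depth-2 CU reports the QP of its depth-1 parent, which is the inheritance behaviour discussed at the top of this thread.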
> diff -r 335c728bbd62 -r d6e059bd8a9c source/encoder/analysis.h<br>
> --- a/source/encoder/analysis.h    Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/encoder/analysis.h    Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -90,6 +90,7 @@<br>
>      void processPmode(PMODE& pmode, Analysis& slave);<br>
><br>
>      ModeDepth m_modeDepth[NUM_CU_DEPTH];<br>
> +    int*      m_qp[NUM_CU_DEPTH];<br>
>   bool   m_bTryLossless;<br>
>   bool   m_bChromaSa8d;<br>
><br>
> @@ -109,12 +110,12 @@<br>
>      uint32_t*            m_reuseBestMergeCand;<br>
><br>
>      /* full analysis for an I-slice CU */<br>
> -    void compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder);<br>
> +    void compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder, int32_t qpDepth, uint32_t partIdx);<br>
><br>
>      /* full analysis for a P or B slice CU */<br>
> -    void compressInterCU_dist(const CUData& parentCTU, const CUGeom& cuGeom);<br>
> -    void compressInterCU_rd0_4(const CUData& parentCTU, const CUGeom& cuGeom);<br>
> -    void compressInterCU_rd5_6(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder);<br>
> +    void compressInterCU_dist(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qpDepth, uint32_t partIdx);<br>
> +    void compressInterCU_rd0_4(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qpDepth, uint32_t partIdx);<br>
> +    void compressInterCU_rd5_6(const CUData& parentCTU, const CUGeom& cuGeom, uint32_t &zOrder, int32_t qpDepth, uint32_t partIdx);<br>
><br>
>      /* measure merge and skip */<br>
>      void checkMerge2Nx2N_rd0_4(Mode& skip, Mode& merge, const CUGeom& cuGeom);<br>
> @@ -139,7 +140,7 @@<br>
>      /* generate residual and recon pixels for an entire CTU recursively (RD0) */<br>
>      void encodeResidue(const CUData& parentCTU, const CUGeom& cuGeom);<br>
><br>
> -    int calculateQpforCuSize(CUData& ctu, const CUGeom& cuGeom);<br>
> +    int calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom);<br>
><br>
>      /* check whether current mode is the new best */<br>
>      inline void checkBestMode(Mode& mode, uint32_t depth)<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/encoder/encoder.cpp<br>
> --- a/source/encoder/encoder.cpp   Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/encoder/encoder.cpp   Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -1557,15 +1557,12 @@<br>
>      bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;<br>
><br>
>      if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))<br>
> -    {<br>
>          pps->bUseDQP = true;<br>
> -        pps->maxCuDQPDepth = 0; /* TODO: make configurable? */<br>
> -    }<br>
>      else<br>
> -    {<br>
>          pps->bUseDQP = false;<br>
> -        pps->maxCuDQPDepth = 0;<br>
> -    }<br>
> +<br>
> +    pps->maxCuDQPDepth = g_log2Size[m_param->maxCUSize] - g_log2Size[m_param->rc.QGSize];<br>
> +    X265_CHECK(pps->maxCuDQPDepth <= 2, "max CU DQP depth cannot be greater than 2");<br>
><br>
>      pps->chromaQpOffset[0] = m_param->cbQpOffset;<br>
>      pps->chromaQpOffset[1] = m_param->crQpOffset;<br>
> @@ -1788,6 +1785,22 @@<br>
>          p->analysisMode = X265_ANALYSIS_OFF;<br>
>          x265_log(p, X265_LOG_WARNING, "Analysis save and load mode not supported for distributed mode analysis\n");<br>
>      }<br>
> +<br>
> +    bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;<br>
> +    if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))<br>
> +    {<br>
> +        if (p->rc.QGSize < X265_MAX(16, p->minCUSize))<br>
> +        {<br>
> +            p->rc.QGSize = X265_MAX(16, p->minCUSize);<br>
> +            x265_log(p, X265_LOG_WARNING, "QGSize should be greater than or equal to 16 and minCUSize, setting QGSize = %d \n", p->rc.QGSize);<br>
<br>
</div></div>Trailing white-space in the log string (the space before \n).<br>
<div><div><br>
> +        }<br>
> +<br>
> +        if (p->rc.QGSize > p->maxCUSize)<br>
> +        {<br>
> +            p->rc.QGSize = p->maxCUSize;<br>
> +            x265_log(p, X265_LOG_WARNING, "QGSize should be less than or equal to maxCUSize, setting QGSize = %d \n", p->rc.QGSize);<br>
> +        }<br>
> +    }<br>
>  }<br>
><br>
>  void Encoder::allocAnalysis(x265_analysis_data* analysis)<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/encoder/frameencoder.cpp<br>
> --- a/source/encoder/frameencoder.cpp Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/encoder/frameencoder.cpp Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -852,9 +852,7 @@<br>
>          if (m_param->rc.aqMode || bIsVbv)<br>
>          {<br>
>              int qp = calcQpForCu(cuAddr, curEncData.m_cuStat[cuAddr].baseQp);<br>
> -            tld.analysis.setQP(*slice, qp);<br>
>              qp = x265_clip3(QP_MIN, QP_MAX_SPEC, qp);<br>
> -            ctu->setQPSubParts((int8_t)qp, 0, 0);<br>
>              curEncData.m_rowStat[row].sumQpAq += qp;<br>
>          }<br>
>          else<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/x265.h<br>
> --- a/source/x265.h  Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/x265.h  Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -988,6 +988,12 @@<br>
>          /* Enable stricter conditions to check bitrate deviations in CBR mode. May compromise<br>
>           * quality to maintain bitrate adherence */<br>
>          int bStrictCbr;<br>
> +<br>
> +        /* Enable adaptive quantization at CU granularity. This parameter specifies<br>
> +         * the minimum CU size at which QP can be adjusted, i.e. Quantization Group<br>
> +         * (QG) size. Allowed values are 64, 32, 16 provided it falls within the<br>
> +         * inclusive range [minCUSize, maxCUSize]. Experimental, default: maxCUSize */<br>
> +        uint32_t QGSize;<br>
<br>
</div></div>In our camelCase style this would be qgSize.<br>
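For reference, a minimal sketch of how the QG size bounds and the derived DQP depth in this patch relate, assuming all sizes are powers of two. The helper names (log2Size, clampQgSize, maxCuDqpDepth) are hypothetical and only mirror the checks in the encoder.cpp hunks; they are not x265 functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: integer log2 of a power-of-two block size.
static uint32_t log2Size(uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); // power of two only
    uint32_t n = 0;
    while (size > 1)
    {
        size >>= 1;
        ++n;
    }
    return n;
}

// Clamp a requested QG size into [max(16, minCUSize), maxCUSize],
// applying the lower bound first and then the upper bound, as the patch does.
static uint32_t clampQgSize(uint32_t qgSize, uint32_t minCUSize, uint32_t maxCUSize)
{
    uint32_t lowest = std::max<uint32_t>(16, minCUSize);
    if (qgSize < lowest)
        qgSize = lowest;
    if (qgSize > maxCUSize)
        qgSize = maxCUSize;
    return qgSize;
}

// Depth down to which a delta QP may be signalled: log2(maxCUSize) - log2(qgSize).
static uint32_t maxCuDqpDepth(uint32_t qgSize, uint32_t maxCUSize)
{
    return log2Size(maxCUSize) - log2Size(qgSize);
}
```

With a 64x64 CTU, a QG size of 64 gives depth 0 (one QP per CTU) and a QG size of 16 gives depth 2, which is the upper limit the X265_CHECK in the patch asserts.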
<span><br>
>      } rc;<br>
><br>
>      /*== Video Usability Information ==*/<br>
> diff -r 335c728bbd62 -r d6e059bd8a9c source/x265cli.h<br>
> --- a/source/x265cli.h    Fri Apr 03 14:27:32 2015 -0500<br>
> +++ b/source/x265cli.h    Mon Mar 23 14:23:42 2015 +0530<br>
> @@ -205,6 +205,7 @@<br>
>   { "strict-cbr",      no_argument, NULL, 0 },<br>
>   { "temporal-layers",   no_argument, NULL, 0 },<br>
>   { "no-temporal-layers",  no_argument, NULL, 0 },<br>
> +  { "qg-size",    required_argument, NULL, 0 },<br>
<br>
</span>White-space: the alignment is off relative to the neighbouring entries.<br>
<span><br>
>      { 0, 0, 0, 0 },<br>
>      { 0, 0, 0, 0 },<br>
>      { 0, 0, 0, 0 },<br>
> @@ -352,6 +353,7 @@<br>
>      H0("   --analysis-file <filename>    Specify file name used for either dumping or reading analysis data.\n");<br>
>      H0("   --aq-mode <integer>           Mode for Adaptive Quantization - 0:none 1:uniform AQ 2:auto variance. Default %d\n", param->rc.aqMode);<br>
>      H0("   --aq-strength <float>         Reduces blocking and blurring in flat and textured areas (0 to 3.0). Default %.2f\n", param->rc.aqStrength);<br>
> +    H0("   --qg-size <float>         Specifies the size of the quantization group (64, 32, 16). Default %d\n", param->rc.QGSize);<br>
<br>
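</span>Why &lt;float&gt;? The option takes an integer (64, 32 or 16). The help text is also not aligned with the other entries.<br>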
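Since the option only accepts three integer sizes, a strict integer parse is enough. A minimal sketch, assuming a hypothetical standalone helper parseQgSize; this is not how the x265 CLI actually parses its options, it only shows the kind of validation an integer-valued option needs.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse a --qg-size argument as a base-10 integer and
// accept only the sizes named in the help text (16, 32, 64).
static bool parseQgSize(const char* arg, unsigned& out)
{
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(arg, &end, 10);

    // Reject empty input, trailing characters (e.g. "32.5") and overflow.
    if (errno != 0 || end == arg || *end != '\0')
        return false;
    if (value != 16 && value != 32 && value != 64)
        return false;

    out = (unsigned)value;
    return true;
}
```

A value such as "32.5" is rejected here rather than silently truncated, which is the behaviour the &lt;integer&gt; placeholder would promise.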
<span><br>
>   H0("  --[no-]cutree         Enable cutree for Adaptive Quantization. Default %s\n", OPT(param->rc.cuTree));<br>
>      H1("   --ipratio <float>             QP factor between I and P. Default %.2f\n", param->rc.ipFactor);<br>
>      H1("   --pbratio <float>             QP factor between P and B. Default %.2f\n", param->rc.pbFactor);<br>
<br>
</span>This also needs documentation and an X265_BUILD bump, since it adds a field to the public param struct.<br>
<span><font color="#888888"><br>
--<br>
Steve Borho<br>
<br>
_______________________________________________<br>
x265-devel mailing list<br>
<a href="mailto:x265-devel@videolan.org" target="_blank">x265-devel@videolan.org</a><br>
<a href="https://mailman.videolan.org/listinfo/x265-devel" target="_blank">https://mailman.videolan.org/listinfo/x265-devel</a><br>
<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>