<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"><style>body { line-height: 1.5; }blockquote { margin-top: 0px; margin-bottom: 0px; margin-left: 0.5em; }p { margin-top: 0px; margin-bottom: 0px; }body { font-size: 10.5pt; font-family: 宋体; color: rgb(0, 0, 0); line-height: 1.5; }</style></head><body>
<div><span></span><span style="background-color: rgba(0, 0, 0, 0);">I believe that the source code "</span><span style="background-color: rgba(0, 0, 0, 0); font-size: 10.5pt; line-height: 1.5;">if (modeHor)..." in the function "</span><span style="background-color: rgba(0, 0, 0, 0); font-size: 10.5pt; line-height: 1.5;">void all_angs_pred_c(pixel *dest, pixel *refPix, pixel *filtPix, int bLuma)"</span><span style="font-size: 10.5pt; line-height: 1.5; background-color: rgba(0, 0, 0, 0);"> </span><span style="font-size: 10.5pt; line-height: 1.5; background-color: rgba(0, 0, 0, 0);">is redundant and introduces bugs in intra prediction.</span></div><div><br></div><hr style="width: 210px; height: 1px;" color="#b5c4df" size="1" align="left">
<div><span><div style="MARGIN: 10px; FONT-FAMILY: verdana; FONT-SIZE: 10pt"><p class="MsoNormal" style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 宋体; line-height: normal;"><span style="font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Best Regards,</span><o:p></o:p></p><p class="MsoNormal" style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 宋体; line-height: normal;"><span style="font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Peking University </span><o:p></o:p></p><p class="MsoNormal" style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 宋体; line-height: normal;"><span style="font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Yangang Cai</span></p></div></span></div>
<blockquote style="margin-top: 0px; margin-bottom: 0px; margin-left: 0.5em;"><div> </div><div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm"><div style="PADDING-RIGHT: 8px; PADDING-LEFT: 8px; FONT-SIZE: 12px;FONT-FAMILY:tahoma;COLOR:#000000; BACKGROUND: #efefef; PADDING-BOTTOM: 8px; PADDING-TOP: 8px"><div><b>From:</b> <a href="mailto:x265-devel-request@videolan.org">x265-devel-request</a></div><div><b>Date:</b> 2015-12-03 01:29</div><div><b>To:</b> <a href="mailto:x265-devel@videolan.org">x265-devel</a></div><div><b>Subject:</b> x265-devel Digest, Vol 31, Issue 7</div></div></div><div>
<div>Today's Topics:</div>
<div> </div>
<div> 1. [PATCH 11 of 15] sao: split SAO Left reference pixel buffer</div>
<div> into row base (Min Chen)</div>
<div> 2. [PATCH 12 of 15] sao: new CU level process function (Min Chen)</div>
<div> 3. [PATCH 13 of 15] sao: avoid thread conflict on offsetEo and</div>
<div> offsetBo (Min Chen)</div>
<div> 4. [PATCH 14 of 15] sao: reduce address operators by split into</div>
<div> Luma and Chroma path (Min Chen)</div>
<div> </div>
<div> </div>
<div>----------------------------------------------------------------------</div>
<div> </div>
<div>Message: 1</div>
<div>Date: Wed, 02 Dec 2015 11:28:34 -0600</div>
<div>From: Min Chen &lt;chenm003@163.com&gt;</div>
<div>To: x265-devel@videolan.org</div>
<div>Subject: [x265] [PATCH 11 of 15] sao: split SAO Left reference pixel</div>
<div> buffer into row base</div>
<div>Message-ID: &lt;3a423fcb4b4089de2c05.1449077314@chen-PC&gt;</div>
<div>Content-Type: text/plain; charset="us-ascii"</div>
<div> </div>
<div># HG changeset patch</div>
<div># User Min Chen &lt;chenm003@163.com&gt;</div>
<div># Date 1449076371 21600</div>
<div># Node ID 3a423fcb4b4089de2c05a9067556f20a6fca0d1b</div>
<div># Parent 82f6a10f44b88400f0f875025b9e8b6caff3acd3</div>
<div>sao: split SAO Left reference pixel buffer into row base</div>
<div>---</div>
<div> source/encoder/sao.cpp | 35 +++++++++++++++++++++++++----------</div>
<div> source/encoder/sao.h | 4 ++--</div>
<div> 2 files changed, 27 insertions(+), 12 deletions(-)</div>
<div> </div>
<div>diff -r 82f6a10f44b8 -r 3a423fcb4b40 source/encoder/sao.cpp</div>
<div>--- a/source/encoder/sao.cpp Wed Dec 02 11:12:48 2015 -0600</div>
<div>+++ b/source/encoder/sao.cpp Wed Dec 02 11:12:51 2015 -0600</div>
<div>@@ -87,8 +87,12 @@</div>
<div> m_tmpU[0] = NULL;</div>
<div> m_tmpU[1] = NULL;</div>
<div> m_tmpU[2] = NULL;</div>
<div>- m_tmpL1 = NULL;</div>
<div>- m_tmpL2 = NULL;</div>
<div>+ m_tmpL1[0] = NULL;</div>
<div>+ m_tmpL1[1] = NULL;</div>
<div>+ m_tmpL1[2] = NULL;</div>
<div>+ m_tmpL2[0] = NULL;</div>
<div>+ m_tmpL2[1] = NULL;</div>
<div>+ m_tmpL2[2] = NULL;</div>
<div> </div>
<div> m_depthSaoRate[0][0] = 0;</div>
<div> m_depthSaoRate[0][1] = 0;</div>
<div>@@ -116,11 +120,12 @@</div>
<div> </div>
<div> CHECKED_MALLOC(m_clipTableBase, pixel, maxY + 2 * rangeExt);</div>
<div> </div>
<div>- CHECKED_MALLOC(m_tmpL1, pixel, g_maxCUSize + 1);</div>
<div>- CHECKED_MALLOC(m_tmpL2, pixel, g_maxCUSize + 1);</div>
<div> </div>
<div> for (int i = 0; i < 3; i++)</div>
<div> {</div>
<div>+ CHECKED_MALLOC(m_tmpL1[i], pixel, g_maxCUSize + 1);</div>
<div>+ CHECKED_MALLOC(m_tmpL2[i], pixel, g_maxCUSize + 1);</div>
<div>+</div>
<div> // SAO asm code will read 1 pixel before and after, so pad by 2</div>
<div> // NOTE: m_param->sourceWidth+2 enough, to avoid condition check in copySaoAboveRef(), I alloc more up to 63 bytes in here</div>
<div> CHECKED_MALLOC(m_tmpU[i], pixel, m_numCuInWidth * g_maxCUSize + 2);</div>
<div>@@ -182,11 +187,21 @@</div>
<div> {</div>
<div> X265_FREE_ZERO(m_clipTableBase);</div>
<div> </div>
<div>- X265_FREE_ZERO(m_tmpL1);</div>
<div>- X265_FREE_ZERO(m_tmpL2);</div>
<div> </div>
<div> for (int i = 0; i < 3; i++)</div>
<div> {</div>
<div>+ if (m_tmpL1[i])</div>
<div>+ {</div>
<div>+ X265_FREE(m_tmpL1[i]);</div>
<div>+ m_tmpL1[i] = NULL;</div>
<div>+ }</div>
<div>+</div>
<div>+ if (m_tmpL2[i])</div>
<div>+ {</div>
<div>+ X265_FREE(m_tmpL2[i]);</div>
<div>+ m_tmpL2[i] = NULL;</div>
<div>+ }</div>
<div>+</div>
<div> if (m_tmpU[i])</div>
<div> {</div>
<div> X265_FREE(m_tmpU[i] - 1);</div>
<div>@@ -307,7 +322,7 @@</div>
<div> </div>
<div> memset(_upBuff1 + MAX_CU_SIZE, 0, 2 * sizeof(int8_t)); /* avoid valgrind uninit warnings */</div>
<div> </div>
<div>- tmpL = m_tmpL1;</div>
<div>+ tmpL = m_tmpL1[plane];</div>
<div> tmpU = &(m_tmpU[plane][lpelx]);</div>
<div> </div>
<div> switch (typeIdx)</div>
<div>@@ -607,7 +622,7 @@</div>
<div> </div>
<div> for (int i = 0; i < ctuHeight + 1; i++)</div>
<div> {</div>
<div>- m_tmpL1[i] = rec[0];</div>
<div>+ m_tmpL1[plane][i] = rec[0];</div>
<div> rec += stride;</div>
<div> }</div>
<div> </div>
<div>@@ -623,7 +638,7 @@</div>
<div> rec = reconPic->getPlaneAddr(plane, addr);</div>
<div> for (int i = 0; i < ctuHeight + 1; i++)</div>
<div> {</div>
<div>- m_tmpL2[i] = rec[ctuWidth - 1];</div>
<div>+ m_tmpL2[plane][i] = rec[ctuWidth - 1];</div>
<div> rec += stride;</div>
<div> }</div>
<div> }</div>
<div>@@ -652,7 +667,7 @@</div>
<div> }</div>
<div> processSaoCu(addr, typeIdx, plane);</div>
<div> }</div>
<div>- std::swap(m_tmpL1, m_tmpL2);</div>
<div>+ std::swap(m_tmpL1[plane], m_tmpL2[plane]);</div>
<div> }</div>
<div> }</div>
<div> </div>
<div>diff -r 82f6a10f44b8 -r 3a423fcb4b40 source/encoder/sao.h</div>
<div>--- a/source/encoder/sao.h Wed Dec 02 11:12:48 2015 -0600</div>
<div>+++ b/source/encoder/sao.h Wed Dec 02 11:12:51 2015 -0600</div>
<div>@@ -93,8 +93,8 @@</div>
<div> pixel* m_clipTableBase;</div>
<div> </div>
<div> pixel* m_tmpU[3];</div>
<div>- pixel* m_tmpL1;</div>
<div>- pixel* m_tmpL2;</div>
<div>+ pixel* m_tmpL1[3];</div>
<div>+ pixel* m_tmpL2[3];</div>
<div> </div>
<div> public:</div>
<div> </div>
<div> </div>
<div> </div>
<div> </div>
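<div>The per-plane left-column buffers from the patch above can be illustrated with a minimal standalone sketch. It is not the x265 implementation: the struct LeftColumnBuffers, its members cur and next, and the helpers saveColumn and advance are hypothetical names, and pixel is assumed to be 8 bits wide. The idea shown is the one in the patch: each plane owns its own pair of buffers, so swapping the pair for one plane cannot disturb another plane.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef unsigned char pixel; // assumption: 8-bit build

// Hypothetical stand-in for m_tmpL1[3] / m_tmpL2[3]: one pair of buffers per plane.
struct LeftColumnBuffers
{
    std::vector<pixel> cur[3];  // left reference column for the current CTU
    std::vector<pixel> next[3]; // unfiltered last column, saved for the next CTU

    explicit LeftColumnBuffers(int maxCUSize)
    {
        for (int p = 0; p < 3; p++)
        {
            cur[p].assign(maxCUSize + 1, 0);
            next[p].assign(maxCUSize + 1, 0);
        }
    }

    // Copy column `col` of `rows` consecutive picture rows into dst.
    static void saveColumn(std::vector<pixel>& dst, const pixel* rec,
                           std::ptrdiff_t stride, int col, int rows)
    {
        assert(rows <= (int)dst.size());
        for (int i = 0; i < rows; i++)
        {
            dst[i] = rec[col];
            rec += stride;
        }
    }

    // After a CTU is filtered, the saved column becomes the next left reference.
    void advance(int plane) { std::swap(cur[plane], next[plane]); }
};
```

<div>With a single shared pair, as before the patch, a whole row of one plane had to finish before the next plane could start; with one pair per plane, the planes can be interleaved CTU by CTU.</div>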
<div>------------------------------</div>
<div> </div>
<div>Message: 2</div>
<div>Date: Wed, 02 Dec 2015 11:28:35 -0600</div>
<div>From: Min Chen &lt;chenm003@163.com&gt;</div>
<div>To: x265-devel@videolan.org</div>
<div>Subject: [x265] [PATCH 12 of 15] sao: new CU level process function</div>
<div>Message-ID: &lt;b1c261378db29a1988d8.1449077315@chen-PC&gt;</div>
<div>Content-Type: text/plain; charset="us-ascii"</div>
<div> </div>
<div># HG changeset patch</div>
<div># User Min Chen &lt;chenm003@163.com&gt;</div>
<div># Date 1449076374 21600</div>
<div># Node ID b1c261378db29a1988d8e27c5eabe1a76821f83d</div>
<div># Parent 3a423fcb4b4089de2c05a9067556f20a6fca0d1b</div>
<div>sao: new CU level process function</div>
<div>---</div>
<div> source/encoder/framefilter.cpp | 13 +++++--</div>
<div> source/encoder/sao.cpp | 68 ++++++++++++++++++++++++++++++++++++++++</div>
<div> source/encoder/sao.h | 1 +</div>
<div> 3 files changed, 78 insertions(+), 4 deletions(-)</div>
<div> </div>
<div>diff -r 3a423fcb4b40 -r b1c261378db2 source/encoder/framefilter.cpp</div>
<div>--- a/source/encoder/framefilter.cpp Wed Dec 02 11:12:51 2015 -0600</div>
<div>+++ b/source/encoder/framefilter.cpp Wed Dec 02 11:12:54 2015 -0600</div>
<div>@@ -541,19 +541,24 @@</div>
<div> {</div>
<div> FrameData& encData = *m_frame->m_encData;</div>
<div> SAOParam* saoParam = encData.m_saoParam;</div>
<div>+ uint32_t numCols = encData.m_slice->m_sps->numCuInWidth;</div>
<div> </div>
<div> if (saoParam->bSaoFlag[0])</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitRow(saoParam->ctuParam[0], row, 0);</div>
<div>+ {</div>
<div>+ for(uint32_t col = 0; col < numCols; col++)</div>
<div>+ m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[0], row, col, 0);</div>
<div>+ }</div>
<div> </div>
<div> if (saoParam->bSaoFlag[1])</div>
<div> {</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitRow(saoParam->ctuParam[1], row, 1);</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitRow(saoParam->ctuParam[2], row, 2);</div>
<div>+ for(uint32_t col = 0; col < numCols; col++)</div>
<div>+ m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[1], row, col, 1);</div>
<div>+ for(uint32_t col = 0; col < numCols; col++)</div>
<div>+ m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[2], row, col, 2);</div>
<div> }</div>
<div> </div>
<div> if (encData.m_slice->m_pps->bTransquantBypassEnabled)</div>
<div> {</div>
<div>- uint32_t numCols = encData.m_slice->m_sps->numCuInWidth;</div>
<div> uint32_t lineStartCUAddr = row * numCols;</div>
<div> </div>
<div> const CUGeom* cuGeoms = m_frameEncoder->m_cuGeoms;</div>
<div>diff -r 3a423fcb4b40 -r b1c261378db2 source/encoder/sao.cpp</div>
<div>--- a/source/encoder/sao.cpp Wed Dec 02 11:12:51 2015 -0600</div>
<div>+++ b/source/encoder/sao.cpp Wed Dec 02 11:12:54 2015 -0600</div>
<div>@@ -671,6 +671,74 @@</div>
<div> }</div>
<div> }</div>
<div> </div>
<div>+/* Process SAO unit */</div>
<div>+void SAO::processSaoUnitCu(SaoCtuParam* ctuParam, int idxY, int idxX, int plane)</div>
<div>+{</div>
<div>+ PicYuv* reconPic = m_frame->m_reconPic;</div>
<div>+ intptr_t stride = plane ? reconPic->m_strideC : reconPic->m_stride;</div>
<div>+ uint32_t picWidth = m_param->sourceWidth;</div>
<div>+ int ctuWidth = g_maxCUSize;</div>
<div>+ int ctuHeight = g_maxCUSize;</div>
<div>+</div>
<div>+ if (plane)</div>
<div>+ {</div>
<div>+ picWidth >>= m_hChromaShift;</div>
<div>+ ctuWidth >>= m_hChromaShift;</div>
<div>+ ctuHeight >>= m_vChromaShift;</div>
<div>+ }</div>
<div>+</div>
<div>+ int addr = idxY * m_numCuInWidth + idxX;</div>
<div>+ pixel* rec = reconPic->getPlaneAddr(plane, addr);</div>
<div>+</div>
<div>+ if (idxX == 0)</div>
<div>+ {</div>
<div>+ for (int i = 0; i < ctuHeight + 1; i++)</div>
<div>+ {</div>
<div>+ m_tmpL1[plane][i] = rec[0];</div>
<div>+ rec += stride;</div>
<div>+ }</div>
<div>+ }</div>
<div>+</div>
<div>+ bool mergeLeftFlag = (ctuParam[addr].mergeMode == SAO_MERGE_LEFT);</div>
<div>+ int typeIdx = ctuParam[addr].typeIdx;</div>
<div>+</div>
<div>+ if (idxX != (m_numCuInWidth - 1))</div>
<div>+ {</div>
<div>+ rec = reconPic->getPlaneAddr(plane, addr);</div>
<div>+ for (int i = 0; i < ctuHeight + 1; i++)</div>
<div>+ {</div>
<div>+ m_tmpL2[plane][i] = rec[ctuWidth - 1];</div>
<div>+ rec += stride;</div>
<div>+ }</div>
<div>+ }</div>
<div>+</div>
<div>+ if (typeIdx >= 0)</div>
<div>+ {</div>
<div>+ if (!mergeLeftFlag)</div>
<div>+ {</div>
<div>+ if (typeIdx == SAO_BO)</div>
<div>+ {</div>
<div>+ memset(m_offsetBo, 0, sizeof(m_offsetBo));</div>
<div>+</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ m_offsetBo[((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div>+ }</div>
<div>+ else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div>+ {</div>
<div>+ int offset[NUM_EDGETYPE];</div>
<div>+ offset[0] = 0;</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ offset[i + 1] = ctuParam[addr].offset[i] << SAO_BIT_INC;</div>
<div>+</div>
<div>+ for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>+ m_offsetEo[edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ }</div>
<div>+ }</div>
<div>+ processSaoCu(addr, typeIdx, plane);</div>
<div>+ }</div>
<div>+ std::swap(m_tmpL1[plane], m_tmpL2[plane]);</div>
<div>+}</div>
<div>+</div>
<div> void SAO::copySaoUnit(SaoCtuParam* saoUnitDst, const SaoCtuParam* saoUnitSrc)</div>
<div> {</div>
<div> saoUnitDst->mergeMode = saoUnitSrc->mergeMode;</div>
<div>diff -r 3a423fcb4b40 -r b1c261378db2 source/encoder/sao.h</div>
<div>--- a/source/encoder/sao.h Wed Dec 02 11:12:51 2015 -0600</div>
<div>+++ b/source/encoder/sao.h Wed Dec 02 11:12:54 2015 -0600</div>
<div>@@ -132,6 +132,7 @@</div>
<div> // CTU-based SAO process without slice granularity</div>
<div> void processSaoCu(int addr, int typeIdx, int plane);</div>
<div> void processSaoUnitRow(SaoCtuParam* ctuParam, int idxY, int plane);</div>
<div>+ void processSaoUnitCu(SaoCtuParam* ctuParam, int idxY, int idxX, int plane);</div>
<div> </div>
<div> void copySaoUnit(SaoCtuParam* saoUnitDst, const SaoCtuParam* saoUnitSrc);</div>
<div> </div>
<div> </div>
<div> </div>
<div> </div>
<div>------------------------------</div>
<div> </div>
<div>Message: 3</div>
<div>Date: Wed, 02 Dec 2015 11:28:36 -0600</div>
<div>From: Min Chen &lt;chenm003@163.com&gt;</div>
<div>To: x265-devel@videolan.org</div>
<div>Subject: [x265] [PATCH 13 of 15] sao: avoid thread conflict on</div>
<div> offsetEo and offsetBo</div>
<div>Message-ID: &lt;a3a9660c91b8eeb8f708.1449077316@chen-PC&gt;</div>
<div>Content-Type: text/plain; charset="us-ascii"</div>
<div> </div>
<div># HG changeset patch</div>
<div># User Min Chen &lt;chenm003@163.com&gt;</div>
<div># Date 1449076377 21600</div>
<div># Node ID a3a9660c91b8eeb8f70869fc4022f939c01023f0</div>
<div># Parent b1c261378db29a1988d8e27c5eabe1a76821f83d</div>
<div>sao: avoid thread conflict on offsetEo and offsetBo</div>
<div>---</div>
<div> source/encoder/framefilter.cpp | 12 +++++-------</div>
<div> source/encoder/sao.cpp | 38 ++++++++++++++++++++------------------</div>
<div> source/encoder/sao.h | 4 ++--</div>
<div> 3 files changed, 27 insertions(+), 27 deletions(-)</div>
<div> </div>
<div>diff -r b1c261378db2 -r a3a9660c91b8 source/encoder/framefilter.cpp</div>
<div>--- a/source/encoder/framefilter.cpp Wed Dec 02 11:12:54 2015 -0600</div>
<div>+++ b/source/encoder/framefilter.cpp Wed Dec 02 11:12:57 2015 -0600</div>
<div>@@ -543,18 +543,16 @@</div>
<div> SAOParam* saoParam = encData.m_saoParam;</div>
<div> uint32_t numCols = encData.m_slice->m_sps->numCuInWidth;</div>
<div> </div>
<div>- if (saoParam->bSaoFlag[0])</div>
<div>+ for(uint32_t col = 0; col < numCols; col++)</div>
<div> {</div>
<div>- for(uint32_t col = 0; col < numCols; col++)</div>
<div>+ if (saoParam->bSaoFlag[0])</div>
<div> m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[0], row, col, 0);</div>
<div>- }</div>
<div> </div>
<div>- if (saoParam->bSaoFlag[1])</div>
<div>- {</div>
<div>- for(uint32_t col = 0; col < numCols; col++)</div>
<div>+ if (saoParam->bSaoFlag[1])</div>
<div>+ {</div>
<div> m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[1], row, col, 1);</div>
<div>- for(uint32_t col = 0; col < numCols; col++)</div>
<div> m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[2], row, col, 2);</div>
<div>+ }</div>
<div> }</div>
<div> </div>
<div> if (encData.m_slice->m_pps->bTransquantBypassEnabled)</div>
<div>diff -r b1c261378db2 -r a3a9660c91b8 source/encoder/sao.cpp</div>
<div>--- a/source/encoder/sao.cpp Wed Dec 02 11:12:54 2015 -0600</div>
<div>+++ b/source/encoder/sao.cpp Wed Dec 02 11:12:57 2015 -0600</div>
<div>@@ -325,6 +325,8 @@</div>
<div> tmpL = m_tmpL1[plane];</div>
<div> tmpU = &(m_tmpU[plane][lpelx]);</div>
<div> </div>
<div>+ int8_t* offsetEo = m_offsetEo[plane];</div>
<div>+</div>
<div> switch (typeIdx)</div>
<div> {</div>
<div> case SAO_EO_0: // dir: -</div>
<div>@@ -343,7 +345,7 @@</div>
<div> int edgeType = signRight + signLeft + 2;</div>
<div> signLeft = -signRight;</div>
<div> </div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> }</div>
<div> </div>
<div> rec += stride;</div>
<div>@@ -368,7 +370,7 @@</div>
<div> row1LastPxl = rec[stride + ctuWidth - 1];</div>
<div> }</div>
<div> </div>
<div>- primitives.saoCuOrgE0(rec, m_offsetEo, ctuWidth, signLeft1, stride);</div>
<div>+ primitives.saoCuOrgE0(rec, offsetEo, ctuWidth, signLeft1, stride);</div>
<div> </div>
<div> if (!lpelx)</div>
<div> {</div>
<div>@@ -407,7 +409,7 @@</div>
<div> int edgeType = signDown + upBuff1[x] + 2;</div>
<div> upBuff1[x] = -signDown;</div>
<div> </div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> }</div>
<div> </div>
<div> rec += stride;</div>
<div>@@ -420,11 +422,11 @@</div>
<div> int diff = (endY - startY) % 2;</div>
<div> for (y = startY; y < endY - diff; y += 2)</div>
<div> {</div>
<div>- primitives.saoCuOrgE1_2Rows(rec, upBuff1, m_offsetEo, stride, ctuWidth);</div>
<div>+ primitives.saoCuOrgE1_2Rows(rec, upBuff1, offsetEo, stride, ctuWidth);</div>
<div> rec += 2 * stride;</div>
<div> }</div>
<div> if (diff & 1)</div>
<div>- primitives.saoCuOrgE1(rec, upBuff1, m_offsetEo, stride, ctuWidth);</div>
<div>+ primitives.saoCuOrgE1(rec, upBuff1, offsetEo, stride, ctuWidth);</div>
<div> }</div>
<div> </div>
<div> break;</div>
<div>@@ -474,7 +476,7 @@</div>
<div> int8_t signDown = signOf(rec[x] - rec[x + stride + 1]);</div>
<div> int edgeType = signDown + upBuff1[x] + 2;</div>
<div> upBufft[x + 1] = -signDown;</div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> }</div>
<div> </div>
<div> std::swap(upBuff1, upBufft);</div>
<div>@@ -488,7 +490,7 @@</div>
<div> {</div>
<div> int8_t iSignDown2 = signOf(rec[stride + startX] - tmpL[y]);</div>
<div> </div>
<div>- primitives.saoCuOrgE2[endX > 16](rec + startX, upBufft + startX, upBuff1 + startX, m_offsetEo, endX - startX, stride);</div>
<div>+ primitives.saoCuOrgE2[endX > 16](rec + startX, upBufft + startX, upBuff1 + startX, offsetEo, endX - startX, stride);</div>
<div> </div>
<div> upBufft[startX] = iSignDown2;</div>
<div> </div>
<div>@@ -520,14 +522,14 @@</div>
<div> int8_t signDown = signOf(rec[x] - tmpL[y + 1]);</div>
<div> int edgeType = signDown + upBuff1[x] + 2;</div>
<div> upBuff1[x - 1] = -signDown;</div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> </div>
<div> for (x = startX + 1; x < endX; x++)</div>
<div> {</div>
<div> signDown = signOf(rec[x] - rec[x + stride - 1]);</div>
<div> edgeType = signDown + upBuff1[x] + 2;</div>
<div> upBuff1[x - 1] = -signDown;</div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> }</div>
<div> </div>
<div> upBuff1[endX - 1] = signOf(rec[endX - 1 + stride] - rec[endX]);</div>
<div>@@ -557,9 +559,9 @@</div>
<div> int8_t signDown = signOf(rec[x] - tmpL[y + 1]);</div>
<div> int edgeType = signDown + upBuff1[x] + 2;</div>
<div> upBuff1[x - 1] = -signDown;</div>
<div>- rec[x] = m_clipTable[rec[x] + m_offsetEo[edgeType]];</div>
<div>+ rec[x] = m_clipTable[rec[x] + offsetEo[edgeType]];</div>
<div> </div>
<div>- primitives.saoCuOrgE3[endX > 16](rec, upBuff1, m_offsetEo, stride - 1, startX, endX);</div>
<div>+ primitives.saoCuOrgE3[endX > 16](rec, upBuff1, offsetEo, stride - 1, startX, endX);</div>
<div> </div>
<div> upBuff1[endX - 1] = signOf(rec[endX - 1 + stride] - rec[endX]);</div>
<div> </div>
<div>@@ -571,7 +573,7 @@</div>
<div> }</div>
<div> case SAO_BO:</div>
<div> {</div>
<div>- const int8_t* offsetBo = m_offsetBo;</div>
<div>+ const int8_t* offsetBo = m_offsetBo[plane];</div>
<div> </div>
<div> if (ctuWidth & 15)</div>
<div> {</div>
<div>@@ -649,10 +651,10 @@</div>
<div> {</div>
<div> if (typeIdx == SAO_BO)</div>
<div> {</div>
<div>- memset(m_offsetBo, 0, sizeof(m_offsetBo));</div>
<div>+ memset(m_offsetBo[plane], 0, sizeof(m_offsetBo[0]));</div>
<div> </div>
<div> for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>- m_offsetBo[((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div>+ m_offsetBo[plane][((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div> }</div>
<div> else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div> {</div>
<div>@@ -662,7 +664,7 @@</div>
<div> offset[i + 1] = ctuParam[addr].offset[i] << SAO_BIT_INC;</div>
<div> </div>
<div> for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>- m_offsetEo[edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ m_offsetEo[plane][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div> }</div>
<div> }</div>
<div> processSaoCu(addr, typeIdx, plane);</div>
<div>@@ -718,10 +720,10 @@</div>
<div> {</div>
<div> if (typeIdx == SAO_BO)</div>
<div> {</div>
<div>- memset(m_offsetBo, 0, sizeof(m_offsetBo));</div>
<div>+ memset(m_offsetBo[plane], 0, sizeof(m_offsetBo[0]));</div>
<div> </div>
<div> for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>- m_offsetBo[((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div>+ m_offsetBo[plane][((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div> }</div>
<div> else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div> {</div>
<div>@@ -731,7 +733,7 @@</div>
<div> offset[i + 1] = ctuParam[addr].offset[i] << SAO_BIT_INC;</div>
<div> </div>
<div> for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>- m_offsetEo[edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ m_offsetEo[plane][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div> }</div>
<div> }</div>
<div> processSaoCu(addr, typeIdx, plane);</div>
<div>diff -r b1c261378db2 -r a3a9660c91b8 source/encoder/sao.h</div>
<div>--- a/source/encoder/sao.h Wed Dec 02 11:12:54 2015 -0600</div>
<div>+++ b/source/encoder/sao.h Wed Dec 02 11:12:57 2015 -0600</div>
<div>@@ -80,8 +80,8 @@</div>
<div> PerPlane* m_offsetOrgPreDblk;</div>
<div> </div>
<div> double m_depthSaoRate[2][4];</div>
<div>- int8_t m_offsetBo[SAO_NUM_BO_CLASSES];</div>
<div>- int8_t m_offsetEo[NUM_EDGETYPE];</div>
<div>+ int8_t m_offsetBo[NUM_PLANE][SAO_NUM_BO_CLASSES];</div>
<div>+ int8_t m_offsetEo[NUM_PLANE][NUM_EDGETYPE];</div>
<div> </div>
<div> int m_chromaFormat;</div>
<div> int m_numCuInWidth;</div>
<div> </div>
<div> </div>
<div> </div>
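<div>A minimal sketch of how the per-plane offset tables in the patch above are filled, written as standalone code rather than as x265's SAO class. The constant values (32 bands, 4 signalled offsets, 5 edge classes, SAO_BIT_INC of 0) and the contents of s_eoTable are assumptions based on the usual HEVC SAO design, and SaoOffsetTables, setBandOffsets and setEdgeOffsets are hypothetical names.</div>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Names mirror the patch; the values are assumptions, not copied from x265.
static const int SAO_NUM_BO_CLASSES = 32;
static const int SAO_NUM_OFFSET = 4;
static const int NUM_EDGETYPE = 5;
static const int NUM_PLANE = 3;
static const int SAO_BIT_INC = 0;
static const int s_eoTable[NUM_EDGETYPE] = {1, 2, 0, 3, 4}; // assumed remap

// Hypothetical container: one table per plane, as in m_offsetBo / m_offsetEo.
struct SaoOffsetTables
{
    int8_t bo[NUM_PLANE][SAO_NUM_BO_CLASSES];
    int8_t eo[NUM_PLANE][NUM_EDGETYPE];
};

// Band offset: clear this plane's table, then place the four offsets on the
// four consecutive bands starting at bandPos, wrapping at the band count.
void setBandOffsets(SaoOffsetTables& t, int plane, int bandPos,
                    const int offset[SAO_NUM_OFFSET])
{
    assert(plane >= 0 && plane < NUM_PLANE);
    std::memset(t.bo[plane], 0, sizeof(t.bo[0]));
    for (int i = 0; i < SAO_NUM_OFFSET; i++)
        t.bo[plane][(bandPos + i) & (SAO_NUM_BO_CLASSES - 1)] =
            (int8_t)(offset[i] * (1 << SAO_BIT_INC));
}

// Edge offset: prepend a zero entry, then remap through s_eoTable.
void setEdgeOffsets(SaoOffsetTables& t, int plane,
                    const int offset[SAO_NUM_OFFSET])
{
    assert(plane >= 0 && plane < NUM_PLANE);
    int full[NUM_EDGETYPE];
    full[0] = 0;
    for (int i = 0; i < SAO_NUM_OFFSET; i++)
        full[i + 1] = offset[i] * (1 << SAO_BIT_INC);
    for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)
        t.eo[plane][edgeType] = (int8_t)full[s_eoTable[edgeType]];
}
```

<div>Because every call writes only to the row selected by plane, filling the Cb table leaves the luma and Cr tables untouched, which is the property the patch relies on once planes are processed per CTU instead of per row.</div>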
<div>------------------------------</div>
<div> </div>
<div>Message: 4</div>
<div>Date: Wed, 02 Dec 2015 11:28:37 -0600</div>
<div>From: Min Chen &lt;chenm003@163.com&gt;</div>
<div>To: x265-devel@videolan.org</div>
<div>Subject: [x265] [PATCH 14 of 15] sao: reduce address operators by</div>
<div> split into Luma and Chroma path</div>
<div>Message-ID: &lt;a6d88a08af3d48cb804a.1449077317@chen-PC&gt;</div>
<div>Content-Type: text/plain; charset="us-ascii"</div>
<div> </div>
<div># HG changeset patch</div>
<div># User Min Chen &lt;chenm003@163.com&gt;</div>
<div># Date 1449076380 21600</div>
<div># Node ID a6d88a08af3d48cb804aa61819bd45ee685d1f59</div>
<div># Parent a3a9660c91b8eeb8f70869fc4022f939c01023f0</div>
<div>sao: reduce address operators by split into Luma and Chroma path</div>
<div>---</div>
<div> source/encoder/framefilter.cpp | 7 +--</div>
<div> source/encoder/sao.cpp | 133 ++++++++++++++++++++++++++++++++++------</div>
<div> source/encoder/sao.h | 3 +-</div>
<div> 3 files changed, 118 insertions(+), 25 deletions(-)</div>
<div> </div>
<div>diff -r a3a9660c91b8 -r a6d88a08af3d source/encoder/framefilter.cpp</div>
<div>--- a/source/encoder/framefilter.cpp Wed Dec 02 11:12:57 2015 -0600</div>
<div>+++ b/source/encoder/framefilter.cpp Wed Dec 02 11:13:00 2015 -0600</div>
<div>@@ -546,13 +546,10 @@</div>
<div> for(uint32_t col = 0; col < numCols; col++)</div>
<div> {</div>
<div> if (saoParam->bSaoFlag[0])</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[0], row, col, 0);</div>
<div>+ m_parallelFilter[row].m_sao.processSaoUnitCuLuma(saoParam->ctuParam[0], row, col);</div>
<div> </div>
<div> if (saoParam->bSaoFlag[1])</div>
<div>- {</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[1], row, col, 1);</div>
<div>- m_parallelFilter[row].m_sao.processSaoUnitCu(saoParam->ctuParam[2], row, col, 2);</div>
<div>- }</div>
<div>+ m_parallelFilter[row].m_sao.processSaoUnitCuChroma(saoParam->ctuParam, row, col);</div>
<div> }</div>
<div> </div>
<div> if (encData.m_slice->m_pps->bTransquantBypassEnabled)</div>
<div>diff -r a3a9660c91b8 -r a6d88a08af3d source/encoder/sao.cpp</div>
<div>--- a/source/encoder/sao.cpp Wed Dec 02 11:12:57 2015 -0600</div>
<div>+++ b/source/encoder/sao.cpp Wed Dec 02 11:13:00 2015 -0600</div>
<div>@@ -674,29 +674,21 @@</div>
<div> }</div>
<div> </div>
<div> /* Process SAO unit */</div>
<div>-void SAO::processSaoUnitCu(SaoCtuParam* ctuParam, int idxY, int idxX, int plane)</div>
<div>+void SAO::processSaoUnitCuLuma(SaoCtuParam* ctuParam, int idxY, int idxX)</div>
<div> {</div>
<div> PicYuv* reconPic = m_frame->m_reconPic;</div>
<div>- intptr_t stride = plane ? reconPic->m_strideC : reconPic->m_stride;</div>
<div>- uint32_t picWidth = m_param->sourceWidth;</div>
<div>+ intptr_t stride = reconPic->m_stride;</div>
<div> int ctuWidth = g_maxCUSize;</div>
<div> int ctuHeight = g_maxCUSize;</div>
<div> </div>
<div>- if (plane)</div>
<div>- {</div>
<div>- picWidth >>= m_hChromaShift;</div>
<div>- ctuWidth >>= m_hChromaShift;</div>
<div>- ctuHeight >>= m_vChromaShift;</div>
<div>- }</div>
<div>-</div>
<div> int addr = idxY * m_numCuInWidth + idxX;</div>
<div>- pixel* rec = reconPic->getPlaneAddr(plane, addr);</div>
<div>+ pixel* rec = reconPic->getLumaAddr(addr);</div>
<div> </div>
<div> if (idxX == 0)</div>
<div> {</div>
<div> for (int i = 0; i < ctuHeight + 1; i++)</div>
<div> {</div>
<div>- m_tmpL1[plane][i] = rec[0];</div>
<div>+ m_tmpL1[0][i] = rec[0];</div>
<div> rec += stride;</div>
<div> }</div>
<div> }</div>
<div>@@ -706,10 +698,10 @@</div>
<div> </div>
<div> if (idxX != (m_numCuInWidth - 1))</div>
<div> {</div>
<div>- rec = reconPic->getPlaneAddr(plane, addr);</div>
<div>+ rec = reconPic->getLumaAddr(addr);</div>
<div> for (int i = 0; i < ctuHeight + 1; i++)</div>
<div> {</div>
<div>- m_tmpL2[plane][i] = rec[ctuWidth - 1];</div>
<div>+ m_tmpL2[0][i] = rec[ctuWidth - 1];</div>
<div> rec += stride;</div>
<div> }</div>
<div> }</div>
<div>@@ -720,10 +712,10 @@</div>
<div> {</div>
<div> if (typeIdx == SAO_BO)</div>
<div> {</div>
<div>- memset(m_offsetBo[plane], 0, sizeof(m_offsetBo[0]));</div>
<div>+ memset(m_offsetBo[0], 0, sizeof(m_offsetBo[0]));</div>
<div> </div>
<div> for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>- m_offsetBo[plane][((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div>+ m_offsetBo[0][((ctuParam[addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[addr].offset[i] << SAO_BIT_INC);</div>
<div> }</div>
<div> else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div> {</div>
<div>@@ -733,12 +725,115 @@</div>
<div> offset[i + 1] = ctuParam[addr].offset[i] << SAO_BIT_INC;</div>
<div> </div>
<div> for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>- m_offsetEo[plane][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ m_offsetEo[0][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div> }</div>
<div> }</div>
<div>- processSaoCu(addr, typeIdx, plane);</div>
<div>+ processSaoCu(addr, typeIdx, 0);</div>
<div> }</div>
<div>- std::swap(m_tmpL1[plane], m_tmpL2[plane]);</div>
<div>+ std::swap(m_tmpL1[0], m_tmpL2[0]);</div>
<div>+}</div>
<div>+</div>
<div>+/* Process SAO unit (Chroma only) */</div>
<div>+void SAO::processSaoUnitCuChroma(SaoCtuParam* ctuParam[3], int idxY, int idxX)</div>
<div>+{</div>
<div>+ PicYuv* reconPic = m_frame->m_reconPic;</div>
<div>+ intptr_t stride = reconPic->m_strideC;</div>
<div>+ int ctuWidth = g_maxCUSize;</div>
<div>+ int ctuHeight = g_maxCUSize;</div>
<div>+</div>
<div>+ {</div>
<div>+ ctuWidth >>= m_hChromaShift;</div>
<div>+ ctuHeight >>= m_vChromaShift;</div>
<div>+ }</div>
<div>+</div>
<div>+ int addr = idxY * m_numCuInWidth + idxX;</div>
<div>+ pixel* recCb = reconPic->getCbAddr(addr);</div>
<div>+ pixel* recCr = reconPic->getCrAddr(addr);</div>
<div>+</div>
<div>+ if (idxX == 0)</div>
<div>+ {</div>
<div>+ for (int i = 0; i < ctuHeight + 1; i++)</div>
<div>+ {</div>
<div>+ m_tmpL1[1][i] = recCb[0];</div>
<div>+ m_tmpL1[2][i] = recCr[0];</div>
<div>+ recCb += stride;</div>
<div>+ recCr += stride;</div>
<div>+ }</div>
<div>+ }</div>
<div>+</div>
<div>+ bool mergeLeftFlagCb = (ctuParam[1][addr].mergeMode == SAO_MERGE_LEFT);</div>
<div>+ int typeIdxCb = ctuParam[1][addr].typeIdx;</div>
<div>+</div>
<div>+ bool mergeLeftFlagCr = (ctuParam[2][addr].mergeMode == SAO_MERGE_LEFT);</div>
<div>+ int typeIdxCr = ctuParam[2][addr].typeIdx;</div>
<div>+</div>
<div>+ if (idxX != (m_numCuInWidth - 1))</div>
<div>+ {</div>
<div>+ recCb = reconPic->getCbAddr(addr);</div>
<div>+ recCr = reconPic->getCrAddr(addr);</div>
<div>+ for (int i = 0; i < ctuHeight + 1; i++)</div>
<div>+ {</div>
<div>+ m_tmpL2[1][i] = recCb[ctuWidth - 1];</div>
<div>+ m_tmpL2[2][i] = recCr[ctuWidth - 1];</div>
<div>+ recCb += stride;</div>
<div>+ recCr += stride;</div>
<div>+ }</div>
<div>+ }</div>
<div>+</div>
<div>+ // Process U</div>
<div>+ if (typeIdxCb >= 0)</div>
<div>+ {</div>
<div>+ if (!mergeLeftFlagCb)</div>
<div>+ {</div>
<div>+ if (typeIdxCb == SAO_BO)</div>
<div>+ {</div>
<div>+ memset(m_offsetBo[1], 0, sizeof(m_offsetBo[0]));</div>
<div>+</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ m_offsetBo[1][((ctuParam[1][addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[1][addr].offset[i] << SAO_BIT_INC);</div>
<div>+ }</div>
<div>+ else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div>+ {</div>
<div>+ int offset[NUM_EDGETYPE];</div>
<div>+ offset[0] = 0;</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ offset[i + 1] = ctuParam[1][addr].offset[i] << SAO_BIT_INC;</div>
<div>+</div>
<div>+ for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>+ m_offsetEo[1][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ }</div>
<div>+ }</div>
<div>+ processSaoCu(addr, typeIdxCb, 1);</div>
<div>+ }</div>
<div>+</div>
<div>+ // Process V</div>
<div>+ if (typeIdxCr >= 0)</div>
<div>+ {</div>
<div>+ if (!mergeLeftFlagCr)</div>
<div>+ {</div>
<div>+ if (typeIdxCr == SAO_BO)</div>
<div>+ {</div>
<div>+ memset(m_offsetBo[2], 0, sizeof(m_offsetBo[0]));</div>
<div>+</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ m_offsetBo[2][((ctuParam[2][addr].bandPos + i) & (SAO_NUM_BO_CLASSES - 1))] = (int8_t)(ctuParam[2][addr].offset[i] << SAO_BIT_INC);</div>
<div>+ }</div>
<div>+ else // if (typeIdx == SAO_EO_0 || typeIdx == SAO_EO_1 || typeIdx == SAO_EO_2 || typeIdx == SAO_EO_3)</div>
<div>+ {</div>
<div>+ int offset[NUM_EDGETYPE];</div>
<div>+ offset[0] = 0;</div>
<div>+ for (int i = 0; i < SAO_NUM_OFFSET; i++)</div>
<div>+ offset[i + 1] = ctuParam[2][addr].offset[i] << SAO_BIT_INC;</div>
<div>+</div>
<div>+ for (int edgeType = 0; edgeType < NUM_EDGETYPE; edgeType++)</div>
<div>+ m_offsetEo[2][edgeType] = (int8_t)offset[s_eoTable[edgeType]];</div>
<div>+ }</div>
<div>+ }</div>
<div>+ processSaoCu(addr, typeIdxCr, 2);</div>
<div>+ }</div>
<div>+</div>
<div>+ std::swap(m_tmpL1[1], m_tmpL2[1]);</div>
<div>+ std::swap(m_tmpL1[2], m_tmpL2[2]);</div>
<div> }</div>
<div> </div>
<div> void SAO::copySaoUnit(SaoCtuParam* saoUnitDst, const SaoCtuParam* saoUnitSrc)</div>
<div>diff -r a3a9660c91b8 -r a6d88a08af3d source/encoder/sao.h</div>
<div>--- a/source/encoder/sao.h Wed Dec 02 11:12:57 2015 -0600</div>
<div>+++ b/source/encoder/sao.h Wed Dec 02 11:13:00 2015 -0600</div>
<div>@@ -132,7 +132,8 @@</div>
<div> // CTU-based SAO process without slice granularity</div>
<div> void processSaoCu(int addr, int typeIdx, int plane);</div>
<div> void processSaoUnitRow(SaoCtuParam* ctuParam, int idxY, int plane);</div>
<div>- void processSaoUnitCu(SaoCtuParam* ctuParam, int idxY, int idxX, int plane);</div>
<div>+ void processSaoUnitCuLuma(SaoCtuParam* ctuParam, int idxY, int idxX);</div>
<div>+ void processSaoUnitCuChroma(SaoCtuParam* ctuParam[3], int idxY, int idxX);</div>
<div> </div>
<div> void copySaoUnit(SaoCtuParam* saoUnitDst, const SaoCtuParam* saoUnitSrc);</div>
<div> </div>
<div> </div>
<div> </div>
<div> </div>
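<div>In the combined chroma path above, Cb and Cr each carry their own SAO type index, so the value passed to the filter for plane 2 should come from the Cr parameters. A minimal sketch of that dispatch follows, under stated assumptions: CtuSaoParam is a reduced hypothetical stand-in for SaoCtuParam, and chromaFilterCalls is a hypothetical helper that only records which (plane, typeIdx) pairs would be filtered instead of calling processSaoCu.</div>

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, reduced stand-in for SaoCtuParam: a negative typeIdx means
// SAO is disabled for that CTU in that plane.
struct CtuSaoParam
{
    int typeIdx;
};

// Returns the (plane, typeIdx) pairs a combined chroma pass would filter for
// the CTU at `addr`. ctuParam[0] is luma and is ignored here.
std::vector<std::pair<int, int> > chromaFilterCalls(
    const CtuSaoParam* const ctuParam[3], int addr)
{
    assert(addr >= 0);
    std::vector<std::pair<int, int> > calls;
    for (int plane = 1; plane <= 2; plane++)
    {
        // Read the type index from the plane being filtered, never from the other one.
        const int typeIdx = ctuParam[plane][addr].typeIdx;
        if (typeIdx >= 0)
            calls.push_back(std::make_pair(plane, typeIdx));
    }
    return calls;
}
```

<div>Looping over the plane index, rather than duplicating the block for U and V, makes it harder to pass one plane's type index to the other by mistake.</div>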
</div></blockquote>
</body></html>