<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><div><br><br><br><br><br></div><div></div><div id="divNeteaseMailCard"></div><div><br></div>At 2015-11-20 15:33:19,"Ashok Kumar Mishra" <ashok@multicorewareinc.com> wrote:<br> <blockquote id="isReplyContent" style="margin: 0px 0px 0px 0.8ex; padding-left: 1ex; border-left-color: rgb(204, 204, 204); border-left-width: 1px; border-left-style: solid;"><div dir="ltr">Yes, this idea is from HM, but our SAO module is HM code only, except for some asm code.<div>Here the bottom and right line numbers to be skipped are initialized once in the SAO::create() function before the start of encoding.</div><div>So, when calculating SAO statistics for every CTU, we just read those values from an array. </div><div><br></div><div>But in our existing SAO code, which is the old HM version, we repeatedly calculate the bottom and right line numbers to be skipped</div><div>for every CTU as well as for every plane. We also check the condition below for every CTU, for all components and for all EO and BO classes.</div><div><br></div><div><div> if (m_param->bSaoNonDeblocked)</div><div> {</div><div> skipB = 3;</div><div> skipR = 4;</div><div> }</div><div><br></div><div>In C code:</div><div> int skipB = 4;<br> int skipR = 5;</div><div><br></div><div> if (m_param->bSaoNonDeblocked)<br> {<br> skipB = 3;<br> skipR = 4;<br> }</div><div><br></div><div>In pseudo asm code:</div><div>mov R0, 4</div><div>mov R1, 5</div><div>cmp bSaoNonDeblocked</div><div>cmove R0, 3</div><div>cmove R1, 4</div><div><br></div><div>With the in-memory table, we need more instructions to calculate the 64-bit address. Also, as asm developers, we previously knew the data range was [3,5], but now we must assume the range [0,4G] since the value is read from memory.</div><div><br></div><div><br></div></div><div>I believe all of this can be avoided by initializing those values only once, before the start of encoding.
So when we are calculating SAO</div><div>statistics for every CTU, we can just read those values from that array. <br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Nov 20, 2015 at 12:33 AM, chen <span dir="ltr"><<a href="mailto:chenm003@163.com" target="_blank">chenm003@163.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; padding-left: 1ex; border-left-color: rgb(204, 204, 204); border-left-width: 1px; border-left-style: solid;"><div style="color: rgb(0, 0, 0); line-height: 1.7; font-family: arial; font-size: 14px;"><div style="color: rgb(0, 0, 0); line-height: 1.7; font-family: arial; font-size: 14px;"><div><br></div><pre><div><div class="h5"><br>At 2015-11-19 18:03:09,<a href="mailto:ashok@multicorewareinc.com" target="_blank">ashok@multicorewareinc.com</a> wrote:
># HG changeset patch
># User Ashok Kumar Mishra<<a href="mailto:ashok@multicorewareinc.com" target="_blank">ashok@multicorewareinc.com</a>>
># Date 1446119115 -19800
># Thu Oct 29 17:15:15 2015 +0530
># Node ID 4a273947c8d54b4de3c05e0e04c9c915f554e6e5
># Parent f722fb55404bb80b26a55ba0a0a1b98d8f20b362
>SAO: initialize bottom and right line numbers to be skipped for SAO statistics calculation only once
>
>diff -r f722fb55404b -r 4a273947c8d5 source/encoder/frameencoder.cpp
>--- a/source/encoder/frameencoder.cpp Wed Nov 18 12:28:03 2015 +0530
>+++ b/source/encoder/frameencoder.cpp Thu Oct 29 17:15:15 2015 +0530
>@@ -1091,7 +1091,7 @@
>
> /* SAO parameter estimation using non-deblocked pixels for CTU bottom and right boundary areas */
> if (m_param->bEnableSAO && m_param->bSaoNonDeblocked)
>- m_frameFilter.m_sao.calcSaoStatsCu_BeforeDblk(m_frame, col, row);
>+ m_frameFilter.m_sao.calcPreDeblockSaoStatsCu(m_frame, col, row);
>
> if (m_param->bEnableWavefront && curRow.completed >= 2 && row < m_numRows - 1 &&
> (!m_bAllRowsStop || intRow + 1 < m_vbvResetTriggerRow))
>diff -r f722fb55404b -r 4a273947c8d5 source/encoder/sao.cpp
>--- a/source/encoder/sao.cpp Wed Nov 18 12:28:03 2015 +0530
>+++ b/source/encoder/sao.cpp Thu Oct 29 17:15:15 2015 +0530
>@@ -138,6 +138,68 @@
> CHECKED_MALLOC(m_countPreDblk, PerPlane, numCtu);
> CHECKED_MALLOC(m_offsetOrgPreDblk, PerPlane, numCtu);
>
>+ for (int typeIdc = 0; typeIdc < MAX_NUM_SAO_TYPE; typeIdc++)
>+ {
>+ m_skipLinesR[TEXT_LUMA][typeIdc] = 5;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 3;
>+
>+ m_skipLinesB[TEXT_LUMA ][typeIdc] = 4;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 2;
>+
>+ if (!m_param->bSaoNonDeblocked)
>+ {
>+ for (int typeIdc = 0; typeIdc < MAX_NUM_SAO_TYPE; typeIdc++)
>+ {
>+ m_skipLinesR[TEXT_LUMA][typeIdc] = 5;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 3;
>+
>+ m_skipLinesB[TEXT_LUMA ][typeIdc] = 4;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 2;
>+ }
>+ }
>+ else
>+ {
>+ for (int typeIdc = 0; typeIdc < MAX_NUM_SAO_TYPE; typeIdc++)
>+ {
>+ switch (typeIdc)
>+ {
>+ case SAO_EO_0:
>+ m_skipLinesR[TEXT_LUMA ][typeIdc] = 5;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 3;
>+
>+ m_skipLinesB[TEXT_LUMA ][typeIdc] = 3;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 1;
>+ break;
>+ case SAO_EO_1:
>+ m_skipLinesR[TEXT_LUMA][typeIdc] = 4;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 2;
>+
>+ m_skipLinesB[TEXT_LUMA][typeIdc] = 4;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 2;
>+ break;
>+ case SAO_EO_2:
>+ case SAO_EO_3:
>+ m_skipLinesR[TEXT_LUMA][typeIdc] = 5;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 3;
>+
>+ m_skipLinesB[TEXT_LUMA][typeIdc] = 4;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 2;
>+ break;
>+ case SAO_BO:
>+ m_skipLinesR[TEXT_LUMA][typeIdc] = 4;
>+ m_skipLinesR[TEXT_CHROMA_U][typeIdc] = m_skipLinesR[TEXT_CHROMA_V][typeIdc] = 2;
>+
>+ m_skipLinesB[TEXT_LUMA][typeIdc] = 3;
>+ m_skipLinesB[TEXT_CHROMA_U][typeIdc] = m_skipLinesB[TEXT_CHROMA_V][typeIdc] = 1;
>+ break;
>+ default:
>+ X265_CHECK(0, "Not a supported </div></div>type");
>+ break;
>+ }
>+ }
>+ }
>+ }
>+
This idea comes from HM. It is clearer, but it adds memory operations and makes the code harder for the compiler to optimize, e.g. a MOV+CMP+CMOV sequence is replaced by a series of address calculations and memory loads.</pre><pre><br></pre></div></div><br>_______________________________________________<br>
x265-devel mailing list<br>
<a href="mailto:x265-devel@videolan.org">x265-devel@videolan.org</a><br>
<a href="https://mailman.videolan.org/listinfo/x265-devel" target="_blank" rel="noreferrer">https://mailman.videolan.org/listinfo/x265-devel</a><br>
<br></blockquote></div><br></div>
</blockquote></div>
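<div>For reference, the two approaches discussed in this thread can be put side by side in a minimal C++ sketch. This is not the x265 or HM code: the enum definitions, <code>SkipTable</code>, <code>buildSkipTable</code> and <code>skipLinesPerCall</code> are hypothetical names chosen for illustration. Only the numeric values are taken from the quoted patch, where every chroma value equals the luma value minus 2.</div>

```cpp
#include <cassert>

// Hypothetical stand-ins for the encoder's enums; names and ordering are assumptions.
enum SaoType { SAO_EO_0, SAO_EO_1, SAO_EO_2, SAO_EO_3, SAO_BO, MAX_NUM_SAO_TYPE };
enum Plane { TEXT_LUMA, TEXT_CHROMA_U, TEXT_CHROMA_V, NUM_PLANES };

// Table approach: right/bottom skip counts per plane and per SAO type.
struct SkipTable
{
    int right[NUM_PLANES][MAX_NUM_SAO_TYPE];
    int bottom[NUM_PLANES][MAX_NUM_SAO_TYPE];
};

// Build the table once, before encoding starts. Values mirror the quoted patch.
SkipTable buildSkipTable(bool saoNonDeblocked)
{
    SkipTable t = {};
    for (int type = 0; type < MAX_NUM_SAO_TYPE; type++)
    {
        int r = 5, b = 4; // luma defaults (deblocked case, and EO_2 / EO_3)
        if (saoNonDeblocked)
        {
            switch (type)
            {
            case SAO_EO_0: b = 3; break;
            case SAO_EO_1: r = 4; break;
            case SAO_BO:   r = 4; b = 3; break;
            default: break; // SAO_EO_2 and SAO_EO_3 keep the defaults
            }
        }
        t.right[TEXT_LUMA][type] = r;
        t.bottom[TEXT_LUMA][type] = b;
        // In the patch, each chroma value is the luma value minus 2.
        for (int p = TEXT_CHROMA_U; p <= TEXT_CHROMA_V; p++)
        {
            t.right[p][type] = r - 2;
            t.bottom[p][type] = b - 2;
        }
    }
    return t;
}

// Per-call approach from the reply: two immediates plus one flag test,
// which a compiler can lower to MOV + CMP + CMOV without touching memory.
inline void skipLinesPerCall(bool saoNonDeblocked, int& skipR, int& skipB)
{
    skipB = 4;
    skipR = 5;
    if (saoNonDeblocked)
    {
        skipB = 3;
        skipR = 4;
    }
}
```

<div>The table costs one load per lookup and hides the value range from the compiler, while the per-call form keeps the constants visible but repeats the flag test for every CTU. Which of the two is faster in practice would have to be measured.</div>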