<html><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:12px"><div id="yui_3_16_0_1_1418336675773_2993"><span></span></div><div dir="ltr" id="yui_3_16_0_1_1418336675773_3149">One question: how does sa8d compare to satd in terms of functionality, and why is it called sa8d?</div><div dir="ltr" id="yui_3_16_0_1_1418336675773_3150"><br></div><div dir="ltr" id="yui_3_16_0_1_1418336675773_3152">Thanks</div><div dir="ltr" id="yui_3_16_0_1_1418336675773_3153">--Chekib</div>  <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 12px;" id="yui_3_16_0_1_1418336675773_2997"> <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 16px;" id="yui_3_16_0_1_1418336675773_2996"> <div dir="ltr" id="yui_3_16_0_1_1418336675773_2995"> <hr size="1" id="yui_3_16_0_1_1418336675773_2994">  <font size="2" face="Arial" id="yui_3_16_0_1_1418336675773_2998"> <b><span style="font-weight:bold;">From:</span></b> Steve Borho <steve@borho.org><br> <b><span style="font-weight: bold;">To:</span></b> Development for x265 <x265-devel@videolan.org> <br> <b><span style="font-weight: bold;">Sent:</span></b> Thursday, December 11, 2014 8:40 AM<br> <b id="yui_3_16_0_1_1418336675773_3169"><span style="font-weight: bold;" id="yui_3_16_0_1_1418336675773_3168">Subject:</span></b> Re: [x265] [PATCH] integrate assembly code for psyCost_pp<br> </font> </div> <div class="y_msg_container" id="yui_3_16_0_1_1418336675773_3170"><br>On 12/11, Divya Manivannan wrote:<br clear="none">> # HG changeset patch<br clear="none">> # User Divya Manivannan <<a shape="rect" ymailto="mailto:divya@multicorewareinc.com" href="mailto:divya@multicorewareinc.com">divya@multicorewareinc.com</a>><br clear="none">> # Date 1418296477 -19800<br clear="none">> #      Thu Dec 11 16:44:37 2014 +0530<br clear="none">> # Node ID 
440d264fcdf33889b665848f19e87ca3559d1b6c<br clear="none">> # Parent  667e4ea0899fcf026ee9df935381487d3148ed0c<br clear="none">> integrate assembly code for psyCost_pp<br clear="none">> <br clear="none">> diff -r 667e4ea0899f -r 440d264fcdf3 source/common/pixel.cpp<br clear="none">> --- a/source/common/pixel.cpp    Thu Dec 11 09:36:16 2014 +0530<br clear="none">> +++ b/source/common/pixel.cpp    Thu Dec 11 16:44:37 2014 +0530<br clear="none">> @@ -815,10 +815,11 @@<br clear="none">>              for (int j = 0; j < dim; j+= 8)<br clear="none">>              {<br clear="none">>                  /* AC energy, measured by sa8d (AC + DC) minus SAD (DC) */<br clear="none">> -                int sourceEnergy = sa8d_8x8(source + i * sstride + j, sstride, zeroBuf, 0) - <br clear="none">> -                                   (sad<8, 8>(source + i * sstride + j, sstride, zeroBuf, 0) >> 2);<br clear="none">> -                int reconEnergy =  sa8d_8x8(recon + i * rstride + j, rstride, zeroBuf, 0) - <br clear="none">> -                                   (sad<8, 8>(recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);<br clear="none">> +                // PartitionFromSizes(8, 8) = 1<br clear="none">> +                int sourceEnergy = primitives.sa8d[1](source + i * sstride + j, sstride, zeroBuf, 0) -<br clear="none">> +                                   (primitives.sad[1](source + i * sstride + j, sstride, zeroBuf, 0) >> 2);<br clear="none">> +                int reconEnergy = primitives.sa8d[1](recon + i * rstride + j, rstride, zeroBuf, 0) -<br clear="none">> +                                  (primitives.sad[1](recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);<br clear="none"><br clear="none">This is an improvement over just C code, but it is still vastly slower<br clear="none">than writing new assembly functions for these. 
The function call<br clear="none">overhead is non-trivial.<br clear="none"><br clear="none">>  <br clear="none">>                  totEnergy += abs(sourceEnergy - reconEnergy);<br clear="none">>              }<br clear="none">> @@ -828,8 +829,11 @@<br clear="none">>      else<br clear="none">>      {<br clear="none">>          /* 4x4 is too small for sa8d */<br clear="none">> -        int sourceEnergy = satd_4x4(source, sstride, zeroBuf, 0) - (sad<4, 4>(source, sstride, zeroBuf, 0) >> 2);<br clear="none">> -        int reconEnergy = satd_4x4(recon, rstride, zeroBuf, 0) - (sad<4, 4>(recon, rstride, zeroBuf, 0) >> 2);<br clear="none">> +        // partitionFromSizes(4, 4) = 0<br clear="none">> +        int sourceEnergy = primitives.satd[0](source, sstride, zeroBuf, 0) -<br clear="none">> +                           (primitives.sad[0](source, sstride, zeroBuf, 0) >> 2);<br clear="none">> +        int reconEnergy = primitives.satd[0](recon, rstride, zeroBuf, 0) -<br clear="none">> +                          (primitives.sad[0](recon, rstride, zeroBuf, 0) >> 2);<br clear="none">>          return abs(sourceEnergy - reconEnergy);<br clear="none">>      }<br clear="none">>  }<br clear="none">> _______________________________________________<br clear="none">> x265-devel mailing list<br clear="none">> <a shape="rect" ymailto="mailto:x265-devel@videolan.org" href="mailto:x265-devel@videolan.org">x265-devel@videolan.org</a><br clear="none">> <a shape="rect" href="https://mailman.videolan.org/listinfo/x265-devel" target="_blank">https://mailman.videolan.org/listinfo/x265-devel</a><br clear="none"><br clear="none">-- <br clear="none">Steve Borho<div class="qtdSeparateBR"><br><br></div><div class="yqt5689238866" id="yqtfd40513"></div><br><br></div> </div> </div>  </div></body></html>