[x265] [arm64] port costCoeffNxN

chen chenm003 at 163.com
Thu Mar 3 04:20:35 UTC 2022


Hi Sebastian,


Thank you for your contribution; the code looks good.


Just a few comments for future performance improvements:
"fmov w12, s2" is expensive because it moves data between the Neon and integer register files, especially since it is inside the loop.
There are also some deep-seated data organization and algorithm problems. For example, we spend many instructions on absCoeff[numNonZero]; if we allowed spare zeros inside the array, we could eliminate many of those instructions.
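
To illustrate the second point, here is a minimal C sketch. It is not the x265 code: the helper names (collect_dense, collect_sparse, sum_abs) and the simplified loops are assumptions made only for illustration. The dense variant compacts values with a data-dependent index, as absCoeff[numNonZero] does, while the sparse variant keeps every value at its scan position and leaves zeros in place.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: dense layout. Each store index depends on how many
 * non-zero coefficients came before it, which forces a per-element branch
 * (or a lane extract plus scalar store in a SIMD version). */
static size_t collect_dense(const int16_t *coeff, size_t n, uint16_t *absCoeff)
{
    size_t numNonZero = 0;
    for (size_t i = 0; i < n; i++) {
        if (coeff[i] != 0)
            absCoeff[numNonZero++] = (uint16_t)abs(coeff[i]);
    }
    return numNonZero;
}

/* Hypothetical helper: sparse layout. Zeros stay in the array, so the store
 * index is simply i and the loop has no data-dependent control flow; a
 * compiler or hand-written Neon code can process whole vectors at once. */
static void collect_sparse(const int16_t *coeff, size_t n, uint16_t *absCoeff)
{
    for (size_t i = 0; i < n; i++)
        absCoeff[i] = (uint16_t)abs(coeff[i]);
}

/* Zeros contribute nothing to a sum, so a consumer of this kind gives the
 * same result for both layouts. */
static uint32_t sum_abs(const uint16_t *absCoeff, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += absCoeff[i];
    return sum;
}
```

The trade-off is that consumers of the sparse array must tolerate zero entries, so whether this pays off in costCoeffNxN depends on how the later stages use absCoeff.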


Regards,
Min Chen




At 2022-03-02 07:28:15, "Pop, Sebastian" <spop at amazon.com> wrote:

Hi,

The attached patch fixes the registration of the costCoeffNxN function hook and removes the early return that I used for testing.

Sebastian