[x265] [arm64] port costCoeffNxN
chen
chenm003 at 163.com
Thu Mar 3 04:20:35 UTC 2022
Hi Sebastian,
Thank you for your contribution, the code looks good.
Just one small comment for future performance improvement:
"fmov w12, s2" is expensive because it moves data between the Neon and integer register files, especially since it is inside the loop.
There are also some deep-seated data organization and algorithm problems. For example, we spend many instructions on absCoeff[numNonZero]; if we allowed spare zeros inside the array, we could eliminate many of those instructions.
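To illustrate the point about absCoeff[numNonZero], here is a minimal C++ sketch of the two layouts. It is not the x265 code: gatherCompact and gatherSparse are hypothetical names, and the loops are simplified assumptions about how such a gather could look.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not an x265 function): compacted layout.
// Each store goes to absCoeff[numNonZero], so the write index depends on
// the previous iteration. That is hard to vectorize and costs extra
// instructions to keep the index up to date.
static int gatherCompact(const int16_t* coeff, size_t n, uint16_t* absCoeff)
{
    int numNonZero = 0;
    for (size_t i = 0; i < n; i++)
    {
        absCoeff[numNonZero] = (uint16_t)std::abs(coeff[i]);
        numNonZero += (coeff[i] != 0); // advance only past non-zero values
    }
    return numNonZero;
}

// Hypothetical helper (not an x265 function): sparse layout.
// Every coefficient keeps its own slot and zeros stay in the array, so the
// stores are independent and map directly onto whole-vector operations.
static int gatherSparse(const int16_t* coeff, size_t n, uint16_t* absCoeff)
{
    int numNonZero = 0;
    for (size_t i = 0; i < n; i++)
    {
        absCoeff[i] = (uint16_t)std::abs(coeff[i]);
        numNonZero += (coeff[i] != 0); // the count is still available
    }
    return numNonZero;
}
```

With the sparse layout, whoever reads absCoeff has to skip or tolerate the zero entries, so whether this is a net win depends on the rest of the algorithm.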
Regards,
Min Chen
At 2022-03-02 07:28:15, "Pop, Sebastian" <spop at amazon.com> wrote:
Hi,
the attached patch fixes the registration of costCoeffNxN function hook and removes the early return that I used for testing.
Sebastian