[x265] [arm64] port scale1D_128to64 and scale2D_64to32

chen chenm003 at 163.com
Sat Jul 31 00:07:37 UTC 2021


Hi, 


The code looks good.
There is a small performance change because of a pipeline stall: the two LD1 loads can't hide the latency penalty. It is not a big problem, though, and we save code size.
Could you please make a standalone patch? I don't think a patch on top of a patch is a good idea.


Regards,
Min Chen

At 2021-07-31 02:27:36, "Pop, Sebastian" <spop at amazon.com> wrote:

A small change to save a few bytes in code size.

I replaced the four 2-register LD1 loads with two 4-register LD1 loads.

No performance change.
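
For context, here is a minimal scalar sketch of what scale1D_128to64 is assumed to compute: each output pixel is the rounded average of two horizontally adjacent input pixels, over two rows of 128 pixels. The function name scale1D_128to64_ref, the 8-bit pixel type, and the two-row layout are assumptions for illustration and are not taken from the patch. The LD1 change only affects how the 128 source bytes are loaded (4 loads x 2 registers x 16 bytes, or 2 loads x 4 registers x 16 bytes), not the result.

```c
#include <stdint.h>

/* Assumed geometry: two rows, each scaled from 128 to 64 pixels. */
enum { SRC_WIDTH = 128, DST_WIDTH = 64, ROWS = 2 };

/* Hypothetical scalar reference, not the x265 source.
 * Each output pixel is the rounded average of an adjacent input pair. */
static void scale1D_128to64_ref(uint8_t *dst, const uint8_t *src)
{
    for (int row = 0; row < ROWS; row++) {
        const uint8_t *s = src + row * SRC_WIDTH;
        uint8_t *d = dst + row * DST_WIDTH;
        for (int x = 0; x < DST_WIDTH; x++) {
            /* +1 before the shift rounds to nearest instead of truncating. */
            d[x] = (uint8_t)((s[2 * x] + s[2 * x + 1] + 1) >> 1);
        }
    }
}
```

A NEON version would be expected to produce the same bytes whichever LD1 form it uses, so a reference like this can serve as a check when comparing the two variants.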
