[x265] [arm64] port scale1D_128to64 and scale2D_64to32
chen
chenm003 at 163.com
Sat Jul 31 00:07:37 UTC 2021
Hi,
The code looks good.
There is a slight performance change because of a pipeline stall: two LD1 instructions can't hide the latency penalty. It is not a big problem, though, since we saved code size.
Could you please make a standalone patch? I guess a patch on top of a patch is not a good idea.
Regards,
Min Chen
At 2021-07-31 02:27:36, "Pop, Sebastian" <spop at amazon.com> wrote:
A small change to save a few bytes in code size.
I replaced the four 2-register LD1 loads with two 4-register LD1 loads.
No performance change.
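
For illustration only, a change of this kind could look like the sketch below. The register numbers, the .16b arrangement, and the post-indexed x1 source pointer are assumptions made for the example; they are not taken from the patch itself.

```
// Before: four 2-register loads (4 instructions, 16 bytes of code)
ld1     {v0.16b, v1.16b}, [x1], #32
ld1     {v2.16b, v3.16b}, [x1], #32
ld1     {v4.16b, v5.16b}, [x1], #32
ld1     {v6.16b, v7.16b}, [x1], #32

// After: two 4-register loads (2 instructions, 8 bytes of code)
ld1     {v0.16b, v1.16b, v2.16b, v3.16b}, [x1], #64
ld1     {v4.16b, v5.16b, v6.16b, v7.16b}, [x1], #64
```

Both sequences load the same 128 consecutive bytes into v0-v7 and advance x1 by 128. Since every A64 instruction is 4 bytes, the second form is 8 bytes shorter at each place it is used.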