[x265] Some more Arm64 patches to bring performance up on Graviton processors
chen
chenm003 at 163.com
Fri Mar 25 07:20:48 UTC 2022
Hello,
A few comments:
+function PFX(cpy2Dto1D_shl_64x64_neon)
+ cpy2Dto1D_shl_start
+ mov w12, #32
+.loop_cpy2Dto1D_shl_64:
+ sub w12, w12, #1
+.rept 2
+ ldp q2, q3, [x1]
+ ldp q4, q5, [x1, #32]
[MC] Why not use LD1 here? The same question applies to STP (ST1 instead).
-#if X86_64
+#if X86_64 || defined(__aarch64__)
[MC] This is correct, but to be more generic we could check sizeof(long*) == 8 instead.
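A rough sketch of what such a generic check could look like, not taken from the patch. Since sizeof cannot be evaluated inside #if, this version assumes UINTPTR_MAX from <stdint.h> as the preprocessor-visible stand-in and cross-checks it against sizeof(long*) at compile time (C11). The names ARCH_64BIT_POINTERS and has_64bit_pointers are hypothetical and do not exist in x265.

```c
#include <stdint.h>

/* Hypothetical macro, not from x265: 1 when pointers are 64 bits wide.
 * sizeof() is not usable in #if, so the test relies on UINTPTR_MAX. */
#if UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#define ARCH_64BIT_POINTERS 1
#else
#define ARCH_64BIT_POINTERS 0
#endif

/* Compile-time cross-check that the macro agrees with sizeof(long*) == 8.
 * Assumes uintptr_t and long* have the same width on the target. */
_Static_assert(ARCH_64BIT_POINTERS == (sizeof(long *) == 8),
               "pointer-width macro disagrees with sizeof(long*)");

/* Hypothetical helper so the result can also be queried at run time. */
int has_64bit_pointers(void)
{
    return ARCH_64BIT_POINTERS;
}
```

With something like this, the guard in the diff could read "#if ARCH_64BIT_POINTERS" rather than listing each 64-bit architecture, at the cost of no longer distinguishing between them.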
The rest is fine.
Regards,
Min Chen
At 2022-03-25 00:24:01, "Pop, Sebastian" <spop at amazon.com> wrote:
Hi,
Please find attached a few more changes that bring up the performance of x265 on Arm64 processors.
Patches tested on Graviton2 aarch64-linux.
Ok to commit?
Thanks,
Sebastian