[x265] [arm64] port sad

Pop, Sebastian spop at amazon.com
Tue Jul 20 04:45:03 UTC 2021


Thanks, Min Chen, for your reviews.
I tried your suggestion to remove one of the FP->GPR transfers.
With the following patch I do not see any improvement for the 64x routines, and the number of instructions remains the same (five removed, five added):

--- a/source/common/aarch64/sad-a.S
+++ b/source/common/aarch64/sad-a.S
@@ -137,14 +137,14 @@
     add             v16.8h, v16.8h, v17.8h
     add             v17.8h, v18.8h, v19.8h
     add             v16.8h, v16.8h, v17.8h
-    uaddlv          s0,  v16.8h
-    fmov            w0,  s0
+    uaddlp          v16.4s, v16.8h
     add             v18.8h, v20.8h, v21.8h
     add             v19.8h, v22.8h, v23.8h
     add             v17.8h, v18.8h, v19.8h
-    uaddlv          s1,  v17.8h
-    fmov            w1,  s1
-    add             w0, w0, w1
+    uaddlp          v17.4s, v17.8h
+    add             v16.4s, v16.4s, v17.4s
+    uaddlv          d0, v16.4s
+    fmov            x0, d0
     ret
.endm
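
To make the comparison concrete, here is a rough scalar C model of the two reduction tails in the diff above. It is only a sketch: the helper names (uaddlv_8h, uaddlp_8h, uaddlv_4s, sad_tail_before, sad_tail_after, check_tails_agree) are made up for illustration and are not part of x265, and the arrays a and b stand in for the 16-bit partial sums held in v16 and v17. It shows that both tails compute the same total: the old one does two across-vector reductions, two FP->GPR moves and a scalar add, while the new one widens pairwise, adds in the vector domain, and does one reduction and one move.

```c
#include <assert.h>
#include <stdint.h>

/* Models UADDLV Sd, Vn.8H: sum eight u16 lanes into one u32. */
static uint32_t uaddlv_8h(const uint16_t v[8])
{
    uint32_t sum = 0;
    for (int i = 0; i < 8; i++)
        sum += v[i];
    return sum;
}

/* Models UADDLP Vd.4S, Vn.8H: add adjacent u16 pairs into four u32 lanes. */
static void uaddlp_8h(const uint16_t v[8], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint32_t)v[2 * i] + (uint32_t)v[2 * i + 1];
}

/* Models UADDLV Dd, Vn.4S: sum four u32 lanes into one u64. */
static uint64_t uaddlv_4s(const uint32_t v[4])
{
    uint64_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += v[i];
    return sum;
}

/* Old tail: reduce each vector, move each result to a GPR, add in GPRs. */
static uint32_t sad_tail_before(const uint16_t a[8], const uint16_t b[8])
{
    uint32_t w0 = uaddlv_8h(a);   /* uaddlv s0, v16.8h ; fmov w0, s0 */
    uint32_t w1 = uaddlv_8h(b);   /* uaddlv s1, v17.8h ; fmov w1, s1 */
    return w0 + w1;               /* add w0, w0, w1 */
}

/* New tail: widen pairwise, add as vectors, then one reduction and move. */
static uint32_t sad_tail_after(const uint16_t a[8], const uint16_t b[8])
{
    uint32_t p[4], q[4], s[4];
    uaddlp_8h(a, p);              /* uaddlp v16.4s, v16.8h */
    uaddlp_8h(b, q);              /* uaddlp v17.4s, v17.8h */
    for (int i = 0; i < 4; i++)
        s[i] = p[i] + q[i];       /* add v16.4s, v16.4s, v17.4s */
    /* uaddlv d0, v16.4s ; fmov x0, d0 (caller reads the low 32 bits) */
    return (uint32_t)uaddlv_4s(s);
}

/* Asserts that both tails agree for one pair of input vectors. */
static void check_tails_agree(const uint16_t a[8], const uint16_t b[8])
{
    assert(sad_tail_before(a, b) == sad_tail_after(a, b));
}
```

Counting the instructions in the comments, each tail uses five, which matches the unchanged instruction count noted above; the change only trades one FP->GPR move and the scalar add for vector-side work.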

Please see the amended patch with your recommended change.

Thanks,
Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-arm64-port-sad.patch
Type: application/octet-stream
Size: 16323 bytes
Desc: 0001-arm64-port-sad.patch
URL: <http://mailman.videolan.org/pipermail/x265-devel/attachments/20210720/c6dd5cef/attachment-0001.obj>