[x265] [PATCH 00/11] AArch64: Add Neon and SVE asm impl. of HBD SSE/SSD

chen chenm003 at 163.com
Wed Dec 11 11:50:26 UTC 2024


Thanks for the patches. I have some comments:

* In the current version we support pixels of up to 12 bits, so sse_pp is equivalent to sse_ss. That said, a separate 16-bit version is not a bad idea.

* In the code below, which is better: LD1 or LDR?
+    ld1             {v16.8h-v17.8h}, [x0], x1
+    ld1             {v18.8h-v19.8h}, [x2], x3
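
To illustrate the first point, here is a minimal C++ sketch (not x265 code; sse_ref and sse_pp_matches_sse_ss are hypothetical helper names, and the 8x8 block size and test pattern are arbitrary assumptions). It computes the SSE once over uint16_t pixels and once over the same bytes read as int16_t. For bit depths up to 12 the two results should agree, because every sample fits in the positive int16_t range. With full 16-bit unsigned samples they can differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference, not the x265 primitive: sum of squared
// differences over a w x h block, with strides given in elements.
template <typename T1, typename T2>
uint64_t sse_ref(const T1* a, ptrdiff_t strideA,
                 const T2* b, ptrdiff_t strideB, int w, int h)
{
    uint64_t sum = 0;
    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            int64_t d = (int64_t)a[x] - (int64_t)b[x];
            sum += (uint64_t)(d * d);
        }
        a += strideA;
        b += strideB;
    }
    return sum;
}

// Returns true when the unsigned ("pp"-style) and signed ("ss"-style)
// interpretations of the same 16-bit buffers give the same SSE.
bool sse_pp_matches_sse_ss(int bitDepth)
{
    assert(bitDepth >= 1 && bitDepth <= 16);

    const int w = 8, h = 8;             // arbitrary block size for the sketch
    uint16_t pa[w * h], pb[w * h];      // unsigned pixel view
    int16_t  sa[w * h], sb[w * h];      // signed view of the same bits
    const uint32_t maxVal = (1u << bitDepth) - 1;

    for (int i = 0; i < w * h; i++)
    {
        pa[i] = (uint16_t)((i * 37u) & maxVal);   // arbitrary test pattern
        pb[i] = (uint16_t)(maxVal - pa[i]);       // complement within range
    }

    // Copy the raw bits so both views see identical memory contents.
    std::memcpy(sa, pa, sizeof(pa));
    std::memcpy(sb, pb, sizeof(pb));

    return sse_ref(pa, w, pb, w, w, h) == sse_ref(sa, w, sb, w, w, h);
}
```

Calling sse_pp_matches_sse_ss(12) checks the 12-bit case; calling it with 16 shows where the two interpretations stop agreeing for this test pattern.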
At 2024-12-10 23:59:15, "Gerda Zsejke More" <gerdazsejke.more at arm.com> wrote:
>Hi,
>
>This patch series adds Neon and SVE asm implementation of HBD SSE_PP, SSE_SS and SSD_S functions.
>The added HBD SSE_SS and SSD_S SVE implementation is suitable for SBD as well, so enable it for that.
>Delete unused Neon intrinsics functions for SSE and SSD_S.
>
>This series is based on the master branch.
>
>Many thanks,
>Gerda
>
>Gerda Zsejke More (11):
>  Avoid aliasing HBD SSE_PP functions for AArch64 platforms
>  AArch64: Add Neon asm implementation of HBD SSE_PP
>  AArch64: Add SVE asm implementation of HBD SSE_PP
>  AArch64: Add Neon asm implementation of HBD SSE_SS
>  AArch64: Add SVE asm implementation of HBD SSE_SS
>  AArch64: Enable existing SSE_SS SVE impl for SBD
>  AArch64: Delete sse_neon implementation
>  AArch64: Add Neon asm implementation of HBD SSD_S
>  AArch64: Add SVE asm implementation of HBD SSD_S
>  AArch64: Enable existing SSD_S SVE impl for SBD
>  AArch64: Delete pixel_ssd_s_neon implementation
>
> source/common/CMakeLists.txt             |   4 +-
> source/common/aarch64/asm-primitives.cpp |  84 +--
> source/common/aarch64/pixel-prim.cpp     |  89 ----
> source/common/aarch64/ssd-a-sve.S        | 483 +++++++++++++++++
> source/common/aarch64/ssd-a-sve2.S       | 626 -----------------------
> source/common/aarch64/ssd-a.S            | 525 +++++++++++++++++++
> source/common/primitives.cpp             |   2 +
> 7 files changed, 1063 insertions(+), 750 deletions(-)
> create mode 100644 source/common/aarch64/ssd-a-sve.S
> delete mode 100644 source/common/aarch64/ssd-a-sve2.S
>
>-- 
>2.39.5 (Apple Git-154)