[x265] [PATCH 00/11] AArch64: Add Neon and SVE asm impl. of HBD SSE/SSD
Gerda Zsejke More
gerdazsejke.more at arm.com
Tue Dec 10 15:59:15 UTC 2024
Hi,
This patch series adds Neon and SVE asm implementations of the HBD SSE_PP, SSE_SS and SSD_S functions.
The new HBD SSE_SS and SSD_S SVE implementations are also suitable for SBD, so the series enables them for SBD builds as well.
It also deletes the unused Neon intrinsics implementations of SSE and SSD_S.
This series is based on the master branch.
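For readers unfamiliar with these primitives, here is a minimal scalar sketch of what SSE_PP, SSE_SS and SSD_S compute. It is not the x265 source: the names sse_ref and ssd_s_ref, the 64-bit accumulator and the square-block assumption for SSD_S are all hypothetical and chosen for illustration. The sketch assumes SSE_PP compares two pixel blocks (16-bit pixels in HBD builds), SSE_SS compares two blocks of 16-bit signed residuals, and SSD_S sums the squares of a single residual block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference (not x265 code): sum of squared errors
// between two blocks. With T = uint16_t this mirrors an HBD SSE_PP,
// with T = int16_t an SSE_SS. Strides are in elements, not bytes.
template <typename T>
uint64_t sse_ref(const T* a, ptrdiff_t strideA,
                 const T* b, ptrdiff_t strideB,
                 int width, int height)
{
    assert(width > 0 && height > 0);
    uint64_t sum = 0;
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            // Widen before subtracting so the difference cannot wrap.
            int64_t d = int64_t(a[x]) - int64_t(b[x]);
            sum += uint64_t(d * d);
        }
        a += strideA;
        b += strideB;
    }
    return sum;
}

// Hypothetical scalar reference (not x265 code): sum of squares of one
// size x size block of signed 16-bit residuals, as in SSD_S.
uint64_t ssd_s_ref(const int16_t* a, ptrdiff_t stride, int size)
{
    assert(size > 0);
    uint64_t sum = 0;
    for (int y = 0; y < size; y++)
    {
        for (int x = 0; x < size; x++)
        {
            int64_t v = a[x];
            sum += uint64_t(v * v);
        }
        a += stride;
    }
    return sum;
}
```

Under these assumptions the SSE_SS and SSD_S inputs are 16-bit residuals whatever the pixel bit depth, which would be consistent with one SVE implementation serving both HBD and SBD builds.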
Many thanks,
Gerda
Gerda Zsejke More (11):
  Avoid aliasing HBD SSE_PP functions for AArch64 platforms
  AArch64: Add Neon asm implementation of HBD SSE_PP
  AArch64: Add SVE asm implementation of HBD SSE_PP
  AArch64: Add Neon asm implementation of HBD SSE_SS
  AArch64: Add SVE asm implementation of HBD SSE_SS
  AArch64: Enable existing SSE_SS SVE impl for SBD
  AArch64: Delete sse_neon implementation
  AArch64: Add Neon asm implementation of HBD SSD_S
  AArch64: Add SVE asm implementation of HBD SSD_S
  AArch64: Enable existing SSD_S SVE impl for SBD
  AArch64: Delete pixel_ssd_s_neon implementation
 source/common/CMakeLists.txt             |   4 +-
 source/common/aarch64/asm-primitives.cpp |  84 +--
 source/common/aarch64/pixel-prim.cpp     |  89 ----
 source/common/aarch64/ssd-a-sve.S        | 483 +++++++++++++++++
 source/common/aarch64/ssd-a-sve2.S       | 626 -----------------------
 source/common/aarch64/ssd-a.S            | 525 +++++++++++++++++++
 source/common/primitives.cpp             |   2 +
 7 files changed, 1063 insertions(+), 750 deletions(-)
 create mode 100644 source/common/aarch64/ssd-a-sve.S
 delete mode 100644 source/common/aarch64/ssd-a-sve2.S
--
2.39.5 (Apple Git-154)