[x265] [PATCH 00/11] AArch64: Add Neon and SVE asm impl. of HBD SSE/SSD

Gerda Zsejke More gerdazsejke.more at arm.com
Tue Dec 10 15:59:15 UTC 2024


Hi,

This patch series adds Neon and SVE asm implementations of the HBD (high bit depth) SSE_PP, SSE_SS and SSD_S functions.
The new HBD SSE_SS and SSD_S SVE implementations are also suitable for SBD (standard bit depth) builds, so the series enables them there as well.
It also deletes the now-unused Neon intrinsics functions for SSE and SSD_S.
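
For readers unfamiliar with these primitives, here is a minimal scalar sketch of what they compute, as I understand them: SSE_PP is the sum of squared differences between two pixel blocks, SSE_SS is the same over two int16_t blocks, and SSD_S is the sum of squares of a single int16_t block. The names sse_ref and ssd_s_ref, the signatures and the 64-bit accumulator are assumptions made for illustration; they are not the x265 declarations and not the code in this series.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reference: sum of squared differences between two blocks.
// T = uint16_t models HBD pixels (SSE_PP), T = int16_t models SSE_SS.
// Strides are in elements, not bytes (an assumption of this sketch).
template <typename T>
uint64_t sse_ref(const T* a, ptrdiff_t strideA,
                 const T* b, ptrdiff_t strideB, int width, int height)
{
    uint64_t sum = 0;
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            // Widen before squaring: an int16_t difference squared
            // can exceed the range of a 32-bit signed integer.
            int64_t d = int64_t(a[x]) - int64_t(b[x]);
            sum += uint64_t(d * d);
        }
        a += strideA;
        b += strideB;
    }
    return sum;
}

// Hypothetical reference: sum of squares of one int16_t block (SSD_S).
uint64_t ssd_s_ref(const int16_t* a, ptrdiff_t stride, int width, int height)
{
    uint64_t sum = 0;
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            int64_t v = a[x];
            sum += uint64_t(v * v);
        }
        a += stride;
    }
    return sum;
}
```

A scalar model like this is only useful for checking the output of a vectorised routine on small blocks; it says nothing about how the Neon or SVE code accumulates internally.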

This series is based on the master branch.

Many thanks,
Gerda

Gerda Zsejke More (11):
  Avoid aliasing HBD SSE_PP functions for AArch64 platforms
  AArch64: Add Neon asm implementation of HBD SSE_PP
  AArch64: Add SVE asm implementation of HBD SSE_PP
  AArch64: Add Neon asm implementation of HBD SSE_SS
  AArch64: Add SVE asm implementation of HBD SSE_SS
  AArch64: Enable existing SSE_SS SVE impl for SBD
  AArch64: Delete sse_neon implementation
  AArch64: Add Neon asm implementation of HBD SSD_S
  AArch64: Add SVE asm implementation of HBD SSD_S
  AArch64: Enable existing SSD_S SVE impl for SBD
  AArch64: Delete pixel_ssd_s_neon implementation

 source/common/CMakeLists.txt             |   4 +-
 source/common/aarch64/asm-primitives.cpp |  84 +--
 source/common/aarch64/pixel-prim.cpp     |  89 ----
 source/common/aarch64/ssd-a-sve.S        | 483 +++++++++++++++++
 source/common/aarch64/ssd-a-sve2.S       | 626 -----------------------
 source/common/aarch64/ssd-a.S            | 525 +++++++++++++++++++
 source/common/primitives.cpp             |   2 +
 7 files changed, 1063 insertions(+), 750 deletions(-)
 create mode 100644 source/common/aarch64/ssd-a-sve.S
 delete mode 100644 source/common/aarch64/ssd-a-sve2.S

-- 
2.39.5 (Apple Git-154)


