[x265] [PATCH 0/4] AArch64: Optimize and add pixel_var Implementations
Li Zhang
li.zhang2 at arm.com
Tue Jun 17 18:22:25 UTC 2025
Hi,
This patch series optimizes the existing standard bit-depth pixel_var
Neon intrinsics implementation and deletes the slower assembly
implementation. It also adds a Neon DotProd intrinsics implementation
for standard bit-depth, and Neon and SVE intrinsics implementations for
the high bit-depth pixel_var function.
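
For context, here is a minimal scalar sketch of what pixel_var is
understood to compute: the sum of the pixels and the sum of their
squares over a square block, packed into a single 64-bit value. The name
pixel_var_ref, the template parameter and the packing layout (sum in the
low 32 bits, sum of squares in the high 32 bits) are assumptions made
for illustration, not code taken from this series.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference, for illustration only.
// Pixel is uint8_t for standard bit-depth (SBD) builds and
// uint16_t for high bit-depth (HBD) builds.
template<typename Pixel>
uint64_t pixel_var_ref(const Pixel *pix, intptr_t stride, int size)
{
    assert(size > 0);

    uint32_t sum = 0; // running sum of pixel values
    uint32_t sqr = 0; // running sum of squared pixel values

    for (int y = 0; y < size; y++)
    {
        for (int x = 0; x < size; x++)
        {
            uint32_t p = pix[x];
            sum += p;
            sqr += p * p;
        }
        pix += stride; // advance to the next row of the block
    }

    // Assumed packing: sum in the low half, sum of squares in the high half.
    return sum + ((uint64_t)sqr << 32);
}
```

A vectorized implementation has to produce the same packed result. The
multiply-accumulate for the sum of squares is the step that a
dot-product instruction can perform directly on 8-bit inputs.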
Many thanks,
Li
Li Zhang (4):
AArch64: Optimize and clean up SBD pixel_var functions
AArch64: Add HBD pixel_var Neon intrinsics implementations
AArch64: Add SBD pixel_var Neon DotProd intrinsics implementations
AArch64: Add HBD pixel_var SVE intrinsics implementations
source/common/CMakeLists.txt | 4 +-
source/common/aarch64/asm-primitives.cpp | 14 +-
source/common/aarch64/fun-decls.h | 10 -
source/common/aarch64/neon-sve-bridge.h | 7 +
.../aarch64/pixel-prim-neon-dotprod.cpp | 111 ++++++++++
source/common/aarch64/pixel-prim-sve.cpp | 137 ++++++++++++
source/common/aarch64/pixel-prim.cpp | 197 +++++++++++++++---
source/common/aarch64/pixel-prim.h | 6 +
source/common/aarch64/pixel-util-common.S | 27 ---
source/common/aarch64/pixel-util-sve2.S | 195 -----------------
source/common/aarch64/pixel-util.S | 61 ------
11 files changed, 434 insertions(+), 335 deletions(-)
create mode 100644 source/common/aarch64/pixel-prim-neon-dotprod.cpp
create mode 100644 source/common/aarch64/pixel-prim-sve.cpp
--
2.39.5 (Apple Git-154)