[x265] [PATCH 0/4] AArch64: Optimize and add pixel_var Implementations

Li Zhang li.zhang2 at arm.com
Tue Jun 17 18:22:25 UTC 2025


Hi,

This patch series optimizes the existing standard bit-depth (SBD)
pixel_var Neon intrinsics implementation and deletes the slower
assembly implementations. It also adds a Neon DotProd intrinsics
implementation for standard bit-depth, plus Neon and SVE intrinsics
implementations for high bit-depth (HBD).
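
For context, a minimal scalar sketch of what a pixel_var primitive
computes, assuming the usual x265 convention of returning the pixel sum
in the low 32 bits and the sum of squares in the high 32 bits. The names
pixel_var_ref and block_variance_times_n are hypothetical and used only
for illustration; this is not the code from the patches, and it assumes
8-bit pixels and a square block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference, NOT the implementation from this series.
// Assumes 8-bit (standard bit-depth) pixels and a square size x size block.
// Returns: pixel sum in bits 0..31, sum of squared pixels in bits 32..63.
uint64_t pixel_var_ref(const uint8_t *pix, ptrdiff_t stride, int size)
{
    assert(pix != nullptr && size > 0);
    uint32_t sum = 0, sqr = 0;
    for (int y = 0; y < size; y++)
    {
        for (int x = 0; x < size; x++)
        {
            sum += pix[x];
            sqr += uint32_t(pix[x]) * pix[x];
        }
        pix += stride; // advance to the next row
    }
    return sum + (uint64_t(sqr) << 32);
}

// Hypothetical helper: unpack the result and compute N * variance,
// i.e. sqr - sum^2 / N, where N = size * size is the pixel count.
uint32_t block_variance_times_n(uint64_t packed, int size)
{
    uint32_t sum = uint32_t(packed);
    uint32_t sqr = uint32_t(packed >> 32);
    uint64_t n = uint64_t(size) * size;
    return uint32_t(sqr - (uint64_t(sum) * sum) / n);
}
```

The two accumulations (sum and sum of squares) are what a vectorized
version has to produce; a dot-product instruction is a natural fit for
the sum of squares, since it multiplies and accumulates in one step.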

Many thanks,
Li

Li Zhang (4):
  AArch64: Optimize and clean up SBD pixel_var functions
  AArch64: Add HBD pixel_var Neon intrinsics implementations
  AArch64: Add SBD pixel_var Neon DotProd intrinsics implementations
  AArch64: Add HBD pixel_var SVE intrinsics implementations

 source/common/CMakeLists.txt                  |   4 +-
 source/common/aarch64/asm-primitives.cpp      |  14 +-
 source/common/aarch64/fun-decls.h             |  10 -
 source/common/aarch64/neon-sve-bridge.h       |   7 +
 .../aarch64/pixel-prim-neon-dotprod.cpp       | 111 ++++++++++
 source/common/aarch64/pixel-prim-sve.cpp      | 137 ++++++++++++
 source/common/aarch64/pixel-prim.cpp          | 197 +++++++++++++++---
 source/common/aarch64/pixel-prim.h            |   6 +
 source/common/aarch64/pixel-util-common.S     |  27 ---
 source/common/aarch64/pixel-util-sve2.S       | 195 -----------------
 source/common/aarch64/pixel-util.S            |  61 ------
 11 files changed, 434 insertions(+), 335 deletions(-)
 create mode 100644 source/common/aarch64/pixel-prim-neon-dotprod.cpp
 create mode 100644 source/common/aarch64/pixel-prim-sve.cpp

--
2.39.5 (Apple Git-154)


