[x265] [PATCH 0/7] AArch64 saoCuStats Optimisations
Hari Limaye
hari.limaye at arm.com
Mon May 20 16:14:35 UTC 2024
Hi,
This patch series adds AArch64 Neon, SVE, and SVE2 implementations of
the saoCuStats function primitives for both low and high bitdepth.
This series is based on the previously submitted refactoring patch
series.
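For readers unfamiliar with these primitives, here is a minimal scalar sketch of the kind of work a horizontal edge-offset statistics function does: each pixel is classified by comparing it with its left and right neighbours, and the original-minus-reconstructed difference is accumulated per category. This is an illustration only, not the code in the patches. The names sign_of and edge_stats_horizontal, the parameter list, and the explicit strides are all assumptions, and any remapping of category indices done by the real primitives is omitted.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the x265 signOf): returns -1, 0 or +1.
static inline int sign_of(int x)
{
    return (x > 0) - (x < 0);
}

// Hypothetical scalar sketch of horizontal edge-offset statistics.
// rec points at the first pixel to classify; rec[-1] and rec[width]
// must be readable on every row.
// diff holds (original - reconstructed) for each classified pixel.
// stats[5] and count[5] are indexed by edge category 0..4 and are
// accumulated into, so the caller zeroes them beforehand.
static void edge_stats_horizontal(const int16_t *diff, const uint8_t *rec,
                                  std::ptrdiff_t diffStride,
                                  std::ptrdiff_t recStride,
                                  int width, int height,
                                  int32_t stats[5], int32_t count[5])
{
    for (int y = 0; y < height; y++)
    {
        // Sign of (current - left) for the first pixel in the row.
        int signLeft = sign_of(rec[0] - rec[-1]);

        for (int x = 0; x < width; x++)
        {
            // Sign of (current - right).
            int signRight = sign_of(rec[x] - rec[x + 1]);

            // 0 = local minimum, 4 = local maximum, 2 = flat or monotonic.
            int category = signLeft + signRight + 2;

            // For the next pixel, (current - left) is the negation of
            // this pixel's (current - right).
            signLeft = -signRight;

            stats[category] += diff[x];
            count[category]++;
        }

        diff += diffStride;
        rec += recStride;
    }
}
```

The inner loop carries signLeft from one pixel to the next, so each pixel needs only one new comparison. A vector implementation would typically compute both comparisons for a whole group of pixels at once instead.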
Performance numbers:

C -> Neon on Neoverse V1:

  Low bitdepth:
    saoCuStatsBO | 1.09x
    saoCuStatsE0 | 2.67x
    saoCuStatsE1 | 2.82x
    saoCuStatsE2 | 2.93x
    saoCuStatsE3 | 3.26x

  High bitdepth:
    saoCuStatsBO | 1.09x
    saoCuStatsE0 | 2.39x
    saoCuStatsE1 | 2.67x
    saoCuStatsE2 | 2.47x
    saoCuStatsE3 | 2.86x

Neon -> SVE on Neoverse V1:

  Low bitdepth:
    saoCuStatsE0 | 1.12x
    saoCuStatsE1 | 1.15x
    saoCuStatsE2 | 1.21x
    saoCuStatsE3 | 1.14x

  High bitdepth:
    saoCuStatsE0 | 1.19x
    saoCuStatsE1 | 1.28x
    saoCuStatsE2 | 1.19x
    saoCuStatsE3 | 1.12x

SVE -> SVE2 on Neoverse V2:

  Low bitdepth:
    saoCuStatsE0 | 1.08x
    saoCuStatsE1 | 1.06x
    saoCuStatsE2 | 1.06x
    saoCuStatsE3 | 1.09x

  High bitdepth:
    saoCuStatsE0 | 1.03x
    saoCuStatsE1 | 1.10x
    saoCuStatsE2 | 1.08x
    saoCuStatsE3 | 1.09x

Many thanks,
Hari
Hari Limaye (7):
Test: Relax constraints of check_saoCuStatsE*
Move duplicated signOf function to common header
AArch64: Add Neon saoCuStats primitives for low bitdepth
AArch64: Add Neon saoCuStats primitives for high bitdepth
AArch64: Add check for arm_neon_sve_bridge.h
AArch64: Add SVE saoCuStats primitives
AArch64: Add SVE2 saoCuStats primitives
source/CMakeLists.txt | 35 +-
source/common/CMakeLists.txt | 19 +-
source/common/aarch64/asm-primitives.cpp | 14 +
source/common/aarch64/loopfilter-prim.cpp | 19 +-
source/common/aarch64/sao-prim-sve.cpp | 271 +++++++++++++++
source/common/aarch64/sao-prim-sve2.cpp | 317 ++++++++++++++++++
source/common/aarch64/sao-prim.cpp | 380 ++++++++++++++++++++++
source/common/aarch64/sao-prim.h | 100 ++++++
source/common/common.h | 6 +
source/common/loopfilter.cpp | 16 +-
source/encoder/sao.cpp | 74 ++---
source/test/pixelharness.cpp | 11 +-
12 files changed, 1187 insertions(+), 75 deletions(-)
create mode 100644 source/common/aarch64/sao-prim-sve.cpp
create mode 100644 source/common/aarch64/sao-prim-sve2.cpp
create mode 100644 source/common/aarch64/sao-prim.cpp
create mode 100644 source/common/aarch64/sao-prim.h
--
2.42.1