[x265] [PATCH 0/4] Optimise AArch64 quant/nquant primitives
Hari Limaye
hari.limaye at arm.com
Mon Aug 12 21:14:28 UTC 2024
Hi,
This series optimises the AArch64 Neon implementations of the quant and
nquant primitives.
The series does not depend on any other on-list patch set, with one
exception:
  [PATCH 1/4] AArch64: Remove SVE assembly implementation of quant
This patch is based on the refactoring changes to
aarch64/asm-primitives.cpp made in the SAD patch series:
https://mailman.videolan.org/pipermail/x265-devel/2024-May/013707.html
Relative performance observed compared to the existing Neon
implementation:
quant_neon:
  Neoverse N1: 1.57x
  Neoverse V1: 1.59x
  Neoverse N2: 1.54x
  Neoverse V2: 1.59x

nquant_neon:
  Neoverse N1: 1.79x
  Neoverse V1: 1.77x
  Neoverse N2: 1.70x
  Neoverse V2: 1.73x
Many thanks,
Hari
Hari Limaye (4):
  AArch64: Remove SVE assembly implementation of quant
  AArch64: Optimise quant_neon
  Test: Update values used in check_nquant_primitive
  AArch64: Optimise nquant_neon

 source/common/aarch64/asm-primitives.cpp |   3 -
 source/common/aarch64/fun-decls.h        |   2 -
 source/common/aarch64/pixel-util-sve.S   |  57 -------------
 source/common/aarch64/pixel-util.S       | 103 ++++++++++++-----------
 source/test/mbdstharness.cpp             |  10 ++-
 5 files changed, 61 insertions(+), 114 deletions(-)
--
2.42.1