[x265] [PATCH 0/6] AArch64: Fix SVE(2) kernels for vectors >128 bits
George Steed
george.steed at arm.com
Mon Jan 6 17:16:04 UTC 2025
Hi,
Some of the existing SVE and SVE2 assembly kernels inspect the SVE
vector length to determine which kernel loop to execute. However, some
of these have issues when running on vector lengths other than 128
bits.
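As a rough illustration of how a kernel can depend on the vector length,
here is a small portable C model. It is not x265 code and does not
reproduce any specific bug fixed in this series. The function names and
the vl_lanes parameter (standing in for the number of 16-bit lanes in one
SVE vector: 8 at 128 bits, 16 at 256 bits) are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical model, not taken from x265: vl_lanes stands in for the
 * number of 16-bit lanes in one SVE vector.
 */

/*
 * Iteration count hard-coded for 128-bit vectors (8 lanes), while each
 * iteration still consumes one full vector of whatever length the
 * hardware provides. Returns how many elements the loop touches.
 */
size_t lanes_touched_fixed(size_t width, size_t vl_lanes)
{
    size_t touched = 0;
    for (size_t iter = 0; iter < width / 8; iter++)
        touched += vl_lanes;
    return touched;
}

/*
 * Vector-length-agnostic variant: each step limits the active lanes to
 * the elements that remain, similar in spirit to a WHILELT predicate.
 */
size_t lanes_touched_vla(size_t width, size_t vl_lanes)
{
    size_t touched = 0;
    for (size_t i = 0; i < width; i += vl_lanes) {
        size_t remaining = width - i;
        touched += remaining < vl_lanes ? remaining : vl_lanes;
    }
    return touched;
}
```

In this model, a 16-element row is handled correctly by both functions
at 8 lanes, but at 16 lanes the fixed-count loop touches twice as many
elements as the row contains, while the predicated loop still touches
exactly 16.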
This patch series fixes all of the kernels that previously failed in
the TestBench executable when running with vector lengths longer than
128 bits.
This series is based on the master branch.
Thanks,
George

George Steed (6):
  mc-a-sve2.S: Fix addAvg_{16,32}xh_sve2 for longer SVE vectors
  pixel-util-sve2.S: Fix accumulators in pixel_var_*_sve2
  pixel-util-sve2.S: Fix branch target in pixel_sub_ps_64x64_sve2
  asm-primitives.cpp: Delete dequant_scaling SVE2 implementation
  blockcopy8-sve.S: Fix branch target in cpy1Dto2D_shr_32x32_sve
  pixel-util-sve2.S: Fix normFact/ssimDist64 for longer SVE vectors

 source/common/aarch64/asm-primitives.cpp |  3 +-
 source/common/aarch64/blockcopy8-sve.S   |  2 +-
 source/common/aarch64/mc-a-sve2.S        | 24 +-----
 source/common/aarch64/pixel-util-sve2.S  | 94 +++---------------------
 4 files changed, 14 insertions(+), 109 deletions(-)
--
2.34.1