[x265] [PATCH 0/6] AArch64: Fix SVE(2) kernels for vectors >128 bits

George Steed george.steed at arm.com
Mon Jan 6 17:16:04 UTC 2025


Hi,

There are existing SVE and SVE2 assembly implementations of some kernels
that inspect the SVE vector length to determine which kernel loop to
execute. However, some of these have issues when running with vector
lengths other than 128 bits.
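
For readers unfamiliar with this class of problem, here is a minimal
sketch in portable C (no SVE intrinsics) of one generic way a kernel can
go wrong on longer vectors. It is not taken from the patches in this
series. The names lanes16, row_fixed_steps and row_predicated are
hypothetical, and the "two full-vector steps per 16-wide row" layout is
an assumption made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical model: number of 16-bit lanes in one SVE vector of vl_bits. */
size_t lanes16(unsigned vl_bits)
{
    return vl_bits / 16;
}

/* A loop written with a 128-bit assumption baked in: it always takes two
 * full-vector steps to cover a 16-element row of 16-bit values.
 * Returns how many elements the loop would touch. */
size_t row_fixed_steps(unsigned vl_bits)
{
    size_t touched = 0;
    for (int step = 0; step < 2; step++)
        touched += lanes16(vl_bits);   /* every lane of the vector is active */
    return touched;
}

/* A vector-length-agnostic loop: each step only activates the lanes that
 * are still inside the row, similar in spirit to an SVE WHILELT predicate.
 * Returns how many elements the loop would touch. */
size_t row_predicated(unsigned vl_bits, size_t width)
{
    size_t touched = 0;
    size_t step = lanes16(vl_bits);
    for (size_t i = 0; i < width; i += step) {
        size_t remaining = width - i;
        touched += remaining < step ? remaining : step;
    }
    return touched;
}
```

In this model the fixed-step loop touches exactly 16 elements at 128
bits but 32 at 256 bits, i.e. it runs past the end of the row, while the
predicated loop touches 16 elements at any vector length.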

This patch series fixes all of the failures previously seen in the
TestBench executable when running with vector lengths longer than 128
bits.

This series is based on the master branch.

Thanks,
George

George Steed (6):
  mc-a-sve2.S: Fix addAvg_{16,32}xh_sve2 for longer SVE vectors
  pixel-util-sve2.S: Fix accumulators in pixel_var_*_sve2
  pixel-util-sve2.S: Fix branch target in pixel_sub_ps_64x64_sve2
  asm-primitives.cpp: Delete dequant_scaling SVE2 implementation
  blockcopy8-sve.S: Fix branch target in cpy1Dto2D_shr_32x32_sve
  pixel-util-sve2.S: Fix normFact/ssimDist64 for longer SVE vectors

 source/common/aarch64/asm-primitives.cpp |  3 +-
 source/common/aarch64/blockcopy8-sve.S   |  2 +-
 source/common/aarch64/mc-a-sve2.S        | 24 +-----
 source/common/aarch64/pixel-util-sve2.S  | 94 +++---------------------
 4 files changed, 14 insertions(+), 109 deletions(-)

--
2.34.1


