[x265] [PATCH 0/6] AArch64: Fix SVE(2) kernels for vectors >128 bits
George Steed
george.steed at arm.com
Tue Jan 7 16:03:08 UTC 2025
Hi Chen,
Thanks for taking a look.
I'm not sure what you mean by smoke tests, and I don't have permission
to push to the x265 repo.
In addition to running the TestBench, I have also run an x265 encode at
every possible SVE vector length and compared the md5sum hashes. All
vector lengths now produce the same hash, and this matches what was
previously produced for a 128-bit SVE vector length before my changes
(other vector lengths previously hung in an infinite loop).
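
The hash comparison described above could be sketched roughly as follows.
This is only an illustration, not the script that was actually used: the
`encode` callable is a hypothetical stand-in for "run an x265 encode at
this vector length and return the output bytes", and the function names
are invented for the example. The only architectural fact assumed is that
SVE vector lengths are multiples of 128 bits, up to 2048 bits.

```python
import hashlib
from typing import Callable, Dict, List


def sve_vector_lengths() -> List[int]:
    """All architecturally permitted SVE vector lengths, in bits."""
    return list(range(128, 2048 + 1, 128))


def md5_by_vector_length(encode: Callable[[int], bytes]) -> Dict[int, str]:
    """Hash the encoder output produced at each vector length.

    `encode` is hypothetical: it is assumed to run the encoder with the
    given vector length (for example under an emulator) and return the
    resulting bitstream as bytes.
    """
    return {vl: hashlib.md5(encode(vl)).hexdigest()
            for vl in sve_vector_lengths()}


def mismatched_lengths(hashes: Dict[int, str],
                       reference_vl: int = 128) -> List[int]:
    """Return the vector lengths whose hash differs from the reference."""
    reference = hashes[reference_vl]
    return [vl for vl, digest in sorted(hashes.items())
            if digest != reference]


if __name__ == "__main__":
    # Stand-in encoder that ignores the vector length, so every hash matches.
    hashes = md5_by_vector_length(lambda vl: b"dummy bitstream")
    print("mismatches:", mismatched_lengths(hashes))
```

Using the 128-bit result as the reference mirrors the comparison above,
where the 128-bit output from before the changes is the known-good hash.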
Thanks,
George
On Tue, Jan 07, 2025 at 03:42:21PM +0800, chen wrote:
> Hi,
>
> Thanks for the fixes.
>
> It looks good.
>
> However, I do not have an environment to verify it, so please run a smoke test before pushing. Thanks.
>
> At 2025-01-07 01:16:04, "George Steed" <george.steed at arm.com> wrote:
> >Hi,
> >
> >There are existing SVE and SVE2 assembly implementations of some kernels
> >that inspect the SVE vector length to determine which kernel loop to
> >execute. However, some of these have issues when running with vector
> >lengths other than 128 bits.
> >
> >This patch series fixes all issues that previously caused failures in
> >the TestBench executable when running with vector lengths longer than
> >128 bits.
> >
> >This series is based on the master branch.
> >
> >Thanks,
> >George
> >
> >George Steed (6):
> > mc-a-sve2.S: Fix addAvg_{16,32}xh_sve2 for longer SVE vectors
> > pixel-util-sve2.S: Fix accumulators in pixel_var_*_sve2
> > pixel-util-sve2.S: Fix branch target in pixel_sub_ps_64x64_sve2
> > asm-primitives.cpp: Delete dequant_scaling SVE2 implementation
> > blockcopy8-sve.S: Fix branch target in cpy1Dto2D_shr_32x32_sve
> > pixel-util-sve2.S: Fix normFact/ssimDist64 for longer SVE vectors
> >
> > source/common/aarch64/asm-primitives.cpp | 3 +-
> > source/common/aarch64/blockcopy8-sve.S | 2 +-
> > source/common/aarch64/mc-a-sve2.S | 24 +-----
> > source/common/aarch64/pixel-util-sve2.S | 94 +++---------------------
> > 4 files changed, 14 insertions(+), 109 deletions(-)
> >
> >--
> >2.34.1