[x265] [PATCH 0/2] Add Neon impl of planecopy_cp and weight_pp
Micro Daryl Robles
MicroDaryl.Robles at arm.com
Tue Apr 8 13:49:32 UTC 2025
Hi Chen,
The weight_pp performance figures for SBD in the commit message are relative to the removed Neon asm.
The new intrinsics code is faster when CTZ(w0) < shift (e.g. w0 = 127, shift = 6) and performs the same in other cases.
Relative performance compared to Neon asm [SBD]:
(w0 = 64)
Neoverse N1: 1.19x
Neoverse N2: 1.00x
Neoverse V1: 1.10x
Neoverse V2: 1.01x
(w0 = 127)
Neoverse N1: 3.05x
Neoverse N2: 3.63x
Neoverse V1: 3.25x
Neoverse V2: 3.58x
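To make the CTZ(w0) < shift condition concrete, here is a minimal scalar sketch. It is not the code from the patch: the helper names (ctz32, ctzBelowShift, weightGeneric, weightFolded) are hypothetical, the weighting is reduced to (w0 * x + round) >> shift with no offset or clipping, and w0 is assumed to be non-negative. It only shows that when w0 has at least `shift` trailing zero bits the shift can be folded into the weight, which is one plausible reason the two cases behave differently.

```cpp
#include <cstdint>

// Count trailing zero bits of a 32-bit value with a portable loop.
// Returns 32 for an input of zero.
inline int ctz32(uint32_t v)
{
    if (v == 0)
        return 32;
    int n = 0;
    while ((v & 1u) == 0u)
    {
        v >>= 1;
        ++n;
    }
    return n;
}

// True when the weight has fewer trailing zero bits than the shift,
// i.e. the case reported as faster with the new intrinsics code.
// Assumes w0 >= 0.
inline bool ctzBelowShift(int w0, int shift)
{
    return ctz32(static_cast<uint32_t>(w0)) < shift;
}

// Simplified weighting step: multiply, add rounding term, shift right.
// Offset and clipping are left out to keep the sketch short.
inline int weightGeneric(int x, int w0, int round, int shift)
{
    return (w0 * x + round) >> shift;
}

// Folded form: valid only when w0 is a multiple of (1 << shift) and
// 0 <= round < (1 << shift), because then the rounding term cannot
// carry into the bits that survive the shift.
inline int weightFolded(int x, int w0, int shift)
{
    return (w0 >> shift) * x;
}
```

With shift = 6 and round = 32, w0 = 64 gives CTZ(w0) = 6, so ctzBelowShift is false and both functions agree for every 8-bit x. With w0 = 127, CTZ(w0) = 0, so ctzBelowShift is true and only weightGeneric gives the right result.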
Regards,
Micro
From: chen <chenm003 at 163.com>
Date: Tuesday, 8 April 2025 at 03:32
To: Development for x265 <x265-devel at videolan.org>
Cc: nd <nd at arm.com>, Micro Daryl Robles <MicroDaryl.Robles at arm.com>
Subject: Re:[x265] [PATCH 0/2] Add Neon impl of planecopy_cp and weight_pp
Hi Micro,
How does the performance compare to the removed weight_pp_neon?
Regards,
Chen
At 2025-04-07 18:57:14, "Micro Daryl Robles" <microdaryl.robles at arm.com> wrote:
>Hi,
>
>This patch series adds Neon intrinsic implementations of
>planecopy_cp and weight_pp that work for both SBD and HBD.
>
>This series is based on the master branch.
>
>Many thanks,
>Micro
>
>Micro Daryl Robles (2):
> AArch64: Add SBD and HBD Neon implementation of planecopy_cp
> AArch64: Add SBD and HBD Neon implementation of weight_pp
>
> source/common/aarch64/asm-primitives.cpp | 4 -
> source/common/aarch64/fun-decls.h | 3 -
> source/common/aarch64/pixel-prim.cpp | 183 +++++++++++++++++++++++
> source/common/aarch64/pixel-util.S | 144 ------------------
> source/test/pixelharness.cpp | 50 +++++--
> 5 files changed, 217 insertions(+), 167 deletions(-)
>
>--
>2.34.1
>