[x265] [PATCH 0/2] Add Neon impl of planecopy_cp and weight_pp

Micro Daryl Robles MicroDaryl.Robles at arm.com
Tue Apr 8 13:49:32 UTC 2025


Hi Chen,

The weight_pp performance figures for SBD in the commit message are relative to the removed Neon asm.
The new intrinsics code is faster when CTZ(w0) < shift, where CTZ is the count of trailing zero bits (e.g. w0 = 127, shift = 6), and about equal in other cases.

Relative performance compared to Neon asm [SBD]:
 (w0 = 64)
 Neoverse N1: 1.19x
 Neoverse N2: 1.00x
 Neoverse V1: 1.10x
 Neoverse V2: 1.01x
 (w0 = 127)
 Neoverse N1: 3.05x
 Neoverse N2: 3.63x
 Neoverse V1: 3.25x
 Neoverse V2: 3.58x
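
To make the CTZ(w0) < shift condition concrete, here is a minimal sketch that classifies the two benchmarked weights. The helper names countTrailingZeros and inFasterCase are hypothetical and are not part of x265, and shift = 6 is assumed for both weights (the message states it only for w0 = 127).

```cpp
#include <cstdint>

// Hypothetical helper (not x265 code): count trailing zero bits of v
// using a portable loop. Returns 32 for v == 0.
inline int countTrailingZeros(uint32_t v)
{
    if (v == 0)
        return 32;

    int n = 0;
    while ((v & 1u) == 0u)
    {
        v >>= 1;
        ++n;
    }
    return n;
}

// Hypothetical helper (not x265 code): true when the weight falls in the
// case reported as faster above, i.e. CTZ(w0) < shift.
inline bool inFasterCase(int w0, int shift)
{
    return countTrailingZeros(static_cast<uint32_t>(w0)) < shift;
}
```

With shift = 6, w0 = 64 has six trailing zeros, so the condition is false and it lands in the "about equal" group; w0 = 127 has none, so the condition is true and it lands in the faster group, which matches the two sets of figures above.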

Regards,
Micro

From: chen <chenm003 at 163.com>
Date: Tuesday, 8 April 2025 at 03:32
To: Development for x265 <x265-devel at videolan.org>
Cc: nd <nd at arm.com>, Micro Daryl Robles <MicroDaryl.Robles at arm.com>
Subject: Re:[x265] [PATCH 0/2] Add Neon impl of planecopy_cp and weight_pp

Hi Micro,

How does the performance compare to the removed weight_pp_neon?

Regards,
Chen

At 2025-04-07 18:57:14, "Micro Daryl Robles" <microdaryl.robles at arm.com> wrote:
>Hi,
>
>This patch series adds Neon intrinsic implementations of
>planecopy_cp and weight_pp that work for both SBD and HBD.
>
>This series is based on the master branch.
>
>Many thanks,
>Micro
>
>Micro Daryl Robles (2):
>  AArch64: Add SBD and HBD Neon implementation of planecopy_cp
>  AArch64: Add SBD and HBD Neon implementation of weight_pp
>
> source/common/aarch64/asm-primitives.cpp |   4 -
> source/common/aarch64/fun-decls.h        |   3 -
> source/common/aarch64/pixel-prim.cpp     | 183 +++++++++++++++++++++++
> source/common/aarch64/pixel-util.S       | 144 ------------------
> source/test/pixelharness.cpp             |  50 +++++--
> 5 files changed, 217 insertions(+), 167 deletions(-)
>
>--
>2.34.1
>
>_______________________________________________
>x265-devel mailing list
>x265-devel at videolan.org
>https://mailman.videolan.org/listinfo/x265-devel
