[x265] [PATCH 0/2] Add Neon impl of findPosFirstLast

Micro Daryl Robles MicroDaryl.Robles at arm.com
Wed Apr 9 17:39:57 UTC 2025


Hi Chen,

Thank you for the approval.

Yes, EOR and ADD have the same instruction cost (latency, throughput, and pipeline usage) on AArch64 across the Neoverse cores (Nx, Vx), so either one should be optimal.

Regards,
Micro

From: chen <chenm003 at 163.com>
Date: Wednesday, 9 April 2025 at 07:10
To: Development for x265 <x265-devel at videolan.org>
Cc: nd <nd at arm.com>, Micro Daryl Robles <MicroDaryl.Robles at arm.com>
Subject: Re:[x265] [PATCH 0/2] Add Neon impl of findPosFirstLast

Hi Micro,

The code looks good to me. I have no further comments, thank you.

By the way, for absSumSign, do EOR and ADD have the same instruction cost on Neoverse?

Regards,
Chen

At 2025-04-08 23:13:29, "Micro Daryl Robles" <microdaryl.robles at arm.com> wrote:

>Hi,
>
>This patch series adds a Neon intrinsic implementation of
>findPosFirstLast.
>
>Also, we are submitting a proposal to rename CLZ/CTZ to BSR/BSF, as the
>current CLZ macro does not actually count leading zeros. Instead, it
>returns the index of the highest set bit, which aligns with the behavior
>of BSR.
>
>This series is based on the master branch.
>
>Many thanks,
>Micro
>
>Micro Daryl Robles (2):
>  AArch64: Add Neon implementation of findPosFirstLast
>  Rename CLZ/CTZ to BSR/BSF
>
> source/common/aarch64/dct-prim.cpp  | 55 ++++++++++++++++++++++++++++-
> source/common/aarch64/dct-prim.h    |  2 +-
> source/common/bitstream.cpp         |  2 +-
> source/common/dct.cpp               |  4 +--
> source/common/ppc/dct_altivec.cpp   |  2 +-
> source/common/quant.cpp             |  8 ++---
> source/common/threading.h           | 18 ++++++----
> source/common/threadpool.cpp        | 10 +++---
> source/common/wavefront.cpp         |  2 +-
> source/common/x86/pixel-util8.asm   |  4 +--
> source/encoder/entropy.cpp          | 10 +++---
> source/encoder/frameencoder.cpp     |  4 +--
> source/encoder/slicetype.cpp        |  2 +-
> source/encoder/weightPrediction.cpp |  2 +-
> source/test/pixelharness.cpp        | 12 +++++--
> 15 files changed, 102 insertions(+), 35 deletions(-)
>
>--
>2.34.1
>
>_______________________________________________
>x265-devel mailing list
>x265-devel at videolan.org
>https://mailman.videolan.org/listinfo/x265-devel
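
To make the CLZ versus BSR distinction from the quoted cover letter concrete, here is a minimal C++ sketch. The helper names (count_leading_zeros, bit_scan_reverse, bit_scan_forward) are hypothetical and written in portable C++ purely for illustration; they are not the x265 macros, and the 32-bit operand width is an assumption. Each helper requires a non-zero input.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the x265 CLZ/CTZ macros.
// All three assume x != 0 (the result is undefined for zero, and the loops
// below would not terminate).

// Number of zero bits above the highest set bit of a 32-bit value.
static inline uint32_t count_leading_zeros(uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t mask = 0x80000000u; (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

// Index of the highest set bit (bit 0 is the least significant bit).
// For a 32-bit non-zero value this is 31 minus the leading-zero count.
static inline uint32_t bit_scan_reverse(uint32_t x)
{
    return 31u - count_leading_zeros(x);
}

// Index of the lowest set bit. For a non-zero value this is the same
// number as the count of trailing zeros.
static inline uint32_t bit_scan_forward(uint32_t x)
{
    uint32_t n = 0;
    while ((x & 1u) == 0)
    {
        x >>= 1;
        ++n;
    }
    return n;
}
```

For x = 0x28 (binary 101000), count_leading_zeros returns 26, while bit_scan_reverse returns 5 and bit_scan_forward returns 3. A macro that yields 5 for this input is therefore returning the highest set bit index rather than a leading-zero count, which is the naming mismatch the cover letter describes.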
