[x265] [PATCH 01/11] Avoid aliasing HBD SSE_PP functions for AArch64 platforms

Tue Dec 10 16:00:20 UTC 2024

Avoid using the AArch64 sse_ss implementation for sse_pp in high
bit-depth builds. Although both functions take 16-bit parameters,
sse_ss takes signed values whereas sse_pp takes unsigned ones. Because
of this difference, subsequent patches add separate AArch64
implementations for the two functions. The signedness determines the
maximum value the SSE can reach and, in turn, the number of
accumulator registers required.

Change-Id: I21ac7249f8968eedca2da39f7cd11b6489bac6cc
---
 source/common/primitives.cpp | 2 ++
 1 file changed, 2 insertions(+)
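
The point about signedness and accumulator width can be illustrated
with a little arithmetic. The sketch below is not x265 code:
maxSquaredDiff and termsPerU32Lane are hypothetical helpers, and the
sample ranges are assumptions (10-bit and 12-bit unsigned pixels for
sse_pp, the full int16_t range for sse_ss). It only shows how many
worst-case squared differences fit in one unsigned 32-bit accumulator
lane before it could overflow.

```cpp
#include <cstdint>

// Hypothetical helper (not part of x265): largest possible squared
// difference between two samples that both lie in [lo, hi].
constexpr uint64_t maxSquaredDiff(int64_t lo, int64_t hi)
{
    return static_cast<uint64_t>((hi - lo) * (hi - lo));
}

// Hypothetical helper: number of worst-case squared differences that can
// be summed in one unsigned 32-bit lane without overflowing it.
constexpr uint64_t termsPerU32Lane(uint64_t maxSq)
{
    return UINT32_MAX / maxSq;
}

// Assumed ranges: unsigned pixels at 10 and 12 bits, signed 16-bit values.
constexpr uint64_t kPp10 = maxSquaredDiff(0, (1 << 10) - 1);
constexpr uint64_t kPp12 = maxSquaredDiff(0, (1 << 12) - 1);
constexpr uint64_t kSs16 = maxSquaredDiff(INT16_MIN, INT16_MAX);

// Unsigned 12-bit pixels: each term stays below 2^24, so hundreds of
// terms fit in a 32-bit lane.
static_assert(kPp12 == 16769025, "4095 squared");
static_assert(termsPerU32Lane(kPp12) == 256, "12-bit pixel headroom");

// Unsigned 10-bit pixels leave even more headroom.
static_assert(kPp10 == 1046529, "1023 squared");
static_assert(termsPerU32Lane(kPp10) == 4104, "10-bit pixel headroom");

// Full-range signed 16-bit inputs: a single term nearly fills the lane.
static_assert(kSs16 == 4294836225ULL, "65535 squared");
static_assert(termsPerU32Lane(kSs16) == 1, "no headroom for a second term");
```

Under these assumptions an unsigned-pixel kernel could keep summing
into narrow lanes for many iterations, while a kernel that must accept
arbitrary signed 16-bit inputs would have to widen almost immediately,
which is one plausible reading of why the two primitives get separate
implementations.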

diff --git a/source/common/primitives.cpp b/source/common/primitives.cpp
index 4db706d74..83ebc455e 100644
--- a/source/common/primitives.cpp
+++ b/source/common/primitives.cpp
@@ -91,7 +91,9 @@ void setupAliasPrimitives(EncoderPrimitives &p)
     /* at HIGH_BIT_DEPTH, pixel == short so we can alias many primitives */
     for (int i = 0; i < NUM_CU_SIZES; i++)
     {
+#if !defined(X265_ARCH_ARM64)
         p.cu[i].sse_pp = (pixel_sse_t)p.cu[i].sse_ss;
+#endif
 
         p.cu[i].copy_ps = (copy_ps_t)p.pu[i].copy_pp;
         p.cu[i].copy_sp = (copy_sp_t)p.pu[i].copy_pp;
-- 
2.39.5 (Apple Git-154)

-------------- next part --------------
From 10709c48d63598f106324e056a82794beb76886f Mon Sep 17 00:00:00 2001
Message-Id: <10709c48d63598f106324e056a82794beb76886f.1733846134.git.gerdazsejke.more at arm.com>
In-Reply-To: <cover.1733846134.git.gerdazsejke.more at arm.com>
References: <cover.1733846134.git.gerdazsejke.more at arm.com>
From: Gerda Zsejke More <gerdazsejke.more at arm.com>
Date: Wed, 4 Dec 2024 08:57:09 +0100
Subject: [PATCH 01/11] Avoid aliasing HBD SSE_PP functions for AArch64
 platforms
