[x264-devel] [Git][videolan/x264][master] aarch64: Use regular hwcaps flags instead of HWCAP_CPUID for CPU feature detection on Linux
Anton Mitrofanov (@BugMaster)
gitlab at videolan.org
Wed Feb 28 22:32:24 UTC 2024
Anton Mitrofanov pushed to branch master at VideoLAN / x264
Commits:
be4f0200 by Martin Storsjö at 2024-02-28T22:26:17+00:00
aarch64: Use regular hwcaps flags instead of HWCAP_CPUID for CPU feature detection on Linux
This makes the code much simpler (especially for adding support
for other instruction set extensions), avoids needing inline
assembly for this feature, and is generally the more canonical
way to do this.
The CPU feature detection was added in
9c3c71688226fbb23f4d36399fab08f018e760b0, using HWCAP_CPUID.
The argument for that approach was that HWCAP_CPUID was added to the
kernel early (in Linux v4.11), while the HWCAP flags for individual
features always arrive later. This allows detecting support for new
CPU extensions before the kernel exposes information about them via
hwcap flags.
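For reference, the register-based check that HWCAP_CPUID enabled comes down to decoding 4-bit ID register fields. The following is a minimal sketch on plain integers, so it runs on any host; the helper names are hypothetical, and the field positions are taken from the code removed in this commit (SVE at bits 35:32 of ID_AA64PFR0_EL1, SVEver at bits 3:0 of ID_AA64ZFR0_EL1).

```c
#include <stdint.h>

/* Hypothetical helper: extract the 4-bit ID register field that starts
 * at bit position `shift`. */
static unsigned example_id_field( uint64_t reg, unsigned shift )
{
    return (unsigned)( ( reg >> shift ) & 0xf );
}

/* Mirrors the removed check: SVE field of ID_AA64PFR0_EL1 equals 0x1. */
static int example_has_sve( uint64_t id_aa64pfr0 )
{
    return example_id_field( id_aa64pfr0, 32 ) == 0x1;
}

/* Mirrors the removed nested check: SVE2 is only reported when SVE is
 * present and the SVEver field of ID_AA64ZFR0_EL1 equals 0x1. */
static int example_has_sve2( uint64_t id_aa64pfr0, uint64_t id_aa64zfr0 )
{
    return example_has_sve( id_aa64pfr0 )
        && example_id_field( id_aa64zfr0, 0 ) == 0x1;
}
```

On real hardware the register values would come from an `mrs` read, which is the inline assembly this commit removes.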
However, in practice there is probably little advantage in this.
E.g. HWCAP_SVE was added in Linux v4.15, and HWCAP2_SVE2 was added in
v5.10; both are later than HWCAP_CPUID, but there are probably very few
practical cases where one would run a kernel older than that on a CPU
that supports those instructions.
Additionally, we provide our own definitions of the flag values to
check (as they are fixed constants anyway), with names not conflicting
with the ones from system headers. This reduces the number of ifdefs
needed, and allows detecting those features even if building with
userland headers that are lacking the definitions of those flags.
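As an illustration of why private constants remove the need for ifdefs, here is a minimal sketch of the flag mapping as a pure function. The EXAMPLE_ names and the output flag values are hypothetical stand-ins (the commit itself uses HWCAP_AARCH64_SVE, HWCAP2_AARCH64_SVE2 and the X264_CPU_* flags); the bit positions are the ones from this commit's diff.

```c
#include <stdint.h>

/* Privately named bit masks, so they cannot collide with HWCAP_SVE or
 * HWCAP2_SVE2 from system headers, and work even when those headers
 * lack the definitions. Bit positions as in this commit's diff. */
#define EXAMPLE_HWCAP_SVE   (1UL << 22)
#define EXAMPLE_HWCAP2_SVE2 (1UL << 1)

/* Hypothetical output flags standing in for X264_CPU_SVE/X264_CPU_SVE2. */
#define EXAMPLE_CPU_SVE  0x1u
#define EXAMPLE_CPU_SVE2 0x2u

/* Map the two auxiliary-vector words to feature flags. */
static uint32_t example_flags_from_hwcaps( unsigned long hwcap,
                                           unsigned long hwcap2 )
{
    uint32_t flags = 0;
    if( hwcap & EXAMPLE_HWCAP_SVE )
        flags |= EXAMPLE_CPU_SVE;
    if( hwcap2 & EXAMPLE_HWCAP2_SVE2 )
        flags |= EXAMPLE_CPU_SVE2;
    return flags;
}
```

Supporting a further extension in this style would take one more constant and one more `if`, with no dependency on the age of the userland headers.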
Also, slightly older versions of QEMU, e.g. 6.2 in Ubuntu 22.04,
do expose support for these features via HWCAP flags, but the
emulated cpuid registers are missing the bits for exposing e.g. SVE2.
(This issue is fixed in later versions of QEMU, though.)
Also drop the ifdef check for whether AT_HWCAP is defined; it was
added to glibc in 1997. AT_HWCAP2 was added in 2013, in glibc 2.18,
which also predates aarch64 being commonly used, so don't guard
its use with an ifdef either.
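A sketch of the resulting unguarded runtime query, assuming glibc-style `getauxval()` from `<sys/auxv.h>`. The function and EXAMPLE_ names are hypothetical. The platform guard is an addition for this sketch only, so that it compiles and returns 0 on other hosts; the real code already sits inside an aarch64 Linux section.

```c
#include <stdint.h>
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#endif

/* Hypothetical names; bit positions as in this commit's diff. */
#define EXAMPLE_HWCAP_SVE   (1UL << 22)
#define EXAMPLE_HWCAP2_SVE2 (1UL << 1)
#define EXAMPLE_CPU_SVE     0x1u
#define EXAMPLE_CPU_SVE2    0x2u

static uint32_t example_detect_flags( void )
{
    uint32_t flags = 0;
#if defined(__linux__) && defined(__aarch64__)
    /* getauxval() returns 0 when the requested entry is not found, so a
     * kernel that does not supply AT_HWCAP2 simply yields no flags. */
    unsigned long hwcap  = getauxval( AT_HWCAP );
    unsigned long hwcap2 = getauxval( AT_HWCAP2 );

    if( hwcap & EXAMPLE_HWCAP_SVE )
        flags |= EXAMPLE_CPU_SVE;
    if( hwcap2 & EXAMPLE_HWCAP2_SVE2 )
        flags |= EXAMPLE_CPU_SVE2;
#endif
    return flags;
}
```

Because a missing entry reads as 0, no compile-time or run-time guard is needed around the AT_HWCAP2 lookup itself.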
- - - - -
1 changed file:
- common/cpu.c
Changes:
=====================================
common/cpu.c
=====================================
@@ -423,32 +423,19 @@ uint32_t x264_cpu_detect( void )
 #ifdef __linux__
 #include <sys/auxv.h>
-#define get_cpu_feature_reg( reg, val ) \
-    __asm__( "mrs %0, " #reg : "=r" ( val ) )
+#define HWCAP_AARCH64_SVE (1 << 22)
+#define HWCAP2_AARCH64_SVE2 (1 << 1)
 static uint32_t detect_flags( void )
 {
     uint32_t flags = 0;
-#if defined( AT_HWCAP ) && defined( HWCAP_CPUID )
     unsigned long hwcap = getauxval( AT_HWCAP );
-    if ( hwcap & HWCAP_CPUID ) {
-        // We could check for support directly with HWCAP_SVE and HWCAP2_SVE2,
-        // but those were added into headers much later. By using direct
-        // register access, we can detect these features even if compiled with
-        // slightly older userland headers.
-        // https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html
-        uint64_t tmp;
-        get_cpu_feature_reg( ID_AA64PFR0_EL1, tmp );
-        if ( ( ( tmp >> 32 ) & 0xf ) == 0x1 ) {
-            flags |= X264_CPU_SVE;
-
-            get_cpu_feature_reg( S3_0_C0_C4_4, tmp ); // ID_AA64ZFR0_EL1
-            if ( ( ( tmp >> 0 ) & 0xf ) == 0x1 )
-                flags |= X264_CPU_SVE2;
-        }
-    }
-#endif
+    unsigned long hwcap2 = getauxval( AT_HWCAP2 );
+    if ( hwcap & HWCAP_AARCH64_SVE )
+        flags |= X264_CPU_SVE;
+    if ( hwcap2 & HWCAP2_AARCH64_SVE2 )
+        flags |= X264_CPU_SVE2;
     return flags;
 }
View it on GitLab: https://code.videolan.org/videolan/x264/-/commit/be4f0200ed007c466fd96185c39cde2a2d60ef50