[vlc-devel] [PATCH 1/3] packetizer: startcode_helper: prefer intrinsics
Rémi Denis-Courmont
remi at remlab.net
Thu Sep 10 22:35:59 CEST 2020
Le torstaina 10. syyskuuta 2020, 23.18.19 EEST Francois Cartegnie a écrit :
> There's no universal cross compiler way to store
> a vector constant have dependency listed in the next
> assembly block.
There are no ways, period. I don't think any compiler provides any guarantees
that registers are preserved across assembler blocks.
> With pure assembly, there's risk
> of clobbering constant if the compiler auto vector
> the TRY_MATCH sections.
I don't get this. If you need a value from one assembler block to another, you
need to assign it as an output of the first block, and an input (or output/
input) of the following blocks.
That's not a reason as such to switch to/from inline assembler or intrinsics.
> ---
> modules/packetizer/startcode_helper.h | 24 ++++++++++--------------
> 1 file changed, 10 insertions(+), 14 deletions(-)
>
> diff --git a/modules/packetizer/startcode_helper.h
> b/modules/packetizer/startcode_helper.h index 2b61e5cf98..fd7b8249d2 100644
> --- a/modules/packetizer/startcode_helper.h
> +++ b/modules/packetizer/startcode_helper.h
> @@ -22,7 +22,7 @@
>
> #include <vlc_cpu.h>
>
> -#if !defined(CAN_COMPILE_SSE2) && defined(HAVE_SSE2_INTRINSICS)
> +#if defined(HAVE_SSE2_INTRINSICS)
> #include <emmintrin.h>
> #endif
>
> @@ -63,30 +63,26 @@ static inline const uint8_t * startcode_FindAnnexB_SSE2(
> const uint8_t *p, const alignedend = end - ((intptr_t) end & 15);
> if( alignedend > p )
> {
> -#ifdef CAN_COMPILE_SSE2
> - asm volatile(
> - "pxor %%xmm1, %%xmm1\n"
> - ::: "xmm1"
> - );
> -#else
> - __m128i zeros = _mm_set1_epi8( 0x00 );
> +#ifdef HAVE_SSE2_INTRINSICS
> + __m128i zeros = _mm_set1_epi8( 0x00 );
> #endif
> for( ; p < alignedend; p += 16)
> {
> uint32_t match;
> -#ifdef CAN_COMPILE_SSE2
> +#ifdef HAVE_SSE2_INTRINSICS
> + __m128i v = _mm_load_si128((__m128i*)p);
> + __m128i res = _mm_cmpeq_epi8( zeros, v );
> + match = _mm_movemask_epi8( res ); /* mask will be in reversed
> match order */ +#else
> asm volatile(
> + "pxor %%xmm1, %%xmm1\n"
> "movdqa 0(%[v]), %%xmm0\n"
> "pcmpeqb %%xmm1, %%xmm0\n"
> "pmovmskb %%xmm0, %[match]\n"
>
> : [match]"=r"(match)
> : [v]"r"(p)
>
> - : "xmm0"
> + : "xmm0", "xmm1"
> );
> -#else
> - __m128i v = _mm_load_si128((__m128i*)p);
> - __m128i res = _mm_cmpeq_epi8( zeros, v );
> - match = _mm_movemask_epi8( res ); /* mask will be in reversed
> match order */ #endif
> if( match & 0x000F )
> TRY_MATCH(p, 0);
--
雷米‧德尼-库尔蒙
http://www.remlab.net/
More information about the vlc-devel
mailing list