[vlc-devel] [PATCH] avx2 work re-submission
Lyndon Brown
jnqnfe at gmail.com
Wed Oct 14 22:24:04 CEST 2020
With CI now enabled, a couple of issues have been encountered on some
systems.
1) The following error occurs on android-x86, android-x86_64, macos and
ios-simulator-x86_64:
../../modules/packetizer/startcode_helper.h:91:41: error: invalid input size for constraint 'x'
: [v]"r"(p), [czero]"x"(zeros)
https://code.videolan.org/jnqnfe/vlc/-/pipelines/26244
This is in the AVX2 counterpart of the SSE2 vector-sized asm variant. The
gcc docs say 'x' is for an SSE register (xmm), but there's no clearly
documented constraint for an AVX (ymm) register, and I've seen suggestions
that 'x' is also correct for ymm. It works fine on some platforms, so why
not on these?
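For reference, here is a minimal sketch of the operand setup in question,
reduced to the 128-bit case where 'x' is uncontroversial. It is not the code
from startcode_helper.h: the function name zero_byte_mask and the portable
fallback are hypothetical and only there for illustration. The failing AVX2
variant binds a 256-bit value to the same [czero]"x"(...) operand.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__SSE2__)
#include <emmintrin.h>

/* Hypothetical helper: returns a 16-bit mask with bit i set when p[i] == 0.
 * The zero vector is passed through the 'x' constraint, as in the failing
 * line, but here it is a 128-bit __m128i, so it fits an xmm register. */
static int zero_byte_mask(const uint8_t *p)
{
    __m128i zeros = _mm_setzero_si128();
    int mask;
    __asm__ (
        "movdqu   (%[v]), %%xmm1\n"      /* load 16 unaligned bytes       */
        "pcmpeqb  %[czero], %%xmm1\n"    /* 0xFF in every byte equal to 0 */
        "pmovmskb %%xmm1, %[m]\n"        /* collect the byte sign bits    */
        : [m]"=r"(mask)
        : [v]"r"(p), [czero]"x"(zeros)
        : "xmm1", "memory");
    return mask;
}
#else
/* Portable fallback so the sketch still builds on other targets. */
static int zero_byte_mask(const uint8_t *p)
{
    int mask = 0;
    for (int i = 0; i < 16; i++)
        if (p[i] == 0)
            mask |= 1 << i;
    return mask;
}
#endif
```

Whether a given compiler accepts a 256-bit operand for 'x' is exactly the
open question above; this sketch only shows the shape of the constraint
list.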
I was stuck and ended up just disabling this code in a new commit pushed
to the tree, so the non-vector-size based asm is used instead.
2) With that workaround in place, one further issue crops up, failing
android-x86 only:
../../modules/video_filter/deinterlace/algo_x.c:198:17: error: inline assembly requires more registers than available
"movd %2, %%xmm0\n"
^
https://code.videolan.org/jnqnfe/vlc/-/pipelines/26319
Firstly, note that my work includes ditching a messy asm construction
layer used in deinterlace, which created individual asm blocks for
individual instructions. The asm is thus now grouped into proper asm
blocks, which is likely a factor here in terms of the number of registers
used per block.
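To illustrate the register counting, here is a minimal sketch of a grouped
block. It is not the deinterlace code: the function name sum_low_lanes and
its operands are hypothetical. In a single block, every "r" input has to
sit in a general-purpose register at the same time, whereas a constraint
such as "rm" also lets the compiler leave an operand in memory. Whether
that would be enough to satisfy clang on android-x86 is an assumption I
have not tested.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__SSE2__)
/* Hypothetical grouped block: three inputs and one output in one asm
 * statement. "rm" allows each input to be a register or a memory
 * operand, so the block does not demand three free GPRs at once. */
static uint32_t sum_low_lanes(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t out;
    __asm__ (
        "movd  %1, %%xmm0\n"
        "movd  %2, %%xmm1\n"
        "movd  %3, %%xmm2\n"
        "paddd %%xmm1, %%xmm0\n"   /* xmm0 = a + b (low lane)     */
        "paddd %%xmm2, %%xmm0\n"   /* xmm0 = a + b + c (low lane) */
        "movd  %%xmm0, %0\n"
        : "=r"(out)
        : "rm"(a), "rm"(b), "rm"(c)
        : "xmm0", "xmm1", "xmm2");
    return out;
}
#else
/* Portable fallback so the sketch still builds on other targets. */
static uint32_t sum_low_lanes(uint32_t a, uint32_t b, uint32_t c)
{
    return a + b + c;
}
#endif
```

With one asm statement per instruction, each statement only needs its own
operands live, which is presumably why the old construction layer never
hit this limit.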
It seems that the error may relate to the Android NDK switching to
clang and -mstackrealign being passed to clang
(https://github.com/android/ndk/issues/635), though they later reworked
things to possibly no longer need that
(https://android-review.googlesource.com/c/platform/bionic/+/615665/).
I'm wondering whether we could move to a newer NDK and whether that would
solve things. There were hints in some discussions that this helped. It
would certainly be preferable to trying to completely rework the code.