[x264-devel] Failure to build x264 with ASM on i386

BugMaster BugMaster at narod.ru
Sun Feb 23 20:56:42 CET 2014


On Sat, 22 Feb 2014 21:54:48 -0500, Brad Smith wrote:
> On 17/02/14 7:43 PM, Brad Smith wrote:
>> On 04/02/14 5:04 PM, Loren Merritt wrote:
>>> On Tue, 4 Feb 2014, Martin Storsjö wrote:
>>>> On Mon, 3 Feb 2014, Dimitry Andric wrote:
>>>>> On 03 Feb 2014, at 18:39, Loren Merritt <lorenm at u.washington.edu>
>>>>> wrote:
>>>>> ...
>>>>>> Otoh, gcc works with -fPIC. I can confirm that 5 registers is enough
>>>>>> for the inline asm blocks in question. If clang thinks they need more
>>>>>> than 5 registers, that's a bug in clang's register allocator.
>>>>>
>>>>> No, gcc uses 7 registers.  For example, with -fPIC and gcc 4.8, the
>>>>> allocation for x264_predictor_clip_mmx2() is as follows:
>>>>>
>>>>> %0 = %eax
>>>>> %1 = %edx
>>>>> %2 = %ecx
>>>>> %4 = %ebp
>>>>> %5 = %esi
>>>>> %6 = %edi
>>>>> %7 = %ebx
>>>
>>> I tried with gcc-4.6.3 -fPIC -fno-omit-frame-pointer
>>> -mpreferred-stack-boundary=2
>>> (Although as far as register pressure goes, unaligned stack just forces
>>> it to use a frame-pointer, and thus isn't any worse than
>>> -fno-omit-frame-pointer alone.)
>>>
>>> %0 = %ecx
>>> %1 = nothing (or you could call it the same reg as %5)
>>> %2 = %eax
>>> %3 = %edx
>>> %4 = %edi
>>> %5 = %esi
>>> %6 = on the stack
>>> %7 = %ebx (this is the same reg that -fPIC itself reserves, and thus
>>>       doesn't count against the 5 regs left after PIC.)
>>> %8 = nothing (or you could call it the same reg as %0)
>>>
>>>>> The reason is that gcc assumes a 16 byte stack alignment on i386, which
>>>>> is only valid for Linux after ~2006, not most BSDs.  If you force gcc to
>>>>> assume a 4 byte stack alignment, it also cannot compile the inline
>>>>> assembly:
>>>>
>>>> FWIW, the similar cases within libav are handled by adding
>>>> __attribute__((force_align_arg_pointer)) to all public entry points into
>>>> the libraries, which adds a special prologue to these functions that
>>>> realign the stack to 16 bytes, and adding -mincoming-stack-boundary=4 to
>>>> the cflags, telling the compiler to assume a 16 byte aligned stack in all
>>>> functions, so only the public ones need to care about fixing the
>>>> alignment.
>>>
>>> x264 does in fact realign the stack at all entrypoints (using yasm rather
>>> than force_align_arg_pointer since we added that feature before
>>> force_align_arg_pointer existed). And yes we do that so that we can
>>> tell the compiler to assume aligned stack even if the OS doesn't provide
>>> it. I didn't previously know that about BSD, but Win32 also doesn't.
>>>
>>> --Loren Merritt
>>
>> Still looking for any possible options that could result in being able
>> to build x264 on i386. I haven't seen any suggestions yet that would
>> help.

> ping.

Loren Merritt already answered your question, so you will have to
"give up" on one of the following:
 - give up on --enable-pic, since x264's asm optimizations for i386
 are not PIC anyway and will have textrels (even if you did not hit
 this compilation error);
 - give up on asm and use --disable-asm if, for whatever reason, you
 need PIC no matter what;
 - give up on i386 and build for x86_64 if you need both PIC and asm.

P.S. You can also give up on clang and use gcc, but that wouldn't
change the fact that the resulting library would not really be PIC.
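
For reference, the force_align_arg_pointer approach Martin describes in
the quoted text can be sketched roughly as below. This is only an
illustration, not code from x264 or libav: PUBLIC_ENTRY and
example_entry_point() are made-up names, and it assumes gcc or clang.
It does nothing about the textrels problem mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper macro (not from x264 or libav). The attribute is an
 * x86-specific GCC/clang extension, so it expands to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define PUBLIC_ENTRY __attribute__((force_align_arg_pointer))
#else
#define PUBLIC_ENTRY
#endif

/* Hypothetical public entry point. On x86 the attribute asks the compiler
 * to emit a prologue that realigns the stack, so callers that only
 * guarantee 4-byte alignment are still safe.
 * Returns 1 if the 16-byte-aligned local really is 16-byte aligned. */
PUBLIC_ENTRY int example_entry_point(void)
{
    volatile unsigned char buf[16] __attribute__((aligned(16)));
    buf[0] = 0;
    return ((uintptr_t)buf % 16) == 0;
}
```

In the libav setup described in the quote, the remaining (internal)
functions are then built with -mincoming-stack-boundary=4 so the compiler
may assume a 16 byte aligned stack everywhere else.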


