[x264-devel] Failure to build x264 with ASM on i386

Brad Smith brad at comstyle.com
Sun Feb 23 03:54:48 CET 2014


On 17/02/14 7:43 PM, Brad Smith wrote:
> On 04/02/14 5:04 PM, Loren Merritt wrote:
>> On Tue, 4 Feb 2014, Martin Storsjö wrote:
>>> On Mon, 3 Feb 2014, Dimitry Andric wrote:
>>>> On 03 Feb 2014, at 18:39, Loren Merritt <lorenm at u.washington.edu> wrote:
>>>> ...
>>>>> Otoh, gcc works with -fPIC. I can confirm that 5 registers is enough
>>>>> for the inline asm blocks in question. If clang thinks they need more
>>>>> than 5 registers, that's a bug in clang's register allocator.
>>>>
>>>> No, gcc uses 7 registers.  For example, with -fPIC and gcc 4.8, the
>>>> allocation for x264_predictor_clip_mmx2() is as follows:
>>>>
>>>> %0 = %eax
>>>> %1 = %edx
>>>> %2 = %ecx
>>>> %4 = %ebp
>>>> %5 = %esi
>>>> %6 = %edi
>>>> %7 = %ebx
>>
>> I tried with gcc-4.6.3 -fPIC -fno-omit-frame-pointer
>> -mpreferred-stack-boundary=2
>> (Although as far as register pressure goes, unaligned stack just forces
>> it to use a frame-pointer, and thus isn't any worse than
>> -fno-omit-frame-pointer alone.)
>>
>> %0 = %ecx
>> %1 = nothing (or you could call it the same reg as %5)
>> %2 = %eax
>> %3 = %edx
>> %4 = %edi
>> %5 = %esi
>> %6 = on the stack
>> %7 = %ebx (this is the same reg that -fPIC itself reserves, and thus
>>       doesn't count against the 5 regs left after PIC.)
>> %8 = nothing (or you could call it the same reg as %0)
>>
>>>> The reason is that gcc assumes a 16 byte stack alignment on i386, which
>>>> is only valid for Linux after ~2006, not most BSDs.  If you force gcc to
>>>> assume a 4 byte stack alignment, it also cannot compile the inline
>>>> assembly:
>>>
>>> FWIW, the similar cases within libav are handled by adding
>>> __attribute__((force_align_arg_pointer)) to all public entry points into
>>> the libraries, which adds a special prologue to these functions that
>>> realigns the stack to 16 bytes, and adding -mincoming-stack-boundary=4 to
>>> the cflags, telling the compiler to assume a 16 byte aligned stack in all
>>> functions, so only the public ones need to care about fixing the alignment.
>>
>> x264 does in fact realign the stack at all entrypoints (using yasm rather
>> than force_align_arg_pointer since we added that feature before
>> force_align_arg_pointer existed). And yes we do that so that we can
>> tell the compiler to assume aligned stack even if the OS doesn't provide
>> it. I didn't previously know that about BSD, but Win32 also doesn't.
>>
>> --Loren Merritt
>
> Still looking for any possible options that could result in being able
> to build x264 on i386. I haven't seen any suggestions yet that would
> help.

ping.
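
For reference, a minimal sketch of the libav-style approach Martin
describes above: a public entry point marked with
__attribute__((force_align_arg_pointer)) so that its prologue realigns
the stack, whatever alignment the caller provided. The function name
and the ENTRY_REALIGN macro are hypothetical (they are not x264 or
libav identifiers), and the guard assumes gcc or clang on an x86
target. In a real build the rest of the code would be compiled with
-mincoming-stack-boundary=4, as in the quoted mail.

```c
#include <stdint.h>

/* force_align_arg_pointer is a gcc/clang attribute for x86 targets.
 * On any other target the macro expands to nothing so the sketch
 * still compiles. ENTRY_REALIGN is a made-up name for illustration. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define ENTRY_REALIGN __attribute__((force_align_arg_pointer))
#else
#define ENTRY_REALIGN
#endif

/* Hypothetical public entry point. It returns the offset of a local
 * buffer from a 16 byte boundary, which should be 0 once the stack
 * has been realigned. */
ENTRY_REALIGN int example_entry_stack_misalignment(void)
{
    /* Request a 16 byte aligned local, as SIMD code typically needs. */
    volatile uint8_t buf[16] __attribute__((aligned(16)));
    buf[0] = 0;
    return (int)((uintptr_t)buf & 15);
}
```

Only the public entry points would carry the attribute; internal
functions rely on the alignment already established by their callers.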




