[x265] ASM crash in r6706

Jason Garrett-Glaser jason at x264.com
Thu Apr 17 02:05:22 CEST 2014


On Wed, Apr 16, 2014 at 11:08 AM, Steve Borho <steve at borho.org> wrote:
> On Wed, Apr 16, 2014 at 12:57 PM, Steve Borho <steve at borho.org> wrote:
>> On Wed, Apr 16, 2014 at 12:28 PM, Jason Garrett-Glaser <jason at x264.com> wrote:
>>>> x264 manually aligns their stack (see x264_stack_align())
>>>
>>> Note that this is because we don't require gcc's
>>> force_arg_align_stack_pointer intrinsic (added in 4.2). The stack
>>> isn't guaranteed to be aligned on thread entry or library entry on
>>> x86_32, so even if gcc is told to keep stack alignment, it can only
>>> keep stack alignment relative to the entry point, hence we have to
>>> manually align the entry point.
>>>
>>> Note that this is important even if you don't use x264asm, because
>>> using alignment intrinsics on stack-allocated arrays does not work if
>>> the entry point hasn't been aligned.
>>>
>>> x264asm's stack alignment /should/ align relative to the values being
>>> pushed onto the stack -- if it isn't, something's probably very wrong.
>>> I'm going to assume it's not broken, since it's used for a good bit of
>>> ffmpeg and x264 code, but if it is broken, poke the x264 mailing list
>>> with assembly code and the (wrong) assembled output.
>>
>> Would it be appropriate to simply add "-mpreferred-stack-boundary=4"
>> for x86_32 compiles if GCC is new enough to support it, or set
>> HAVE_ALIGNED_STACK=0 if it is too old?
>
> Oh, your point about library entry points finally has sunk in.  We
> need to either enforce stack alignment internally on all our thread or
> API entry points on x86_32 builds, or we have to set
> HAVE_ALIGNED_STACK=0 unconditionally for those builds.

That's not correct either.  HAVE_ALIGNED_STACK=0 is still not
sufficient -- all your stack-declared aligned arrays will be
misaligned and things may crash anyways. You must enforce stack
alignment at all thread starts and entry points (well, only the ones
that actually lead to asm code -- in x264 this is only like two
places, so it's not a huge deal!). osdep.h in x264 covers stack
alignment for stack-declared arrays which require higher alignment
than the stack's native alignment.

Note something important:

1.  The way gcc enforces stack alignment for aligned elements on the
stack is to keep the stack aligned between functions, then align
elements on the stack relative to the aligned stack. This fits with
the Linux userspace convention that stacks should be 16-byte aligned
on x86_32.

2.  On ICC (and I presume MSVC), instead, data on the stack is aligned
explicitly, without making assumptions about the stack pointer's
alignment. This is slower, but works without such a convention.

So this is a non-MS compiler issue only. Note this comes up double
e.g. on ARM, where the compiler does not even support anything above
8-byte alignment, or worse, 4-byte alignment on iOS's ABI. This is why
osdep.h has those workarounds.

Jason


More information about the x265-devel mailing list