[x265] cmake: avoid different stack alignment for GCC in 32-bit Windows

Mateusz mateusz at msystem.waw.pl
Thu Jul 28 13:14:46 CEST 2016


> Thanks Mateusz. Is there any description of this bug in gcc 6.0 that you can share so that we can understand the bug before we commit this code?
> Pradeep.
On Win32, entry functions (main, new threads) start with a 4-byte-aligned stack. SSE code needs 16-byte stack alignment, so the stack has to be realigned at some point. GCC doesn't assume 4-byte stack alignment in every function: once it has switched to 16-byte alignment, the functions further down the call chain assume a 16-byte-aligned stack (this is faster). With inlined functions and many optimization options in play, it is easy for GCC to make a mistake about whether the stack is 16- or 4-byte aligned.
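
To make the alignment arithmetic concrete, here is a minimal C sketch. The helper names (misalignment, realign_down) and the sample addresses are made up for illustration and are not taken from x265 or GCC; the sketch only shows what "4-byte aligned but not 16-byte aligned" means for a stack address and what a realigning prologue effectively does to it.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes by which addr lies past the previous multiple of
   boundary; 0 means addr is aligned. boundary must be a power of two. */
static inline unsigned misalignment(uintptr_t addr, unsigned boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (unsigned)(addr & (uintptr_t)(boundary - 1));
}

/* What a realigning prologue effectively does to the stack pointer:
   round it DOWN to the boundary (the x86 stack grows downwards),
   comparable to "and esp, -16" for boundary == 16. */
static inline uintptr_t realign_down(uintptr_t sp, unsigned boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return sp & ~(uintptr_t)(boundary - 1);
}
```

For example, the (hypothetical) stack address 0x0028ff04 is fine for a 4-byte boundary but is 4 bytes off a 16-byte boundary; realign_down(0x0028ff04, 16) gives 0x0028ff00. A function that wrongly assumes its caller already did this step will place its SSE spill slots at misaligned addresses.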

All GCC versions are buggy: you can reproduce the stack problem in any of them (I've tested 4.9.3 through 7.0.0) by changing the default '-O3' optimization option:
--- a/source/CMakeLists.txt     Fri Jul 22 13:13:42 2016 +0530
+++ b/source/CMakeLists.txt     Thu Jul 28 12:44:22 2016 +0200
@@ -183,7 +183,7 @@
     elseif(X86 AND NOT X64)
         string(FIND "${CMAKE_CXX_FLAGS}" "-march" marchPos)
         if(marchPos LESS "0")
-            add_definitions(-march=i686)
+            add_definitions(-march=i686 -O2 -ftree-vectorize -fipa-cp-clone)
         endif()
     endif()
     if(ARM AND CROSS_COMPILE_ARM)

32-bit x265 compiled with this change crashes at:
Thread 5 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2828.0x54c]
0x0044f6ae in x265::SAO::saoLumaComponentParamDist(x265::SAOParam*, int, long long&, long long*, long long&) ()

The difference with GCC 6.1 is that it also crashes with the default '-O3' optimization option.

There are two basic ways to avoid these bugs:
1) use 4-byte stack alignment wherever possible and realign to 16 bytes only where it's needed (without complicated optimizations);
2) use 16-byte stack alignment as early as possible, i.e. realign the stack in the entry functions.

The patch with '-mpreferred-stack-boundary=2' makes 4-byte stack alignment the preferred one, so it simplifies stack realignment as in 1).
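
As a reading aid: GCC documents the argument of -mpreferred-stack-boundary as an exponent, so the stack is kept aligned to 2 raised to that number of bytes. The tiny C helper below (the name stack_boundary_bytes is made up here) just spells out that mapping.

```c
#include <assert.h>

/* Alignment in bytes requested by -mpreferred-stack-boundary=num:
   the option value is an exponent, so the boundary is 2^num bytes. */
static inline unsigned stack_boundary_bytes(unsigned num)
{
    assert(num < 32); /* keep the shift well defined */
    return 1u << num;
}
```

So stack_boundary_bytes(2) is 4 (the value used in the patch), while stack_boundary_bytes(4) is 16, the alignment SSE code needs.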

Point 2) is more efficient and more modern, so the patch should not change anything (it only applies when no -march is given) if the user sets
export CXXFLAGS="-march=pentium4 -mtune=generic"
before building 32-bit x265 with GCC.
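
Direction 2) could look roughly like the C sketch below. This is an assumption for illustration, not something the patch does: it uses GCC's x86 function attribute force_align_arg_pointer on a hypothetical thread entry function (thread_entry and local_is_16_aligned are invented names) so that the stack is realigned once, at the point where Win32 hands over a 4-byte-aligned stack. GCC also has a global option, -mstackrealign, that realigns the stack at entry.

```c
#include <stddef.h>
#include <stdint.h>

/* force_align_arg_pointer is a GCC attribute for x86 targets; expand
   to nothing elsewhere so this sketch still compiles. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_ON_ENTRY __attribute__((force_align_arg_pointer))
#else
#define REALIGN_ON_ENTRY
#endif

/* Hypothetical worker: returns 1 if a local that requests 16-byte
   alignment really landed on a 16-byte boundary, else 0. The address
   goes through a volatile so the check is done at run time instead
   of being folded away by the optimizer. */
static int local_is_16_aligned(void)
{
    _Alignas(16) unsigned char buf[16] = {0};
    volatile uintptr_t addr = (uintptr_t)buf;
    return (addr & 15u) == 0;
}

/* Hypothetical thread entry point: the attribute asks GCC to realign
   the stack in this function's prologue, so everything called from
   here can rely on 16-byte alignment. */
REALIGN_ON_ENTRY
static int thread_entry(void *arg)
{
    (void)arg;
    return local_is_16_aligned();
}
```

Calling thread_entry(NULL) should return 1 on a correctly working toolchain; a return value of 0 in a 32-bit Windows build would point at the kind of misaligned stack described above.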

Mateusz


> On Thu, Jul 28, 2016 at 1:26 AM, Mateusz <mateusz at msystem.waw.pl> wrote:
>
>     > Since you just want stack alignment, -mpreferred-stack-boundary is the best choice.
>     > -msse2 would allow the compiler to generate SSE2 code automatically; since we use C++ code, it is easy to run into compiler bugs, so I avoid options with such side effects.
>     > --
>     > Min
>
>     OK, I've prepared a new patch that adds -mpreferred-stack-boundary only for GCC 6, so it doesn't slow down working versions of GCC.
>
>     # HG changeset patch
>     # User Ma0 <mateuszb at poczta.onet.pl>
>     # Date 1469648840 -7200
>     #      Wed Jul 27 21:47:20 2016 +0200
>     # Node ID 83e4505f87c0cbc0cf8869f9277fded426524734
>     # Parent  5a0e139e29386ecebafc9c555aedcd3e0f61c70c
>     cmake: fix unaligned stack in 32-bit Windows for GCC 6
>
>     diff -r 5a0e139e2938 -r 83e4505f87c0 source/CMakeLists.txt
>     --- a/source/CMakeLists.txt    Fri Jul 22 13:13:42 2016 +0530
>     +++ b/source/CMakeLists.txt    Wed Jul 27 21:47:20 2016 +0200
>     @@ -184,6 +184,10 @@
>              string(FIND "${CMAKE_CXX_FLAGS}" "-march" marchPos)
>              if(marchPos LESS "0")
>                  add_definitions(-march=i686)
>     +            if(WIN32 AND NOT INTEL_CXX AND NOT CLANG AND
>     +               CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 6.0 AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7.0)
>     +                add_definitions(-mpreferred-stack-boundary=2)
>     +            endif()
>              endif()
>          endif()
>          if(ARM AND CROSS_COMPILE_ARM)
>
>     _______________________________________________
>     x265-devel mailing list
>     x265-devel at videolan.org
>     https://mailman.videolan.org/listinfo/x265-devel


