AFAIR, AC_SUBST invocation should not be conditional.
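To illustrate what I mean (a rough, untested sketch reusing the names from the patch below, not a drop-in hunk): substitute the variables unconditionally and keep only the AC_DEFINE behind the conditional:

    dnl Use nasm/yasm only on x86
    AC_CHECK_PROGS(X86ASM, [nasm yasm])
    AM_CONDITIONAL([HAVE_X86ASM], [test -n "${X86ASM}" && test -n "${X86ASMFLAGS}"])
    AM_COND_IF([HAVE_X86ASM], [
      AC_DEFINE([HAVE_X86ASM], [1], [Use external asm on x86.])])
    AC_SUBST([X86ASMFLAGS])
    AC_SUBST([X86ASMDEFS])

That way @X86ASMFLAGS@ and @X86ASMDEFS@ still get substituted (as empty strings) in the generated Makefiles even when no assembler is found. The sketch also writes ${X86ASMFLAGS} with the $ that the patch's test -n "{X86ASMFLAGS}" seems to be missing.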
<pre class="k9mail">x86inc.asm copied from dav1d (8c5d34c85613) and x86util.asm from libav<br>(994c4bc10751). Libav's LGPL licensed x86util.asm is required for yadif.<br><br>This reverts "Remove unused support for .asm files"<br>commit 6c0f63cd6853c0d184a5abbf2e19c1626d2854ef.<hr> configure.ac | 28 +<br> extras/include/x86/x86inc.asm | 1742 ++++++++++++++++++++++++++++++++<br> extras/include/x86/x86util.asm | 705 +++++++++++++<br> modules/common.am | 5 +-<br> 4 files changed, 2479 insertions(+), 1 deletion(-)<br> create mode 100644 extras/include/x86/x86inc.asm<br> create mode 100644 extras/include/x86/x86util.asm<br><br>diff --git a/configure.ac b/configure.ac<br>index a2b8ade789..96d13fa1d2 100644<br>--- a/configure.ac<br>+++ b/configure.ac<br>@@ -95,6 +95,19 @@ HAVE_IOS="0"<br> HAVE_OSX="0"<br> HAVE_TVOS="0"<br> <br>+dnl Set x86 asm flags and defines<br>+X86ASMFLAGS=""<br>+case "${host_cpu}" in<br>+ i?86)<br>+ X86ASMFLAGS="-f elf32"<br>+ X86ASMDEFS="-DARCH_X86_32=1 -DARCH_X86_64=0"<br>+ ;;<br>+ x86_64)<br>+ X86ASMFLAGS="-f elf64"<br>+ X86ASMDEFS="-DARCH_X86_32=0 -DARCH_X86_64=1"<br>+ ;;<br>+esac<br>+<br> case "${host_os}" in<br> "")<br> SYS=unknown<br>@@ -132,6 +145,8 @@ case "${host_os}" in<br> case "${host_cpu}" in<br> i?86)<br> ARCH_flag="-arch i386"<br>+ X86ASMFLAGS="-f macho32"<br>+ X86ASMDEFS="${X86ASMDEFS} -DPREFIX"<br> ;;<br> ppc64*)<br> ARCH_flag="-arch ppc64"<br>@@ -141,6 +156,8 @@ case "${host_os}" in<br> ;;<br> x86_64)<br> ARCH_flag="-arch x86_64"<br>+ X86ASMFLAGS="-f macho64"<br>+ X86ASMDEFS="${X86ASMDEFS} -DPREFIX"<br> ;;<br> arm*)<br> ac_cv_c_bigendian="no"<br>@@ -259,10 +276,13 @@ case "${host_os}" in<br> WINDOWS_ARCH="x64"<br> PROGRAMFILES="PROGRAMFILES64"<br> LDFLAGS="${LDFLAGS} -Wl,--high-entropy-va -Wl,--image-base,0x140000000"<br>+ X86ASMFLAGS="-f win64"<br> ;;<br> *)<br> WINDOWS_ARCH="x86"<br> PROGRAMFILES="PROGRAMFILES"<br>+ X86ASMFLAGS="-f win32"<br>+ X86ASMDEFS="${X86ASMDEFS} -DPREFIX"<br> ;;<br> esac<br> AC_SUBST([WINDOWS_ARCH])<br>@@ -332,6 +352,14 @@ AM_CONDITIONAL([HAVE_WIN64], [test "${HAVE_WIN64}" = "1"]) dnl Only used for t<br> AM_CONDITIONAL([HAVE_WINSTORE], [test "$vlc_winstore_app" = "1"])<br> AM_CONDITIONAL([HAVE_WIN32_DESKTOP], [test "${SYS}" = "mingw32" -a "$vlc_winstore_app" = "0"])<br> <br>+dnl Use nasm/yasm only on x86<br>+AC_CHECK_PROGS(X86ASM, [nasm yasm])<br>+AM_CONDITIONAL([HAVE_X86ASM], [test -n "${X86ASM}" && test -n "{X86ASMFLAGS}"])<br>+AM_COND_IF([HAVE_X86ASM], [<br>+ AC_DEFINE([HAVE_X86ASM], [1], [Use external asm on x86.]),<br>+ AC_SUBST([X86ASMFLAGS]),<br>+ AC_SUBST([X86ASMDEFS])])<br>+<br> dnl<br> dnl Sadly autoconf does not think about testing foo.exe when ask to test<br> dnl for program foo on win32<br>diff --git a/extras/include/x86/x86inc.asm b/extras/include/x86/x86inc.asm<br>new file mode 100644<br>index 0000000000..b249f2a792<br>--- /dev/null<br>+++ b/extras/include/x86/x86inc.asm<br>@@ -0,0 +1,1742 @@<br>+;*****************************************************************************<br>+;* x86inc.asm: x264asm abstraction layer<br>+;*****************************************************************************<br>+;* Copyright (C) 2005-2018 x264 project<br>+;*<br>+;* Authors: Loren Merritt <lorenm@u.washington.edu><br>+;* Henrik Gramner <henrik@gramner.com><br>+;* Anton Mitrofanov <BugMaster@narod.ru><br>+;* Fiona Glaser <fiona@x264.com><br>+;*<br>+;* Permission to use, copy, modify, and/or distribute this software for any<br>+;* purpose with or without fee is hereby granted, provided that the above<br>+;* copyright notice and 
+;* this permission notice appear in all copies.
+;*
+;* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+;* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+;* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+;* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+;* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+;* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+;* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+;*****************************************************************************
+
+; This is a header file for the x264ASM assembly language, which uses
+; NASM/YASM syntax combined with a large number of macros to provide easy
+; abstraction between different calling conventions (x86_32, win64, linux64).
+; It also has various other useful features to simplify writing the kind of
+; DSP functions that are most often used in x264.
+
+; Unlike the rest of x264, this file is available under an ISC license, as it
+; has significant usefulness outside of x264 and we want it to be available
+; to the largest audience possible. Of course, if you modify it for your own
+; purposes to add a new feature, we strongly encourage contributing a patch
+; as this feature might be useful for others as well. Send patches or ideas
+; to x264-devel@videolan.org .
+
+%ifndef private_prefix
+    %define private_prefix dav1d
+%endif
+
+%ifndef public_prefix
+    %define public_prefix private_prefix
+%endif
+
+%ifndef STACK_ALIGNMENT
+    %if ARCH_X86_64
+        %define STACK_ALIGNMENT 16
+    %else
+        %define STACK_ALIGNMENT 4
+    %endif
+%endif
+
+%define WIN64 0
+%define UNIX64 0
+%if ARCH_X86_64
+    %ifidn __OUTPUT_FORMAT__,win32
+        %define WIN64 1
+    %elifidn __OUTPUT_FORMAT__,win64
+        %define WIN64 1
+    %elifidn __OUTPUT_FORMAT__,x64
+        %define WIN64 1
+    %else
+        %define UNIX64 1
+    %endif
+%endif
+
+%define FORMAT_ELF 0
+%ifidn __OUTPUT_FORMAT__,elf
+    %define FORMAT_ELF 1
+%elifidn __OUTPUT_FORMAT__,elf32
+    %define FORMAT_ELF 1
+%elifidn __OUTPUT_FORMAT__,elf64
+    %define FORMAT_ELF 1
+%endif
+
+%ifdef PREFIX
+    %define mangle(x) _ %+ x
+%else
+    %define mangle(x) x
+%endif
+
+%macro SECTION_RODATA 0-1 16
+    %ifidn __OUTPUT_FORMAT__,win32
+        SECTION .rdata align=%1
+    %elif WIN64
+        SECTION .rdata align=%1
+    %else
+        SECTION .rodata align=%1
+    %endif
+%endmacro
+
+%if ARCH_X86_64
+    %define PIC 1 ; always use PIC on x86-64
+    default rel
+%elifidn __OUTPUT_FORMAT__,win32
+    %define PIC 0 ; PIC isn't used on 32-bit Windows
+%elifndef PIC
+    %define PIC 0
+%endif
+
+%ifdef __NASM_VER__
+    %use smartalign
+%endif
+
+; Macros to eliminate most code duplication between x86_32 and x86_64:
+; Currently this works only for leaf functions which load all their arguments
+; into registers at the start, and make no other use of the stack. Luckily that
+; covers most of x264's asm.
+
+; PROLOGUE:
+; %1 = number of arguments. loads them from stack if needed.
+; %2 = number of registers used. pushes callee-saved regs if needed.
+; %3 = number of xmm registers used. pushes callee-saved xmm regs if needed.
+; %4 = (optional) stack size to be allocated.
+;      The stack will be aligned before
+;      allocating the specified stack size. If the required stack alignment is
+;      larger than the known stack alignment the stack will be manually aligned
+;      and an extra register will be allocated to hold the original stack
+;      pointer (to not invalidate r0m etc.). To prevent the use of an extra
+;      register as stack pointer, request a negative stack size.
+; %4+/%5+ = list of names to define to registers
+; PROLOGUE can also be invoked by adding the same options to cglobal
+
+; e.g.
+; cglobal foo, 2,3,7,0x40, dst, src, tmp
+; declares a function (foo) that automatically loads two arguments (dst and
+; src) into registers, uses one additional register (tmp) plus 7 vector
+; registers (m0-m6) and allocates 0x40 bytes of stack space.
+
+; TODO Some functions can use some args directly from the stack. If they're the
+; last args then you can just not declare them, but if they're in the middle
+; we need more flexible macro.
+
+; RET:
+; Pops anything that was pushed by PROLOGUE, and returns.
+
+; REP_RET:
+; Use this instead of RET if it's a branch target.
+
+; registers:
+; rN and rNq are the native-size register holding function argument N
+; rNd, rNw, rNb are dword, word, and byte size
+; rNh is the high 8 bits of the word size
+; rNm is the original location of arg N (a register or on the stack), dword
+; rNmp is native size
+
+%macro DECLARE_REG 2-3
+    %define r%1q %2
+    %define r%1d %2d
+    %define r%1w %2w
+    %define r%1b %2b
+    %define r%1h %2h
+    %define %2q %2
+    %if %0 == 2
+        %define r%1m %2d
+        %define r%1mp %2
+    %elif ARCH_X86_64 ; memory
+        %define r%1m [rstk + stack_offset + %3]
+        %define r%1mp qword r %+ %1 %+ m
+    %else
+        %define r%1m [rstk + stack_offset + %3]
+        %define r%1mp dword r %+ %1 %+ m
+    %endif
+    %define r%1 %2
+%endmacro
+
+%macro DECLARE_REG_SIZE 3
+    %define r%1q r%1
+    %define e%1q r%1
+    %define r%1d e%1
+    %define e%1d e%1
+    %define r%1w %1
+    %define e%1w %1
+    %define r%1h %3
+    %define e%1h %3
+    %define r%1b %2
+    %define e%1b %2
+    %if ARCH_X86_64 == 0
+        %define r%1 e%1
+    %endif
+%endmacro
+
+DECLARE_REG_SIZE ax, al, ah
+DECLARE_REG_SIZE bx, bl, bh
+DECLARE_REG_SIZE cx, cl, ch
+DECLARE_REG_SIZE dx, dl, dh
+DECLARE_REG_SIZE si, sil, null
+DECLARE_REG_SIZE di, dil, null
+DECLARE_REG_SIZE bp, bpl, null
+
+; t# defines for when per-arch register allocation is more complex than just function arguments
+
+%macro DECLARE_REG_TMP 1-*
+    %assign %%i 0
+    %rep %0
+        CAT_XDEFINE t, %%i, r%1
+        %assign %%i %%i+1
+        %rotate 1
+    %endrep
+%endmacro
+
+%macro DECLARE_REG_TMP_SIZE 0-*
+    %rep %0
+        %define t%1q t%1 %+ q
+        %define t%1d t%1 %+ d
+        %define t%1w t%1 %+ w
+        %define t%1h t%1 %+ h
+        %define t%1b t%1 %+ b
+        %rotate 1
+    %endrep
+%endmacro
+
+DECLARE_REG_TMP_SIZE 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
+
+%if ARCH_X86_64
+    %define gprsize 8
+%else
+    %define gprsize 4
+%endif
+
+%macro LEA 2
+%if ARCH_X86_64
+    lea %1, [%2]
+%elif PIC
+    call $+5 ; special-cased to not affect the RSB on most CPU:s
+    pop %1
+    add %1, (%2)-$+1
+%else
+    mov %1, %2
+%endif
+%endmacro
+
+%macro PUSH 1
+    push %1
+    %ifidn rstk, rsp
+        %assign stack_offset stack_offset+gprsize
+    %endif
+%endmacro
+
+%macro POP 1
+    pop %1
+    %ifidn rstk, rsp
+        %assign stack_offset stack_offset-gprsize
+    %endif
+%endmacro
+
+%macro PUSH_IF_USED 1-*
+    %rep %0
+        %if %1 < regs_used
+            PUSH r%1
+        %endif
+        %rotate 1
+    %endrep
+%endmacro
+
+%macro POP_IF_USED 1-*
+    %rep %0
+        %if %1 < regs_used
+            pop r%1
+        %endif
+        %rotate 1
+    %endrep
+%endmacro
+
+%macro LOAD_IF_USED 1-*
+    %rep %0
+        %if %1 < num_args
+            mov r%1, r %+ %1 %+ mp
+        %endif
+        %rotate 1
+    %endrep
+%endmacro
+
+%macro SUB 2
+    sub %1, %2
+    %ifidn %1, rstk
+        %assign stack_offset stack_offset+(%2)
+    %endif
+%endmacro
+
+%macro ADD 2
+    add %1, %2
+    %ifidn %1, rstk
+        %assign stack_offset stack_offset-(%2)
+    %endif
+%endmacro
+
+%macro movifnidn 2
+    %ifnidn %1, %2
+        mov %1, %2
+    %endif
+%endmacro
+
+%if ARCH_X86_64 == 0
+    %define movsxd movifnidn
+%endif
+
+%macro movsxdifnidn 2
+    %ifnidn %1, %2
+        movsxd %1, %2
+    %endif
+%endmacro
+
+%macro ASSERT 1
+    %if (%1) == 0
+        %error assertion ``%1'' failed
+    %endif
+%endmacro
+
+%macro DEFINE_ARGS 0-*
+    %ifdef n_arg_names
+        %assign %%i 0
+        %rep n_arg_names
+            CAT_UNDEF arg_name %+ %%i, q
+            CAT_UNDEF arg_name %+ %%i, d
+            CAT_UNDEF arg_name %+ %%i, w
+            CAT_UNDEF arg_name %+ %%i, h
+            CAT_UNDEF arg_name %+ %%i, b
+            CAT_UNDEF arg_name %+ %%i, m
+            CAT_UNDEF arg_name %+ %%i, mp
+            CAT_UNDEF arg_name, %%i
+            %assign %%i %%i+1
+        %endrep
+    %endif
+
+    %xdefine %%stack_offset stack_offset
+    %undef stack_offset ; so that the current value of stack_offset doesn't get baked in by xdefine
+    %assign %%i 0
+    %rep %0
+        %xdefine %1q r %+ %%i %+ q
+        %xdefine %1d r %+ %%i %+ d
+        %xdefine %1w r %+ %%i %+ w
+        %xdefine %1h r %+ %%i %+ h
+        %xdefine %1b r %+ %%i %+ b
+        %xdefine %1m r %+ %%i %+ m
+        %xdefine %1mp r %+ %%i %+ mp
+        CAT_XDEFINE arg_name, %%i, %1
+        %assign %%i %%i+1
+        %rotate 1
+    %endrep
+    %xdefine stack_offset %%stack_offset
+    %assign n_arg_names %0
+%endmacro
+
+%define required_stack_alignment ((mmsize + 15) & ~15)
+%define vzeroupper_required (mmsize > 16 && (ARCH_X86_64 == 0 || xmm_regs_used > 16 || notcpuflag(avx512)))
+%define high_mm_regs (16*cpuflag(avx512))
+
+%macro ALLOC_STACK 1-2 0 ; stack_size, n_xmm_regs (for win64 only)
+    %ifnum %1
+        %if %1 != 0
+            %assign %%pad 0
+            %assign stack_size %1
+            %if stack_size < 0
+                %assign stack_size -stack_size
+            %endif
+            %if WIN64
+                %assign %%pad %%pad + 32 ; shadow space
+                %if mmsize != 8
+                    %assign xmm_regs_used %2
+                    %if xmm_regs_used > 8
+                        %assign %%pad %%pad + (xmm_regs_used-8)*16 ; callee-saved xmm registers
+                    %endif
+                %endif
+            %endif
+            %if required_stack_alignment <= STACK_ALIGNMENT
+                ; maintain the current stack alignment
+                %assign stack_size_padded stack_size + %%pad + ((-%%pad-stack_offset-gprsize) & (STACK_ALIGNMENT-1))
+                SUB rsp, stack_size_padded
+            %else
+                %assign %%reg_num (regs_used - 1)
+                %xdefine rstk r %+ %%reg_num
+                ; align stack, and save original stack location directly above
+                ; it, i.e. in [rsp+stack_size_padded], so we can restore the
+                ; stack in a single instruction (i.e.
+                ; mov rsp, rstk or mov rsp, [rsp+stack_size_padded])
+                %if %1 < 0 ; need to store rsp on stack
+                    %xdefine rstkm [rsp + stack_size + %%pad]
+                    %assign %%pad %%pad + gprsize
+                %else ; can keep rsp in rstk during whole function
+                    %xdefine rstkm rstk
+                %endif
+                %assign stack_size_padded stack_size + ((%%pad + required_stack_alignment-1) & ~(required_stack_alignment-1))
+                mov rstk, rsp
+                and rsp, ~(required_stack_alignment-1)
+                sub rsp, stack_size_padded
+                movifnidn rstkm, rstk
+            %endif
+            WIN64_PUSH_XMM
+        %endif
+    %endif
+%endmacro
+
+%macro SETUP_STACK_POINTER 1
+    %ifnum %1
+        %if %1 != 0 && required_stack_alignment > STACK_ALIGNMENT
+            %if %1 > 0
+                ; Reserve an additional register for storing the original stack pointer, but avoid using
+                ; eax/rax for this purpose since it can potentially get overwritten as a return value.
+                %assign regs_used (regs_used + 1)
+                %if ARCH_X86_64 && regs_used == 7
+                    %assign regs_used 8
+                %elif ARCH_X86_64 == 0 && regs_used == 1
+                    %assign regs_used 2
+                %endif
+            %endif
+            %if ARCH_X86_64 && regs_used < 5 + UNIX64 * 3
+                ; Ensure that we don't clobber any registers containing arguments. For UNIX64 we also preserve r6 (rax)
+                ; since it's used as a hidden argument in vararg functions to specify the number of vector registers used.
+                %assign regs_used 5 + UNIX64 * 3
+            %endif
+        %endif
+    %endif
+%endmacro
+
+%macro DEFINE_ARGS_INTERNAL 3+
+    %ifnum %2
+        DEFINE_ARGS %3
+    %elif %1 == 4
+        DEFINE_ARGS %2
+    %elif %1 > 4
+        DEFINE_ARGS %2, %3
+    %endif
+%endmacro
+
+%if WIN64 ; Windows x64 ;=================================================
+
+DECLARE_REG 0, rcx
+DECLARE_REG 1, rdx
+DECLARE_REG 2, R8
+DECLARE_REG 3, R9
+DECLARE_REG 4, R10, 40
+DECLARE_REG 5, R11, 48
+DECLARE_REG 6, rax, 56
+DECLARE_REG 7, rdi, 64
+DECLARE_REG 8, rsi, 72
+DECLARE_REG 9, rbx, 80
+DECLARE_REG 10, rbp, 88
+DECLARE_REG 11, R14, 96
+DECLARE_REG 12, R15, 104
+DECLARE_REG 13, R12, 112
+DECLARE_REG 14, R13, 120
+
+%macro PROLOGUE 2-5+ 0 ; #args, #regs, #xmm_regs, [stack_size,] arg_names...
+    %assign num_args %1
+    %assign regs_used %2
+    ASSERT regs_used >= num_args
+    SETUP_STACK_POINTER %4
+    ASSERT regs_used <= 15
+    PUSH_IF_USED 7, 8, 9, 10, 11, 12, 13, 14
+    ALLOC_STACK %4, %3
+    %if mmsize != 8 && stack_size == 0
+        WIN64_SPILL_XMM %3
+    %endif
+    LOAD_IF_USED 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
+    DEFINE_ARGS_INTERNAL %0, %4, %5
+%endmacro
+
+%macro WIN64_PUSH_XMM 0
+    ; Use the shadow space to store XMM6 and XMM7, the rest needs stack space allocated.
+    %if xmm_regs_used > 6 + high_mm_regs
+        movaps [rstk + stack_offset + 8], xmm6
+    %endif
+    %if xmm_regs_used > 7 + high_mm_regs
+        movaps [rstk + stack_offset + 24], xmm7
+    %endif
+    %assign %%xmm_regs_on_stack xmm_regs_used - high_mm_regs - 8
+    %if %%xmm_regs_on_stack > 0
+        %assign %%i 8
+        %rep %%xmm_regs_on_stack
+            movaps [rsp + (%%i-8)*16 + stack_size + 32], xmm %+ %%i
+            %assign %%i %%i+1
+        %endrep
+    %endif
+%endmacro
+
+%macro WIN64_SPILL_XMM 1
+    %assign xmm_regs_used %1
+    ASSERT xmm_regs_used <= 16 + high_mm_regs
+    %assign %%xmm_regs_on_stack xmm_regs_used - high_mm_regs - 8
+    %if %%xmm_regs_on_stack > 0
+        ; Allocate stack space for callee-saved xmm registers plus shadow space and align the stack.
+        %assign %%pad %%xmm_regs_on_stack*16 + 32
+        %assign stack_size_padded %%pad + ((-%%pad-stack_offset-gprsize) & (STACK_ALIGNMENT-1))
+        SUB rsp, stack_size_padded
+    %endif
+    WIN64_PUSH_XMM
+%endmacro
+
+%macro WIN64_RESTORE_XMM_INTERNAL 0
+    %assign %%pad_size 0
+    %assign %%xmm_regs_on_stack xmm_regs_used - high_mm_regs - 8
+    %if %%xmm_regs_on_stack > 0
+        %assign %%i xmm_regs_used - high_mm_regs
+        %rep %%xmm_regs_on_stack
+            %assign %%i %%i-1
+            movaps xmm %+ %%i, [rsp + (%%i-8)*16 + stack_size + 32]
+        %endrep
+    %endif
+    %if stack_size_padded > 0
+        %if stack_size > 0 && required_stack_alignment > STACK_ALIGNMENT
+            mov rsp, rstkm
+        %else
+            add rsp, stack_size_padded
+            %assign %%pad_size stack_size_padded
+        %endif
+    %endif
+    %if xmm_regs_used > 7 + high_mm_regs
+        movaps xmm7, [rsp + stack_offset - %%pad_size + 24]
+    %endif
+    %if xmm_regs_used > 6 + high_mm_regs
+        movaps xmm6, [rsp + stack_offset - %%pad_size + 8]
+    %endif
+%endmacro
+
+%macro WIN64_RESTORE_XMM 0
+    WIN64_RESTORE_XMM_INTERNAL
+    %assign stack_offset (stack_offset-stack_size_padded)
+    %assign stack_size_padded 0
+    %assign xmm_regs_used 0
+%endmacro
+
+%define has_epilogue regs_used > 7 || stack_size > 0 || vzeroupper_required || xmm_regs_used > 6+high_mm_regs
+
+%macro RET 0
+    WIN64_RESTORE_XMM_INTERNAL
+    POP_IF_USED 14, 13, 12, 11, 10, 9, 8, 7
+    %if vzeroupper_required
+        vzeroupper
+    %endif
+    AUTO_REP_RET
+%endmacro
+
+%elif ARCH_X86_64 ; *nix x64 ;=============================================
+
+DECLARE_REG 0, rdi
+DECLARE_REG 1, rsi
+DECLARE_REG 2, rdx
+DECLARE_REG 3, rcx
+DECLARE_REG 4, R8
+DECLARE_REG 5, R9
+DECLARE_REG 6, rax, 8
+DECLARE_REG 7, R10, 16
+DECLARE_REG 8, R11, 24
+DECLARE_REG 9, rbx, 32
+DECLARE_REG 10, rbp, 40
+DECLARE_REG 11, R14, 48
+DECLARE_REG 12, R15, 56
+DECLARE_REG 13, R12, 64
+DECLARE_REG 14, R13, 72
+
+%macro PROLOGUE 2-5+ 0 ; #args, #regs, #xmm_regs, [stack_size,] arg_names...
+    %assign num_args %1
+    %assign regs_used %2
+    %assign xmm_regs_used %3
+    ASSERT regs_used >= num_args
+    SETUP_STACK_POINTER %4
+    ASSERT regs_used <= 15
+    PUSH_IF_USED 9, 10, 11, 12, 13, 14
+    ALLOC_STACK %4
+    LOAD_IF_USED 6, 7, 8, 9, 10, 11, 12, 13, 14
+    DEFINE_ARGS_INTERNAL %0, %4, %5
+%endmacro
+
+%define has_epilogue regs_used > 9 || stack_size > 0 || vzeroupper_required
+
+%macro RET 0
+    %if stack_size_padded > 0
+        %if required_stack_alignment > STACK_ALIGNMENT
+            mov rsp, rstkm
+        %else
+            add rsp, stack_size_padded
+        %endif
+    %endif
+    POP_IF_USED 14, 13, 12, 11, 10, 9
+    %if vzeroupper_required
+        vzeroupper
+    %endif
+    AUTO_REP_RET
+%endmacro
+
+%else ; X86_32 ;==============================================================
+
+DECLARE_REG 0, eax, 4
+DECLARE_REG 1, ecx, 8
+DECLARE_REG 2, edx, 12
+DECLARE_REG 3, ebx, 16
+DECLARE_REG 4, esi, 20
+DECLARE_REG 5, edi, 24
+DECLARE_REG 6, ebp, 28
+%define rsp esp
+
+%macro DECLARE_ARG 1-*
+    %rep %0
+        %define r%1m [rstk + stack_offset + 4*%1 + 4]
+        %define r%1mp dword r%1m
+        %rotate 1
+    %endrep
+%endmacro
+
+DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
+
+%macro PROLOGUE 2-5+ ; #args, #regs, #xmm_regs, [stack_size,] arg_names...
+    %assign num_args %1
+    %assign regs_used %2
+    ASSERT regs_used >= num_args
+    %if num_args > 7
+        %assign num_args 7
+    %endif
+    %if regs_used > 7
+        %assign regs_used 7
+    %endif
+    SETUP_STACK_POINTER %4
+    ASSERT regs_used <= 7
+    PUSH_IF_USED 3, 4, 5, 6
+    ALLOC_STACK %4
+    LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6
+    DEFINE_ARGS_INTERNAL %0, %4, %5
+%endmacro
+
+%define has_epilogue regs_used > 3 || stack_size > 0 || vzeroupper_required
+
+%macro RET 0
+    %if stack_size_padded > 0
+        %if required_stack_alignment > STACK_ALIGNMENT
+            mov rsp, rstkm
+        %else
+            add rsp, stack_size_padded
+        %endif
+    %endif
+    POP_IF_USED 6, 5, 4, 3
+    %if vzeroupper_required
+        vzeroupper
+    %endif
+    AUTO_REP_RET
+%endmacro
+
+%endif ;======================================================================
+
+%if WIN64 == 0
+    %macro WIN64_SPILL_XMM 1
+    %endmacro
+    %macro WIN64_RESTORE_XMM 0
+    %endmacro
+    %macro WIN64_PUSH_XMM 0
+    %endmacro
+%endif
+
+; On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
+; a branch or a branch target. So switch to a 2-byte form of ret in that case.
+; We can automatically detect "follows a branch", but not a branch target.
+; (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)
+%macro REP_RET 0
+    %if has_epilogue || cpuflag(ssse3)
+        RET
+    %else
+        rep ret
+    %endif
+    annotate_function_size
+%endmacro
+
+%define last_branch_adr $$
+%macro AUTO_REP_RET 0
+    %if notcpuflag(ssse3)
+        times ((last_branch_adr-$)>>31)+1 rep ; times 1 iff $ == last_branch_adr.
+    %endif
+    ret
+    annotate_function_size
+%endmacro
+
+%macro BRANCH_INSTR 0-*
+    %rep %0
+        %macro %1 1-2 %1
+            %2 %1
+            %if notcpuflag(ssse3)
+                %%branch_instr equ $
+                %xdefine last_branch_adr %%branch_instr
+            %endif
+        %endmacro
+        %rotate 1
+    %endrep
+%endmacro
+
+BRANCH_INSTR jz, je, jnz, jne, jl, jle, jnl, jnle, jg, jge, jng, jnge, ja, jae, jna, jnae, jb, jbe, jnb, jnbe, jc, jnc, js, jns, jo, jno, jp, jnp
+
+%macro TAIL_CALL 1-2 1 ; callee, is_nonadjacent
+    %if has_epilogue
+        call %1
+        RET
+    %elif %2
+        jmp %1
+    %endif
+    annotate_function_size
+%endmacro
+
+;=============================================================================
+; arch-independent part
+;=============================================================================
+
+%assign function_align 16
+
+; B

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.