[x264-devel] Windows x64 support

BugMaster BugMaster at narod.ru
Thu Dec 18 07:55:50 CET 2008


On Thu, 18 Dec 2008 00:34:51 +0000 (UTC), Loren Merritt wrote:
> On Thu, 18 Dec 2008, BugMaster wrote:

> what's the _ doing in _WIN64 ?

>> @@ -189,6 +197,9 @@ cglobal x264_add8x8_idct_sse2, 2,2
>>  %macro SUB_NxN_DCT 6
>>  cglobal %1, 3,3
>>  .skip_prologue:
>> +%ifdef _WIN64
>> +    sub  rsp, 8
>> +%endif
>>      call %2
>>      add  r0, %3
>>      add  r1, %4-%5-%6*FENC_STRIDE
>> @@ -201,7 +212,13 @@ cglobal %1, 3,3
>>      add  r0, %3
>>      add  r1, %4-%5-%6*FENC_STRIDE
>>      add  r2, %4-%5-%6*FDEC_STRIDE
>> +%ifdef _WIN64
>> +    call %2
>> +    add  rsp, 8
>> +    RET
>> +%else
>>      jmp  %2
>> +%endif
>>  %endmacro

> can't you keep the jmp here? there's no epilogue.

The called function %2 needs a 16-byte aligned stack. But I think in this
situation it would be correct to keep:

+%ifdef _WIN64
+    add  rsp, 8
+%endif
     jmp  %2
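
The alignment argument can be checked with a toy model. This is only a sketch, not x264 code. It assumes the usual Win64 entry state (rsp % 16 == 8 right after the caller's call), assumes the macro makes three calls before the final jmp (the count does not change the result), and uses a made-up stack address. It tracks what rsp % 16 the function %2 sees on each invocation, with and without the sub/add pair:

```python
# Toy model of rsp alignment for the SUB_NxN_DCT macro on Win64.
# Only rsp modulo 16 is tracked; nothing here is x264 code.


def call(rsp):
    """`call` pushes an 8-byte return address before the callee starts."""
    return rsp - 8


def trace(rsp_at_entry, pad, n_calls=3):
    """Return rsp % 16 as seen by %2 at each invocation.

    pad=True models `sub rsp, 8` ... `add rsp, 8` / `jmp %2`;
    pad=False models the code without any adjustment.
    n_calls is an assumption about how many `call %2` precede the jmp.
    """
    rsp = rsp_at_entry
    if pad:
        rsp -= 8                      # sub rsp, 8
    seen = []
    for _ in range(n_calls):          # each `call %2`
        seen.append(call(rsp) % 16)   # callee's entry rsp; ret restores rsp
    if pad:
        rsp += 8                      # add rsp, 8
    seen.append(rsp % 16)             # jmp %2: no return address is pushed
    return seen


entry = 0x1000 - 8   # hypothetical address with rsp % 16 == 8
print(trace(entry, pad=True))    # [8, 8, 8, 8]
print(trace(entry, pad=False))   # [0, 0, 0, 8]
```

With the adjustment, every invocation of %2 (including the tail jmp) starts in the same state as a normal call; without it, the three calls enter %2 with the stack off by 8.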

>> --- a/common/x86/mc-a.asm
>> +++ b/common/x86/mc-a.asm
>> @@ -49,9 +49,15 @@ SECTION .text
>>      %define t5 r5
>>      %define t6d r10d
>>      %define t7d r11d
>> -    %macro AVG_START 0
>> -        PROLOGUE 6,7
>> -        .height_loop:
>> +    %macro AVG_START 0-1
>> +        %if %0 > 0
>> +            PROLOGUE 6,7,%1
>> +        %else
>> +            PROLOGUE 6,7
>> +        %endif

> %macro AVG_START 0-1 0
>      PROLOGUE 6,7,%1

OK. Earlier I had different code (for testing purposes) in PROLOGUE, and this didn't work.

>> @@ -719,8 +758,8 @@ cglobal x264_prefetch_ref_mmxext, 3,3
>>  ;                             int dx, int dy,
>>  ;                             int width, int height )
>>  ;-----------------------------------------------------------------------------
>> -%macro MC_CHROMA 1
>> -cglobal x264_mc_chroma_%1, 0,6
>> +%macro MC_CHROMA 2
>> +cglobal x264_mc_chroma_%1, 0,6,%2
>>  %if mmsize == 16
>>      cmp dword r6m, 4
>>      jle x264_mc_chroma_mmxext %+ .skip_prologue
>> @@ -877,12 +916,12 @@ cglobal x264_mc_chroma_%1, 0,6
>>  %endmacro ; MC_CHROMA
>>
>>  INIT_MMX
>> -MC_CHROMA mmxext
>> +MC_CHROMA mmxext, 8

> 0

No, because the SSE2 version jumps into it, so they must have the same
epilogue.

>> +x264_dequant_%2x%2_%1.skip_prologue:
>> [...]
>> +    jl x264_dequant_%2x%2_%1.skip_prologue

> .skip_prologue:
> [...]
>      jl x264_dequant_%2x%2_%1 %+ .skip_prologue

It wouldn't compile with PREFIX defined, because yasm for some reason
will not replace x264_dequant_%2x%2_%1 with _x264_dequant_%2x%2_%1.

>> @@ -539,9 +539,16 @@ INTRA_SAD16 ssse3
>>
>>  %macro SAD_X3_END 0
>>  %ifdef ARCH_X86_64
>> +%ifdef _WIN64
>> +    mov     r0, r5m
>> +    movd    [r0+0], mm0
>> +    movd    [r0+4], mm1
>> +    movd    [r0+8], mm2
>> +%else
>>      movd    [r5+0], mm0
>>      movd    [r5+4], mm1
>>      movd    [r5+8], mm2
>> +%endif
>>  %else
>>      mov     r0, r5m
>>      movd    [r0+0], mm0

> %ifdef WIN64
> %elifdef ARCH_X86_64
> %else

OK, if it works the same.

>> +%ifdef _WIN64
>> +    movsxd r5, r5d
>> +    sub  rsp, 24
>> +    mov  [rsp],    r2
>> +    mov  [rsp+8],  r3
>> +    mov  [rsp+16], r4
>> +%else
>>      push r4
>>      push r3
>>      push r2
>> +%endif

> what's the difference?

Probably none (if we keep movsxd). I simply copy/pasted it from the sad_x3
version.
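
As a sanity check on the two store sequences, here is a small model (a sketch only, with made-up register values and a made-up stack address, and ignoring flags) that compares the memory layout left by the three pushes with the one left by sub rsp, 24 plus three mov stores:

```python
# Toy model: compare `push r4; push r3; push r2` with
# `sub rsp, 24` followed by three `mov` stores. Not x264 code.


def with_push(rsp, r2, r3, r4):
    mem = {}
    for value in (r4, r3, r2):   # push order from the patch
        rsp -= 8                 # push decrements rsp first ...
        mem[rsp] = value         # ... then stores at the new top of stack
    return rsp, mem


def with_mov(rsp, r2, r3, r4):
    rsp -= 24                    # sub rsp, 24
    mem = {rsp: r2,              # mov [rsp],    r2
           rsp + 8: r3,          # mov [rsp+8],  r3
           rsp + 16: r4}         # mov [rsp+16], r4
    return rsp, mem


start = 0x2000                   # hypothetical stack pointer
args = dict(r2=0x22, r3=0x33, r4=0x44)
print(with_push(start, **args) == with_mov(start, **args))  # True
```

Both variants end with the same rsp and the same three slots, so in this model the only functional difference in the hunk is the movsxd.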

>> +++ b/common/x86/x86inc.asm

> consistent 4 space indent, please

>> +    %if %0 > 2
>> +      %assign xmm_regs_used %3
>> +    %else
>> +      %assign xmm_regs_used 0
>> +    %endif

> yasm has a syntax for default args.

Now I know this ;)

>> @@ -253,6 +253,8 @@ case $host_cpu in
>>        ASFLAGS="-f macho64 -m amd64 -DPIC -DPREFIX"
>>        CFLAGS="$CFLAGS -arch x86_64"
>>        LDFLAGS="$LDFLAGS -arch x86_64"
>> +    elif [ "$SYS" = MINGW ]; then
>> +      ASFLAGS="-f win64 -m amd64 -DPREFIX -D_WIN64"
>>      else
>>        ASFLAGS="-f elf -m amd64"
>>      fi

> I prefer ffmpeg's version, which is
> ASFLAGS="-f win32 -m amd64 -DPREFIX"
> [...]
> %ifdef ARCH_X86_64
> %ifidn __OUTPUT_FORMAT__,win32
> %define WIN64

Hm. Interesting solution. But would it be correct to use "-f win32"? Are
win32 and win64 identical object formats?

> I would fix these myself, except that I'm not about to apply a patch 
> without someone testing the final version.

> --Loren Merritt
> _______________________________________________
> x264-devel mailing list
> x264-devel at videolan.org
> http://mailman.videolan.org/listinfo/x264-devel



