[x264-devel] Windows x64 support
BugMaster
BugMaster at narod.ru
Thu Dec 18 07:55:50 CET 2008
On Thu, 18 Dec 2008 00:34:51 +0000 (UTC), Loren Merritt wrote:
> On Thu, 18 Dec 2008, BugMaster wrote:
> what's the _ doing in _WIN64 ?
>> @@ -189,6 +197,9 @@ cglobal x264_add8x8_idct_sse2, 2,2
>> %macro SUB_NxN_DCT 6
>> cglobal %1, 3,3
>> .skip_prologue:
>> +%ifdef _WIN64
>> + sub rsp, 8
>> +%endif
>> call %2
>> add r0, %3
>> add r1, %4-%5-%6*FENC_STRIDE
>> @@ -201,7 +212,13 @@ cglobal %1, 3,3
>> add r0, %3
>> add r1, %4-%5-%6*FENC_STRIDE
>> add r2, %4-%5-%6*FDEC_STRIDE
>> +%ifdef _WIN64
>> + call %2
>> + add rsp, 8
>> + RET
>> +%else
>> jmp %2
>> +%endif
>> %endmacro
> can't you keep the jmp here? there's no epilogue.
The called function %2 needs a 16-byte-aligned stack. But I think in this
situation it would be correct to keep the jmp, like this:
+%ifdef _WIN64
+ add rsp, 8
+%endif
jmp %2
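
The alignment reasoning above can be checked with plain arithmetic. The
following is only a toy Python model, not x264 code: it tracks nothing but
the value of rsp, and it assumes the usual x86-64 convention that rsp is
16-byte aligned just before a call, so rsp % 16 == 8 at function entry.
The names trace and rsp_at_callee_entry are made up for this sketch.

```python
# Toy model, not x264 code: only the value of rsp is tracked.
RET_ADDR_SIZE = 8  # a call pushes an 8-byte return address


def rsp_at_callee_entry(rsp, how):
    """Return rsp as seen by the first instruction of the callee."""
    if how == "call":
        return rsp - RET_ADDR_SIZE
    if how == "jmp":
        return rsp  # a jump pushes nothing
    raise ValueError(how)


def trace(entry_rsp, restore_before_jmp):
    """Follow rsp through the Win64 path of SUB_NxN_DCT.

    Returns rsp at entry to %2 for the first call and for the final jmp.
    """
    rsp = entry_rsp - 8                       # sub rsp, 8
    first = rsp_at_callee_entry(rsp, "call")  # call %2
    # After %2 returns, rsp is back at the value it had before the call.
    if restore_before_jmp:
        rsp += 8                              # add rsp, 8
    last = rsp_at_callee_entry(rsp, "jmp")    # jmp %2
    return first, last


# Assumed ABI state: rsp % 16 == 8 on entry (return address just pushed).
entry = 0x1008  # arbitrary example value
first, last = trace(entry, restore_before_jmp=True)
print(first % 16, last % 16, last == entry)
```

With the add before the jmp, %2 is entered with the same alignment in both
cases, and on the tail jump rsp equals the value at entry to the macro
function, so the ret inside %2 returns to the original caller. Calling
trace(entry, restore_before_jmp=False) shows the other case: %2 would be
entered with rsp % 16 == 0 and with the padding slot on top of the stack
instead of the return address.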
>> --- a/common/x86/mc-a.asm
>> +++ b/common/x86/mc-a.asm
>> @@ -49,9 +49,15 @@ SECTION .text
>> %define t5 r5
>> %define t6d r10d
>> %define t7d r11d
>> - %macro AVG_START 0
>> - PROLOGUE 6,7
>> - .height_loop:
>> + %macro AVG_START 0-1
>> + %if %0 > 0
>> + PROLOGUE 6,7,%1
>> + %else
>> + PROLOGUE 6,7
>> + %endif
> %macro AVG_START 0-1 0
> PROLOGUE 6,7,%1
OK. Earlier I had different code (for testing purposes) in PROLOGUE, and this didn't work.
>> @@ -719,8 +758,8 @@ cglobal x264_prefetch_ref_mmxext, 3,3
>> ; int dx, int dy,
>> ; int width, int height )
>> ;-----------------------------------------------------------------------------
>> -%macro MC_CHROMA 1
>> -cglobal x264_mc_chroma_%1, 0,6
>> +%macro MC_CHROMA 2
>> +cglobal x264_mc_chroma_%1, 0,6,%2
>> %if mmsize == 16
>> cmp dword r6m, 4
>> jle x264_mc_chroma_mmxext %+ .skip_prologue
>> @@ -877,12 +916,12 @@ cglobal x264_mc_chroma_%1, 0,6
>> %endmacro ; MC_CHROMA
>>
>> INIT_MMX
>> -MC_CHROMA mmxext
>> +MC_CHROMA mmxext, 8
> 0
No, because the SSE2 version jumps to it, so they must have the same
epilogue.
>> +x264_dequant_%2x%2_%1.skip_prologue:
>> [...]
>> + jl x264_dequant_%2x%2_%1.skip_prologue
> .skip_prologue:
> [...]
> jl x264_dequant_%2x%2_%1 %+ .skip_prologue
It wouldn't compile with PREFIX defined, because yasm for some reason
does not replace x264_dequant_%2x%2_%1 with _x264_dequant_%2x%2_%1.
>> @@ -539,9 +539,16 @@ INTRA_SAD16 ssse3
>>
>> %macro SAD_X3_END 0
>> %ifdef ARCH_X86_64
>> +%ifdef _WIN64
>> + mov r0, r5m
>> + movd [r0+0], mm0
>> + movd [r0+4], mm1
>> + movd [r0+8], mm2
>> +%else
>> movd [r5+0], mm0
>> movd [r5+4], mm1
>> movd [r5+8], mm2
>> +%endif
>> %else
>> mov r0, r5m
>> movd [r0+0], mm0
> %ifdef WIN64
> %elifdef ARCH_X86_64
> %else
OK, if it would work the same.
>> +%ifdef _WIN64
>> + movsxd r5, r5d
>> + sub rsp, 24
>> + mov [rsp], r2
>> + mov [rsp+8], r3
>> + mov [rsp+16], r4
>> +%else
>> push r4
>> push r3
>> push r2
>> +%endif
> what's the difference?
Probably none (if we keep movsxd). I simply copy/pasted it from the sad_x3
version.
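
To illustrate why the two sequences leave the same stack layout, here is a
small Python sketch. It is a toy model, not x264 code: memory is a dict
from address to value, the function names are invented for this example,
and flags are ignored (sub changes them, push and mov do not).

```python
# Toy stack model, not x264 code: memory is a dict from address to value.
def with_pushes(rsp, r2, r3, r4):
    """Model: push r4 / push r3 / push r2."""
    mem = {}
    for value in (r4, r3, r2):
        rsp -= 8            # each push first lowers rsp by 8 bytes...
        mem[rsp] = value    # ...then stores the register at [rsp]
    return rsp, mem


def with_sub_and_movs(rsp, r2, r3, r4):
    """Model: sub rsp, 24 followed by three stores."""
    mem = {}
    rsp -= 24               # sub rsp, 24
    mem[rsp] = r2           # mov [rsp], r2
    mem[rsp + 8] = r3       # mov [rsp+8], r3
    mem[rsp + 16] = r4      # mov [rsp+16], r4
    return rsp, mem


a = with_pushes(0x1000, "r2", "r3", "r4")
b = with_sub_and_movs(0x1000, "r2", "r3", "r4")
print(a == b)
```

Both variants end with rsp lowered by 24 and with r2, r3, r4 stored at
[rsp], [rsp+8], [rsp+16], in that order.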
>> +++ b/common/x86/x86inc.asm
> consistent 4 space indent, please
>> + %if %0 > 2
>> + %assign xmm_regs_used %3
>> + %else
>> + %assign xmm_regs_used 0
>> + %endif
> yasm has a syntax for default args.
Now I know this ;)
>> @@ -253,6 +253,8 @@ case $host_cpu in
>> ASFLAGS="-f macho64 -m amd64 -DPIC -DPREFIX"
>> CFLAGS="$CFLAGS -arch x86_64"
>> LDFLAGS="$LDFLAGS -arch x86_64"
>> + elif [ "$SYS" = MINGW ]; then
>> + ASFLAGS="-f win64 -m amd64 -DPREFIX -D_WIN64"
>> else
>> ASFLAGS="-f elf -m amd64"
>> fi
> I prefer ffmpeg's version, which is
> ASFLAGS="-f win32 -m amd64 -DPREFIX"
> [...]
> %ifdef ARCH_X86_64
> %ifidn __OUTPUT_FORMAT__,win32
> %define WIN64
Hm. Interesting solution. But would it be correct to use "-f win32"? Are
win32 and win64 identical object formats?
> I would fix these myself, except that I'm not about to apply a patch
> without someone testing the final version.
> --Loren Merritt
> _______________________________________________
> x264-devel mailing list
> x264-devel at videolan.org
> http://mailman.videolan.org/listinfo/x264-devel