[x264-devel] Re: [patch] AMD64 support for x264 codec

Loren Merritt lorenm at u.washington.edu
Tue Apr 19 10:37:41 CEST 2005


On Tue, 19 Apr 2005, Josef Zlomek wrote:

> I have adapted the i386 assembler code of x264 codec to amd64 architecture.
> I have changed the calling convention according to AMD64 ABI
> (http://www.x86-64.org/documentation/abi-0.95.pdf) and made several minor
> optimizations because amd64 has more registers.
> I have left the MMX code intact, it might be possible to rewrite it to
> SSE using 16 registers to improve speed of the codec.
> Some code (like CPU features detection) might be useless.
>
> The patch is available at http://zlomek.jikos.cz/x264-amd64.patch
> The assembler code can be compiled by yasm 0.4.0.

Thanks.
I had to add "-DHAVE_MMXEXT -DHAVE_SSE2" to CFLAGS; otherwise the patch
has no effect.
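
Why the defines matter can be sketched as follows, assuming the assembly paths are wrapped in preprocessor guards on these two macros. The function name simd_paths_compiled is hypothetical and is not part of x264; only the macro names come from the flags above.

```c
/* Hypothetical illustration: reports whether the guarded SIMD paths
 * were compiled in. Not x264's actual code. */
int simd_paths_compiled(void)
{
#if defined(HAVE_MMXEXT) && defined(HAVE_SSE2)
    return 1;   /* both macros defined: optimized routines are built */
#else
    return 0;   /* a macro is missing: only the plain C fallback is built */
#endif
}
```

Built with plain `gcc` this returns 0; built with `gcc -DHAVE_MMXEXT -DHAVE_SSE2` it returns 1.
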
It works nicely with gcc -O1, but with -O2 or higher I get a segfault
(gdb log attached). I have no idea what's causing this; the log seems
impossible to me: the arguments gdb reports for get_ref_mmx do not match
the values printed in the caller.
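
The odd argument values in the attached log have a recognizable pattern: -1073757904 (shown for i_src_stride) is the low 32 bits of 0x7fbfffc130, the address gdb prints for &pix, read as a signed int, and -1073757152 (shown for i_height) is the low 32 bits of 0x7fbfffc420, the value of m, read the same way. The sketch below only checks that arithmetic. The helper name low32_signed is hypothetical, and this is an observation about the log, not a diagnosis of the crash.

```c
#include <stdint.h>

/* Hypothetical helper: take the low 32 bits of a 64-bit value and
 * interpret them as a signed 32-bit number, using only well-defined
 * arithmetic (no out-of-range signed conversions). */
int64_t low32_signed(uint64_t v)
{
    int64_t low = (int64_t)(v & 0xffffffffu);
    /* Values with bit 31 set are negative in two's complement. */
    return low >= INT64_C(0x80000000) ? low - INT64_C(0x100000000) : low;
}
```

Calling low32_signed(0x7fbfffc130) and low32_signed(0x7fbfffc420) reproduces the two negative numbers in the log, which suggests gdb is displaying truncated pointers in those int slots.
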

--Loren Merritt
-------------- next part --------------
$ gdb ~/src/x264/x264
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) break me.c:243
Breakpoint 1 at 0x421f1d: file encoder/me.c, line 243.
(gdb) r foreman_176x144.yuv -o b.264 
Starting program: /home/loren/src/x264/x264 foreman_176x144.yuv -o b.264
x264: file name gives 176x144
x264 [info]: using cpu capabilities MMX MMXEXT SSE SSE2 3DNow! 

Breakpoint 1, refine_subpel (h=0x2a95a33010, m=0x7fbfffc420, hpel_iters=1, qpel_iters=2) at encoder/me.c:243
243                 int bdir = 0;
(gdb) s
244                 COST_MV( bmx, bmy - step, 0 );
(gdb) 
get_ref_mmx (src=0x7fbfffc430, i_src_stride=-1073757904, dst=0x7fbfffc420 "", i_dst_stride=0x7fbfffc280, 
    mvx=8, mvy=2, i_width=0, i_height=-1073757152) at common/amd64/mc-c.c:1089
1089    {
(gdb) up
#1  0x0000000000421f91 in refine_subpel (h=0x2a95a33010, m=0x7fbfffc420, hpel_iters=1, qpel_iters=2)
    at encoder/me.c:244
244                 COST_MV( bmx, bmy - step, 0 );
(gdb) p &m->p_fref 
$1 = (uint8_t *(*)[6]) 0x7fbfffc430
(gdb) p m->i_stride[0]
$2 = 240
(gdb) p &pix
$3 = (uint8_t (*)[256]) 0x7fbfffc130
(gdb) p &stride
$4 = (int *) 0x7fbfffc114
(gdb) p bw
$5 = 16
(gdb) p bh
$6 = 16
(gdb) c
Continuing.

Breakpoint 1, refine_subpel (h=0x2a95a33010, m=0x7fbfffc420, hpel_iters=1, qpel_iters=2) at encoder/me.c:243
243                 int bdir = 0;
(gdb) s
244                 COST_MV( bmx, bmy - step, 0 );
(gdb) 
get_ref_mmx (src=0x7fbfffc430, i_src_stride=-1073757904, dst=0x7fbfffc420 "", i_dst_stride=0x7fbfffc280, 
    mvx=6, mvy=3, i_width=0, i_height=-1073757152) at common/amd64/mc-c.c:1089
1089    {
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000438806 in x264_pixel_satd_16x16_mmxext ()
(gdb) bt
#0  0x0000000000438806 in x264_pixel_satd_16x16_mmxext ()
#1  0x0000007fbfffc280 in ?? ()
#2  0x0000007fbfffc430 in ?? ()
#3  0x0000000000421fcf in refine_subpel (h=Cannot access memory at address 0xffffffffffffffe8
) at encoder/me.c:244
Previous frame inner to this frame (corrupt stack?)
(gdb)

