[x264-devel] Re: [Patch] x264 on ppc without altivec

Guillaume POIRIER poirierg at gmail.com
Wed Mar 14 22:12:00 CET 2007


Hi,

On 3/14/07, Sam Hocevar <sam at zoy.org> wrote:
> On Mon, Mar 05, 2007, Alexis Ballier wrote:
>
> > But then, I don't understand why the code works in a simple program and
> > not in x264.
> >
> > by the way :
> > $gcc -maltivec -mabi=altivec toto.c
> > $./a.out
> > 0
> > $
>
>    It's because your toto.c does not cause gcc to generate AltiVec code,
> but it may happen (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=408918).
>
>    I added proper runtime AltiVec detection, as well as a fix for the
> makefile so that only ppc/*.c files get compiled with the AltiVec flags,
> can you give the current trunk a try?

I tried on a POWER5 with the very latest SVN (r630).
The ./configure switches are: --enable-debug --disable-pthread

gdb ./tools/checkasm
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) r
Starting program: /home/contest/gpoirier/x264/tools/checkasm
x264: using random seed 1712781642
x264: ALTIVEC against C

Program received signal SIGILL, Illegal instruction.
pixel_sad_16x16_altivec (
    pix1=0x10036020
"\202p?~\222??\234?\034?\216\016???J\003?\b?:@<?=.\210#?\001\"|D??R\001?\233\2253????\fF???\t_?H?1?\221\235O\207\021?Y\2272Z??p?$w\022A[?\227nX
?\033\222D\023?\206?0???Y~m<\031e\035??gn?\202*\217\a\213\233#\f/?\034}?
9??4DO??? F>?=??\\\224?\231\003\016ҧ?\204?T~????\177U?", i_pix1=32,
    pix2=0x10036450
"\210¯\016L?:??5??Ws?@\036?\201`\006\216?\bX??<??]K%ILs?\223=\212?ûǯ??\210\213\226{*\n\031X??֠\214F\\?\001?0?\221\027?M?q?\025a?m??\006\212{Q?\002\0257\033\230.\214?\216g?\236\024X??0RET\236?y#?\205??=rM\227??X??!f;?????\tb?\213^??3?????\223???r\221\036?L\214?\030X\001??\203?[?*?IAd\024\"\v,\227?d???\r?n\202?|?\036\032\026I\vb?/?\033\025)8~?"...,
i_pix2=16) at common/ppc/pixel.c:64
64      PIXEL_SAD_ALTIVEC( pixel_sad_16x16_altivec, 16, 16, s,  3 )
(gdb) bt
#0  pixel_sad_16x16_altivec (
    pix1=0x10036020
"\202p?~\222??\234?\034?\216\016???J\003?\b?:@<?=.\210#?\001\"|D??R\001?\233\2253????\fF???\t_?H?1?\221\235O\207\021?Y\2272Z??p?$w\022A[?\227nX
?\033\222D\023?\206?0???Y~m<\031e\035??gn?\202*\217\a\213\233#\f/?\034}?
9??4DO??? F>?=??\\\224?\231\003\016ҧ?\204?T~????\177U?", i_pix1=32,
    pix2=0x10036450
"\210¯\016L?:??5??Ws?@\036?\201`\006\216?\bX??<??]K%ILs?\223=\212?ûǯ??\210\213\226{*\n\031X??֠\214F\\?\001?0?\221\027?M?q?\025a?m??\006\212{Q?\002\0257\033\230.\214?\216g?\236\024X??0RET\236?y#?\205??=rM\227??X??!f;?????\tb?\213^??3?????\223???r\221\036?L\214?\030X\001??\203?[?*?IAd\024\"\v,\227?d???\r?n\202?|?\036\032\026I\vb?/?\033\025)8~?"...,
i_pix2=16) at common/ppc/pixel.c:64
#1  0x10003da4 in check_all (cpu_ref=0, cpu_new=64) at tools/checkasm.c:71
#2  0x10005e18 in main (argc=<value optimized out>, argv=<value optimized out>) at tools/checkasm.c:773
(gdb) disassemble $pc-32,$pc+32
Dump of assembler code for function pixel_sad_16x16_altivec:
0x1001a104 <pixel_sad_16x16_altivec+0>: stwu    r1,-32(r1)
0x1001a108 <pixel_sad_16x16_altivec+4>: vspltisb v10,0
0x1001a10c <pixel_sad_16x16_altivec+8>: vor     v11,v10,v10
0x1001a110 <pixel_sad_16x16_altivec+12>:        li      r0,16
0x1001a114 <pixel_sad_16x16_altivec+16>:        mtctr   r0
0x1001a118 <pixel_sad_16x16_altivec+20>:        lvx     v13,r0,r3
0x1001a11c <pixel_sad_16x16_altivec+24>:        lvsl    v1,r0,r3
0x1001a120 <pixel_sad_16x16_altivec+28>:        addi    r9,r3,15
0x1001a124 <pixel_sad_16x16_altivec+32>:        lvx     v0,r0,r9
0x1001a128 <pixel_sad_16x16_altivec+36>:        vperm   v13,v13,v0,v1
0x1001a12c <pixel_sad_16x16_altivec+40>:        lvx     v1,r0,r5
0x1001a130 <pixel_sad_16x16_altivec+44>:        lvsl    v12,r0,r5
0x1001a134 <pixel_sad_16x16_altivec+48>:        addi    r9,r5,15
0x1001a138 <pixel_sad_16x16_altivec+52>:        lvx     v0,r0,r9
0x1001a13c <pixel_sad_16x16_altivec+56>:        vperm   v1,v1,v0,v12
0x1001a140 <pixel_sad_16x16_altivec+60>:        vmaxub  v0,v13,v1
0x1001a144 <pixel_sad_16x16_altivec+64>:        vminub  v13,v13,v1
0x1001a148 <pixel_sad_16x16_altivec+68>:        vsububm v0,v0,v13
0x1001a14c <pixel_sad_16x16_altivec+72>:        vsum4ubs v11,v0,v11
0x1001a150 <pixel_sad_16x16_altivec+76>:        add     r3,r3,r4
0x1001a154 <pixel_sad_16x16_altivec+80>:        add     r5,r5,r6
0x1001a158 <pixel_sad_16x16_altivec+84>:        bdnz+   0x1001a118 <pixel_sad_16x16_altivec+20>
0x1001a15c <pixel_sad_16x16_altivec+88>:        vsumsws v0,v11,v10
0x1001a160 <pixel_sad_16x16_altivec+92>:        vspltw  v0,v0,3
0x1001a164 <pixel_sad_16x16_altivec+96>:        addi    r9,r1,16
0x1001a168 <pixel_sad_16x16_altivec+100>:       stvewx  v0,r0,r9
0x1001a16c <pixel_sad_16x16_altivec+104>:       lwz     r3,16(r1)
0x1001a170 <pixel_sad_16x16_altivec+108>:       addi    r1,r1,32
0x1001a174 <pixel_sad_16x16_altivec+112>:       blr
End of assembler dump.
(gdb)


It still looks like AltiVec code is enabled... bummer!
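
For what it's worth, here is a rough sketch of the kind of guard I would
expect, so that a vector routine is only selected (or tested) when the
runtime detection reports the instruction set. This is not the actual x264
code: CPU_FLAG_ALTIVEC, select_sad_16x16, should_check and
sad_16x16_simd_stub are hypothetical names, and the flag value 0x40 is only
assumed from the cpu_new=64 argument in the backtrace above. The "SIMD"
routine is a plain C stand-in so the sketch builds on any machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flag; the value is assumed from cpu_new=64 in the backtrace. */
#define CPU_FLAG_ALTIVEC 0x40u

typedef int (*sad_fn)(const uint8_t *pix1, int stride1,
                      const uint8_t *pix2, int stride2);

/* Plain C reference: sum of absolute differences over a 16x16 block. */
int sad_16x16_c(const uint8_t *pix1, int stride1,
                const uint8_t *pix2, int stride2)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride1;
        pix2 += stride2;
    }
    return sum;
}

/* Stand-in for the vector version (hypothetical); it just calls the C code. */
int sad_16x16_simd_stub(const uint8_t *pix1, int stride1,
                        const uint8_t *pix2, int stride2)
{
    return sad_16x16_c(pix1, stride1, pix2, stride2);
}

/* Hand out the vector routine only if the detected flags include it. */
sad_fn select_sad_16x16(uint32_t detected_flags)
{
    if (detected_flags & CPU_FLAG_ALTIVEC)
        return sad_16x16_simd_stub;
    return sad_16x16_c;
}

/* A test harness can use this to skip instruction sets the CPU lacks. */
int should_check(uint32_t detected_flags, uint32_t wanted_flag)
{
    assert(wanted_flag != 0);
    return (detected_flags & wanted_flag) == wanted_flag;
}
```

With something like this, a harness would call should_check(detected,
CPU_FLAG_ALTIVEC) before running the AltiVec comparison, and skip it when
the flag is missing instead of dying with SIGILL.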

Guillaume

