[x264-devel] Re: [Patch] x264 on ppc without altivec
Guillaume POIRIER
poirierg at gmail.com
Wed Mar 14 22:12:00 CET 2007
Hi,
On 3/14/07, Sam Hocevar <sam at zoy.org> wrote:
> On Mon, Mar 05, 2007, Alexis Ballier wrote:
>
> > But then, I don't understand why the code works in a simple program and
> > not in x264.
> >
> > by the way :
> > $gcc -maltivec -mabi=altivec toto.c
> > $./a.out
> > 0
> > $
>
> It's because your toto.c does not cause gcc to generate AltiVec code,
> but it may happen (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=408918).
>
> I added proper runtime AltiVec detection, as well as a fix for the
> makefile so that only ppc/*.c files get compiled with the AltiVec flags,
> can you give the current trunk a try?
I tried on a POWER5 with the very latest SVN (r630):
./configure switches are: --enable-debug --disable-pthread
gdb ./tools/checkasm
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) r
Starting program: /home/contest/gpoirier/x264/tools/checkasm
x264: using random seed 1712781642
x264: ALTIVEC against C
Program received signal SIGILL, Illegal instruction.
pixel_sad_16x16_altivec (
pix1=0x10036020
"\202p?~\222??\234?\034?\216\016???J\003?\b?:@<?=.\210#?\001\"|D??R\001?\233\2253????\fF???\t_?H?1?\221\235O\207\021?Y\2272Z??p?$w\022A[?\227nX
?\033\222D\023?\206?0???Y~m<\031e\035??gn?\202*\217\a\213\233#\f/?\034}?
9??4DO??? F>?=??\\\224?\231\003\016ҧ?\204?T~????\177U?", i_pix1=32,
pix2=0x10036450
"\210¯\016L?:??5??Ws?@\036?\201`\006\216?\bX??<??]K%ILs?\223=\212?ûǯ??\210\213\226{*\n\031X??֠\214F\\?\001?0?\221\027?M?q?\025a?m??\006\212{Q?\002\0257\033\230.\214?\216g?\236\024X??0RET\236?y#?\205??=rM\227??X??!f;?????\tb?\213^??3?????\223???r\221\036?L\214?\030X\001??\203?[?*?IAd\024\"\v,\227?d???\r?n\202?|?\036\032\026I\vb?/?\033\025)8~?"...,
i_pix2=16) at common/ppc/pixel.c:64
64 PIXEL_SAD_ALTIVEC( pixel_sad_16x16_altivec, 16, 16, s, 3 )
(gdb) bt
#0 pixel_sad_16x16_altivec (
pix1=0x10036020
"\202p?~\222??\234?\034?\216\016???J\003?\b?:@<?=.\210#?\001\"|D??R\001?\233\2253????\fF???\t_?H?1?\221\235O\207\021?Y\2272Z??p?$w\022A[?\227nX
?\033\222D\023?\206?0???Y~m<\031e\035??gn?\202*\217\a\213\233#\f/?\034}?
9??4DO??? F>?=??\\\224?\231\003\016ҧ?\204?T~????\177U?", i_pix1=32,
pix2=0x10036450
"\210¯\016L?:??5??Ws?@\036?\201`\006\216?\bX??<??]K%ILs?\223=\212?ûǯ??\210\213\226{*\n\031X??֠\214F\\?\001?0?\221\027?M?q?\025a?m??\006\212{Q?\002\0257\033\230.\214?\216g?\236\024X??0RET\236?y#?\205??=rM\227??X??!f;?????\tb?\213^??3?????\223???r\221\036?L\214?\030X\001??\203?[?*?IAd\024\"\v,\227?d???\r?n\202?|?\036\032\026I\vb?/?\033\025)8~?"...,
i_pix2=16) at common/ppc/pixel.c:64
#1 0x10003da4 in check_all (cpu_ref=0, cpu_new=64) at tools/checkasm.c:71
#2 0x10005e18 in main (argc=<value optimized out>, argv=<value optimized out>) at tools/checkasm.c:773
(gdb) disassemble $pc-32,$pc+32
Dump of assembler code for function pixel_sad_16x16_altivec:
0x1001a104 <pixel_sad_16x16_altivec+0>: stwu r1,-32(r1)
0x1001a108 <pixel_sad_16x16_altivec+4>: vspltisb v10,0
0x1001a10c <pixel_sad_16x16_altivec+8>: vor v11,v10,v10
0x1001a110 <pixel_sad_16x16_altivec+12>: li r0,16
0x1001a114 <pixel_sad_16x16_altivec+16>: mtctr r0
0x1001a118 <pixel_sad_16x16_altivec+20>: lvx v13,r0,r3
0x1001a11c <pixel_sad_16x16_altivec+24>: lvsl v1,r0,r3
0x1001a120 <pixel_sad_16x16_altivec+28>: addi r9,r3,15
0x1001a124 <pixel_sad_16x16_altivec+32>: lvx v0,r0,r9
0x1001a128 <pixel_sad_16x16_altivec+36>: vperm v13,v13,v0,v1
0x1001a12c <pixel_sad_16x16_altivec+40>: lvx v1,r0,r5
0x1001a130 <pixel_sad_16x16_altivec+44>: lvsl v12,r0,r5
0x1001a134 <pixel_sad_16x16_altivec+48>: addi r9,r5,15
0x1001a138 <pixel_sad_16x16_altivec+52>: lvx v0,r0,r9
0x1001a13c <pixel_sad_16x16_altivec+56>: vperm v1,v1,v0,v12
0x1001a140 <pixel_sad_16x16_altivec+60>: vmaxub v0,v13,v1
0x1001a144 <pixel_sad_16x16_altivec+64>: vminub v13,v13,v1
0x1001a148 <pixel_sad_16x16_altivec+68>: vsububm v0,v0,v13
0x1001a14c <pixel_sad_16x16_altivec+72>: vsum4ubs v11,v0,v11
0x1001a150 <pixel_sad_16x16_altivec+76>: add r3,r3,r4
0x1001a154 <pixel_sad_16x16_altivec+80>: add r5,r5,r6
0x1001a158 <pixel_sad_16x16_altivec+84>: bdnz+ 0x1001a118 <pixel_sad_16x16_altivec+20>
0x1001a15c <pixel_sad_16x16_altivec+88>: vsumsws v0,v11,v10
0x1001a160 <pixel_sad_16x16_altivec+92>: vspltw v0,v0,3
0x1001a164 <pixel_sad_16x16_altivec+96>: addi r9,r1,16
0x1001a168 <pixel_sad_16x16_altivec+100>: stvewx v0,r0,r9
0x1001a16c <pixel_sad_16x16_altivec+104>: lwz r3,16(r1)
0x1001a170 <pixel_sad_16x16_altivec+108>: addi r1,r1,32
0x1001a174 <pixel_sad_16x16_altivec+112>: blr
End of assembler dump.
(gdb)
It still looks like AltiVec code is enabled... bummer!
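For reference, the backtrace shows check_all(cpu_ref=0, cpu_new=64) reaching pixel_sad_16x16_altivec, so the AltiVec routines are apparently being exercised whatever the runtime detection reports. Below is a minimal sketch of one common way to gate such a call: trap SIGILL around a probe instruction. It is not the code that went into trunk. All names here (probe_survives, probe_altivec, has_altivec and the two demo probes) are hypothetical, and the PowerPC inline asm is an assumption that may additionally require the assembler to accept AltiVec mnemonics.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction and sigsetjmp under strict -std modes */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;     /* jump target used by the SIGILL handler */

static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);    /* abandon the probe, resume in probe_survives() */
}

/* Returns 1 if probe() ran to completion, 0 if it raised SIGILL. */
int probe_survives(void (*probe)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    assert(probe != NULL);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;                /* cannot probe safely, so report "absent" */

    /* savemask=1 so that SIGILL is unblocked again after the long jump */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        ok = 1;
    }
    sigaction(SIGILL, &old, NULL);   /* restore the previous handler */
    return ok;
}

/* Demo probes: one harmless, one that simulates an unsupported instruction. */
void probe_noop(void) { }
void probe_illegal(void) { raise(SIGILL); }

/* Hypothetical AltiVec probe: executes one vector instruction on PowerPC. */
void probe_altivec(void)
{
#if defined(__powerpc__) || defined(__ppc__) || defined(__PPC__)
    __asm__ volatile ("vor 0,0,0");  /* assumption: assembler accepts AltiVec here */
#endif
}

/* Returns 1 only on a PowerPC CPU that executes the probe without SIGILL. */
int has_altivec(void)
{
#if defined(__powerpc__) || defined(__ppc__) || defined(__PPC__)
    return probe_survives(probe_altivec);
#else
    return 0;
#endif
}
```

A caller such as check_all() in tools/checkasm.c could then skip the AltiVec pass whenever has_altivec() returns 0, instead of jumping straight into the vector code.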
Guillaume