[x264-devel] Re: [PATCH] IDCT8 and SA8D routines in Altivec

Guillaume POIRIER guillaume-bzh.poirier at laposte.net
Mon Nov 27 20:59:25 CET 2006


Hi,

Manuel Rommel wrote:
> Hi,
> 
> Guillaume, in your second post this was added to your patch:
> 
>> -      CFLAGS="$CFLAGS -faltivec -fastf -mcpu=G4"
>> +      CFLAGS="$CFLAGS -faltivec -fastf -mcpu=G5"
> 
> 
> It might be dangerous to specify -mcpu=G5 on every PPC, since gcc then 
> produces instructions which are only valid on a G5, so x264 is likely 
> to crash on a G4 or G3.

Good catch. It is a local modification that wasn't meant to be merged.


> And I have a question about the use of VEC_ABS in your SA8D code, for 
> example:
> 
>> +    vec_s16_t abs6v = VEC_ABS(sa8d6v);
> 
> As far as I understand, this gets transformed to
> 
> vec_s16_t abs6v = sa8d6v = vec_max( sa8d6v, vec_sub( zero_s16v, sa8d6v ) );
> 
> This shouldn't alter the output but it could be possible to get rid of 
> some instructions, if I understand it right.

The result is identical, it's just faster.

vec_abs() actually also calls vec_splat(0), but there's already a 
null vector available, so calling just vec_sub()/vec_max() instead of 
vec_abs() saves some splats.
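
To make the identity concrete, here is a rough scalar sketch in plain C of 
what the vec_sub()/vec_max() pair computes per lane. It is not AltiVec code 
and not what is in the patch: s16x8, lane_sub(), lane_max(), 
abs_with_zero() and abs_sketch_selfcheck() are hypothetical names made up 
for illustration, standing in for vec_s16_t, vec_sub(), vec_max() and the 
VEC_ABS expansion. It assumes 8 lanes of 16 bits and wrapping 
(non-saturating) subtraction.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* a 128-bit vector holds eight 16-bit lanes */

/* Hypothetical stand-in for vec_s16_t. */
typedef struct { int16_t v[LANES]; } s16x8;

/* Per-lane wrapping subtract, modelled on vec_sub() for signed shorts.
 * The final narrowing cast assumes the usual two's-complement wrap. */
static s16x8 lane_sub(s16x8 a, s16x8 b)
{
    s16x8 r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = (int16_t)(uint16_t)((unsigned)a.v[i] - (unsigned)b.v[i]);
    return r;
}

/* Per-lane signed maximum, modelled on vec_max(). */
static s16x8 lane_max(s16x8 a, s16x8 b)
{
    s16x8 r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = a.v[i] > b.v[i] ? a.v[i] : b.v[i];
    return r;
}

/* |x| as max(x, 0 - x), reusing a zero vector the caller already has
 * instead of building a fresh one for every call. */
static s16x8 abs_with_zero(s16x8 x, s16x8 zero)
{
    return lane_max(x, lane_sub(zero, x));
}

/* Checks the identity against a plain scalar abs and returns the sum
 * of the absolute values. */
int abs_sketch_selfcheck(void)
{
    const s16x8 zero = {{0}};
    const s16x8 x = {{-3, 4, 0, -32767, 32767, -1, 1, -100}};
    s16x8 r = abs_with_zero(x, zero);
    int sum = 0;
    for (int i = 0; i < LANES; i++) {
        int ref = x.v[i] < 0 ? -x.v[i] : x.v[i];
        assert(r.v[i] == ref);
        sum += r.v[i];
    }
    return sum;
}
```

One caveat of this sketch: because the subtraction wraps, a lane holding 
-32768 stays -32768 after max(x, 0 - x).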

Guillaume

-- 
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html


