[x264-devel] Re: [PATCH] IDCT8 and SA8D routines in Altivec
Guillaume POIRIER
guillaume-bzh.poirier at laposte.net
Mon Nov 27 20:59:25 CET 2006
Hi,
Manuel Rommel wrote:
> Hi,
>
> Guillaume, in your second post this was added to your patch:
>
>> - CFLAGS="$CFLAGS -faltivec -fastf -mcpu=G4"
>> + CFLAGS="$CFLAGS -faltivec -fastf -mcpu=G5"
>
>
> It might be dangerous to specify -mcpu=G5 on every PPC since then gcc
> produces instructions which are only valid on a G5, so x264 is likely
> to crash on a G4 or G3.
Good catch. It is a local modification that wasn't meant to be merged.
> And I have a question about the use of VEC_ABS in your SA8D code, for
> example:
>
>> + vec_s16_t abs6v = VEC_ABS(sa8d6v);
>
> As far as I understand, this gets transformed to
>
> vec_s16_t abs6v = sa8d6v = vec_max( sa8d6v, vec_sub( zero_s16v,
> sa8d6v ) );
>
> This shouldn't alter the output but it could be possible to get rid of
> some instructions, if I understand it right.
It's identical, just faster.
vec_abs() actually also calls vec_splat(0), but we already have a
null vector, so calling just vec_sub()/vec_max() instead of vec_abs()
saves some splats.
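
To illustrate the max(a, 0 - a) trick, here is a rough scalar model in
portable C. It is not AltiVec code and not the code from the patch: the
type vec_s16_model and the functions model_sub(), model_max() and
model_abs() are hypothetical stand-ins for vec_s16_t, vec_sub(),
vec_max() and the VEC_ABS idea, assuming eight signed 16-bit lanes and
wrap-around subtraction. The caller passes in the zero vector it already
has, which is the point of avoiding a fresh splat.

```c
#include <stdint.h>

#define LANES 8  /* assumed: eight signed 16-bit lanes per 128-bit vector */

/* Hypothetical scalar stand-in for vec_s16_t. */
typedef struct { int16_t v[LANES]; } vec_s16_model;

/* Lane-wise wrap-around subtraction, modelling vec_sub() on signed shorts.
 * The final cast to int16_t relies on the usual two's-complement
 * conversion (as done by gcc and clang). */
static vec_s16_model model_sub(vec_s16_model a, vec_s16_model b)
{
    vec_s16_model r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = (int16_t)(uint16_t)((uint16_t)a.v[i] - (uint16_t)b.v[i]);
    return r;
}

/* Lane-wise signed maximum, modelling vec_max(). */
static vec_s16_model model_max(vec_s16_model a, vec_s16_model b)
{
    vec_s16_model r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = a.v[i] > b.v[i] ? a.v[i] : b.v[i];
    return r;
}

/* Absolute value as max(a, 0 - a), reusing a zero vector that the
 * caller already holds instead of creating a new one. */
static vec_s16_model model_abs(vec_s16_model a, vec_s16_model zero)
{
    return model_max(a, model_sub(zero, a));
}
```

Note that in this model a lane holding -32768 stays -32768, because the
subtraction wraps around instead of saturating.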
Guillaume
--
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html