[x264-devel] Patch - Altivec Quant 4x4x4

Derek Buitenhuis derek.buitenhuis at gmail.com
Wed Oct 2 13:56:56 CEST 2013


On 9/29/2013 5:23 PM, Philipp Sibler wrote:
> Hi x264,
> 
> this patch introduces an Altivec version of the 4x4x4 quantization step. 
> On the current master branch the 4x4x4 quantization on PowerPC Altivec 
> machines defaults to the plain scalar C routine.

The PPC brotherhood still exists, I see!

> Patch was tested on a PowerMac G4 and generates an encoding speedup of 
> about 14 percent there.

Overall or just quantization?

> From 951350060c745c1c33bf814f87621e2763143a50 Mon Sep 17 00:00:00 2001
> From: Philipp Sibler <philipp.sibler at gmail.com>
> Date: Sun, 29 Sep 2013 17:48:36 +0200
> Subject: [PATCH] Introduced Altivec version of quant 4x4x4

s/Introduced/Introduce/

> 
> ---
>  common/ppc/quant.c |   17 +++++++++++++++++
>  common/ppc/quant.h |    1 +
>  common/quant.c     |    1 +
>  3 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/common/ppc/quant.c b/common/ppc/quant.c
> index f11938a..0fff340 100644
> --- a/common/ppc/quant.c
> +++ b/common/ppc/quant.c
> @@ -90,6 +90,23 @@ int x264_quant_4x4_altivec( int16_t dct[16], uint16_t mf[16], uint16_t bias[16]
>      return vec_any_ne(nz, zero_s16v);
>  }
>  
> +int x264_quant_4x4x4_altivec( int16_t dct[4][16], uint16_t mf[16], uint16_t bias[16] )
> +{
> +    int nza = 0;
> +    int nz = 0;
> +
> +	nz = x264_quant_4x4_altivec(dct[0], mf, bias);
> +	nza |= (!!nz);
> +	nz = x264_quant_4x4_altivec(dct[1], mf, bias);
> +	nza |= (!!nz)<<1;
> +	nz = x264_quant_4x4_altivec(dct[2], mf, bias);
> +	nza |= (!!nz)<<2;
> +	nz = x264_quant_4x4_altivec(dct[3], mf, bias);
> +	nza |= (!!nz)<<3;

x264 doesn't allow tabs.

I wonder whether this is optimal, but it's still a gain, and
probably nobody else will write Altivec code... so it looks
pretty good to me.
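
For reference, here is a rough scalar sketch of what the 4x4x4 wrapper
computes: quantize four 4x4 blocks and pack one "has nonzero coefficients"
bit per block into the return value. The names quant_4x4_scalar and
quant_4x4x4_sketch are made up for illustration, and the rounding in the
inner function is an assumption standing in for x264_quant_4x4_altivec, not
code from the patch. It is indented with spaces, as the review asks.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for x264_quant_4x4_altivec.
 * Assumed rounding: q = ((bias + |coef|) * mf) >> 16, sign restored after. */
int quant_4x4_scalar( int16_t dct[16], const uint16_t mf[16], const uint16_t bias[16] )
{
    int nz = 0;
    for( int i = 0; i < 16; i++ )
    {
        int coef = dct[i];
        uint32_t mag = coef > 0 ? (uint32_t)coef : (uint32_t)(-coef);
        int q = (int)( ( ( bias[i] + mag ) * mf[i] ) >> 16 );
        dct[i] = (int16_t)( coef > 0 ? q : -q );
        nz |= dct[i];   /* accumulate any nonzero output */
    }
    return !!nz;        /* 1 if the block has a nonzero coefficient, else 0 */
}

/* Quantize four 4x4 blocks; bit j of the result is set when block j
 * still has at least one nonzero coefficient after quantization. */
int quant_4x4x4_sketch( int16_t dct[4][16], const uint16_t mf[16], const uint16_t bias[16] )
{
    int nza = 0;
    for( int j = 0; j < 4; j++ )
        nza |= quant_4x4_scalar( dct[j], mf, bias ) << j;
    return nza;
}
```

The loop is equivalent to the four unrolled calls in the patch; whether the
unrolled or looped form is faster on a G4 would need measuring.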
>          pf->quant_8x8 = x264_quant_8x8_altivec;
> +		pf->quant_4x4x4 = x264_quant_4x4x4_altivec;

Tabs again.

I assume this has been run through x264's regression testing.

- Derek

