[x264-devel] [PATCH] Add all remaining 16x16 predict Altivec routines

Guillaume POIRIER gpoirier at mplayerhq.hu
Wed Jan 14 21:17:10 CET 2009


Hello,

2009/1/14 Guillaume Poirier <gpoirier at mplayerhq.hu>:

> Even if 64-bit is indeed faster, it's nowhere near as fast as the new
> intra_predict_16x16_h_altivec in the attached patch.
> Here are the benchmark figures on PPC7450:
> intra_predict_16x16_dc_c: 43
> intra_predict_16x16_dc_altivec: 25
> intra_predict_16x16_dc8_c: 25
> intra_predict_16x16_dc8_altivec: 12
> intra_predict_16x16_dcl_c: 39
> intra_predict_16x16_dcl_altivec: 21
> intra_predict_16x16_dct_c: 39
> intra_predict_16x16_dct_altivec: 21
> intra_predict_16x16_h_c: 45
> intra_predict_16x16_h_altivec: 22
> intra_predict_16x16_p_c: 433
> intra_predict_16x16_p_altivec: 65
> intra_predict_16x16_v_c: 26
> intra_predict_16x16_v_altivec: 21
>
> Please try on other CPUs if you can, but I believe that the speed-up should be consistent across all.
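
For readers unfamiliar with the routine being benchmarked: horizontal 16x16 intra prediction fills each of the 16 rows with the pixel immediately to the left of that row. The following is only a minimal scalar sketch of that idea, not the code from the patch and not x264's actual C routine. The function name and the SKETCH_STRIDE constant are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical row stride for this sketch; the real value lives in
 * x264's own headers and is not taken from the patch. */
#define SKETCH_STRIDE 32

/* Scalar horizontal 16x16 prediction (illustrative name).
 * src points at the top-left pixel of the block; src[-1] must be a
 * valid, already decoded left-neighbour pixel for every row. */
static void predict_16x16_h_sketch(uint8_t *src)
{
    assert(src != NULL);
    for (int y = 0; y < 16; y++) {
        /* Replicate the left neighbour across the 16 pixels of this row. */
        memset(src, src[-1], 16);
        src += SKETCH_STRIDE;
    }
}
```

A vectorised version would typically broadcast the left pixel into a 16-byte vector and store it once per row, which is presumably where the gain over a byte-wise or 64-bit scalar fill comes from.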

Here are the figures for PPC970MP:
intra_predict_16x16_dc_c: 21
intra_predict_16x16_dc_altivec: 16
intra_predict_16x16_dc8_c: 18
intra_predict_16x16_dc8_altivec: 9
intra_predict_16x16_dcl_c: 19
intra_predict_16x16_dcl_altivec: 12
intra_predict_16x16_dct_c: 19
intra_predict_16x16_dct_altivec: 12
intra_predict_16x16_h_c: 19
intra_predict_16x16_h_altivec: 9
intra_predict_16x16_p_c: 159
intra_predict_16x16_p_altivec: 26
intra_predict_16x16_v_c: 19
intra_predict_16x16_v_altivec: 10
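
To make the comparison easier to read, the figures can be turned into speed-up ratios (C figure divided by AltiVec figure, since lower is faster). The small helper below is just a sketch for doing that; the struct and function names are hypothetical, and the table simply repeats the PPC970MP numbers listed above.

```c
#include <assert.h>
#include <stdio.h>

/* One benchmark pair: figure for the C routine and for the AltiVec one. */
struct bench_pair {
    const char *name;
    int c_figure;
    int altivec_figure;
};

/* PPC970MP figures copied from the listing above. */
static const struct bench_pair ppc970mp[] = {
    { "dc",  21, 16 },
    { "dc8", 18,  9 },
    { "dcl", 19, 12 },
    { "dct", 19, 12 },
    { "h",   19,  9 },
    { "p",  159, 26 },
    { "v",   19, 10 },
};

/* Speed-up of AltiVec over C; a value above 1.0 means AltiVec is faster. */
static double speedup(int c_figure, int altivec_figure)
{
    assert(altivec_figure > 0);
    return (double)c_figure / (double)altivec_figure;
}

/* Print one line per routine with its speed-up ratio. */
static void print_speedups(const struct bench_pair *rows, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        printf("intra_predict_16x16_%s: %.2fx\n", rows[i].name,
               speedup(rows[i].c_figure, rows[i].altivec_figure));
    }
}
```

Calling print_speedups(ppc970mp, sizeof(ppc970mp) / sizeof(ppc970mp[0])) lists the ratio for each routine; every ratio is above 1.0, with the planar (p) routine showing by far the largest gain.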


Since it's faster on all CPUs, I applied version 3 of my patchset.

Guillaume
-- 
Only a very small fraction of our DNA does anything; the rest is all
comments and ifdefs.

Diogenes  - "What I like to drink most is wine that belongs to others."

