[x264-devel] [PATCH 1/3] PPC: Improve SAD by using vec_extract

Michail Alvanos malvanos at gmail.com
Sun Apr 14 12:45:09 CEST 2019


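Improve the SAD functions by returning the result with vec_extract
instead of vec_splat followed by vec_ste. This removes the aligned
stack temporary and the store/reload through it.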

Power9:
sad_8x8_altivec: 104 --> sad_8x8_altivec: 94
sad_8x16_altivec: 161 --> sad_8x16_altivec: 149
sad_16x8_altivec: 105 --> sad_16x8_altivec: 94
sad_16x16_altivec: 176 --> sad_16x16_altivec: 165

Power8:
sad_8x8_altivec: 125 --> sad_8x8_altivec: 101
sad_8x16_altivec: 224 --> sad_8x16_altivec: 206
sad_16x8_altivec: 117 --> sad_16x8_altivec: 100
sad_16x16_altivec: 234 --> sad_16x16_altivec: 218

---
 common/ppc/pixel.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
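
The two epilogues touched by the hunk below can be compared in isolation. The following is a minimal sketch, not the code from pixel.c: the function names sad_tail_store_sketch and sad_tail_extract_sketch and the vec4_model_t type are hypothetical, the lane index is fixed at 3 (the value used for the 16x16 case), and the input is assumed to be four partial sums already accumulated. When __ALTIVEC__ is not defined, a plain-C model of the four lanes stands in for the intrinsics so the sketch still builds elsewhere; that model does not reproduce the saturation of vec_sums.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ALTIVEC__)
#include <altivec.h>

/* Old epilogue: splat the result lane, store one element to an aligned
 * stack slot with vec_ste, then read it back as a scalar. */
int sad_tail_store_sketch(const int32_t partial[4])
{
    assert(partial);
    int sum __attribute__((aligned(16)));
    __vector signed int zero = vec_splat_s32(0);
    __vector signed int v = { partial[0], partial[1], partial[2], partial[3] };
    v = vec_sums(v, zero);   /* total of all four lanes lands in element 3 */
    v = vec_splat(v, 3);     /* copy element 3 into every lane */
    vec_ste(v, 0, &sum);     /* store a single element to memory */
    return sum;
}

/* New epilogue: read the lane directly, no stack temporary. */
int sad_tail_extract_sketch(const int32_t partial[4])
{
    assert(partial);
    __vector signed int zero = vec_splat_s32(0);
    __vector signed int v = { partial[0], partial[1], partial[2], partial[3] };
    v = vec_sums(v, zero);
    return vec_extract(v, 3);
}

#else  /* portable stand-in so the sketch compiles without AltiVec */

typedef struct { int32_t lane[4]; } vec4_model_t;   /* hypothetical model type */

/* Models vec_sums( v, zero ) without saturation: total goes to lane 3. */
static vec4_model_t model_sums(const int32_t p[4])
{
    vec4_model_t v = { { 0, 0, 0, p[0] + p[1] + p[2] + p[3] } };
    return v;
}

int sad_tail_store_sketch(const int32_t partial[4])
{
    assert(partial);
    volatile int32_t sum;                /* stands in for the stack slot */
    vec4_model_t v = model_sums(partial);
    for (int i = 0; i < 4; i++)          /* models vec_splat( v, 3 ) */
        v.lane[i] = v.lane[3];
    sum = v.lane[0];                     /* models vec_ste: one lane to memory */
    return sum;
}

int sad_tail_extract_sketch(const int32_t partial[4])
{
    assert(partial);
    return model_sums(partial).lane[3];  /* models vec_extract( v, 3 ) */
}

#endif
```

Both functions return the same value for the same input; the difference is only that the second never routes the result through memory.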

diff --git a/common/ppc/pixel.c b/common/ppc/pixel.c
index 11424e36..32968b42 100644
--- a/common/ppc/pixel.c
+++ b/common/ppc/pixel.c
@@ -37,8 +37,6 @@
 static int name( uint8_t *pix1, intptr_t i_pix1,       \
                  uint8_t *pix2, intptr_t i_pix2 )      \
 {                                                      \
-    ALIGNED_16( int sum );                             \
-                                                       \
     LOAD_ZERO;                                         \
     vec_u8_t  pix1v, pix2v;                            \
     vec_s32_t sumv = zero_s32v;                        \
@@ -53,9 +51,7 @@ static int name( uint8_t *pix1, intptr_t i_pix1,       \
         pix2 += i_pix2;                                \
     }                                                  \
     sumv = vec_sum##a( sumv, zero_s32v );              \
-    sumv = vec_splat( sumv, b );                       \
-    vec_ste( sumv, 0, &sum );                          \
-    return sum;                                        \
+    return vec_extract( sumv, b );                     \
 }
 
 PIXEL_SAD_ALTIVEC( pixel_sad_16x16_altivec, 16, 16, s,  3 )
-- 
2.17.1


