[x264-devel] [PATCH 1/3] PPC: Improve SAD by using vec_extract
Michail Alvanos
malvanos at gmail.com
Sun Apr 14 12:45:09 CEST 2019
Improve the SAD functions by returning the sum with vec_extract
instead of vec_splat followed by vec_ste.
Power9:
sad_8x8_altivec: 104 --> sad_8x8_altivec: 94
sad_8x16_altivec: 161 --> sad_8x16_altivec: 149
sad_16x8_altivec: 105 --> sad_16x8_altivec: 94
sad_16x16_altivec: 176 --> sad_16x16_altivec: 165
Power8:
sad_8x8_altivec: 125 --> sad_8x8_altivec: 101
sad_8x16_altivec: 224 --> sad_8x16_altivec: 206
sad_16x8_altivec: 117 --> sad_16x8_altivec: 100
sad_16x16_altivec: 234 --> sad_16x16_altivec: 218
---
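For reference, the change to the reduction tail can be sketched outside the
macro as below. This is an illustrative sketch, not code from pixel.c: the
helper names lane_via_splat_store, lane_via_extract and sad_tail_self_check
and the SUM_LANE constant are made up for the example, and the non-AltiVec
branch is only a scalar stand-in so the file also builds on other hosts.
vec_ste stores the single element selected by the target address, which is
why the old code had to splat the lane across the vector first; vec_extract
reads the lane directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Lane holding the total after vec_sums (hypothetical name; the macro
 * in the patch passes this as its last argument, e.g. 3 for 16x16). */
#define SUM_LANE 3

#if defined(__ALTIVEC__)
#include <altivec.h>

/* Old tail: broadcast the lane, then store one element to an aligned int. */
static int lane_via_splat_store(const int32_t lanes[4])
{
    __vector signed int v;
    int sum __attribute__((aligned(16)));

    memcpy(&v, lanes, sizeof v);
    v = vec_splat(v, SUM_LANE);   /* every element now holds the total */
    vec_ste(v, 0, &sum);          /* stores whichever element maps to &sum */
    return sum;
}

/* New tail: read the lane straight out of the vector register. */
static int lane_via_extract(const int32_t lanes[4])
{
    __vector signed int v;

    memcpy(&v, lanes, sizeof v);
    return vec_extract(v, SUM_LANE);
}

#else

/* Scalar stand-in (assumption: used only when AltiVec is unavailable). */
static int lane_via_splat_store(const int32_t lanes[4])
{
    int32_t splat[4];

    for (int i = 0; i < 4; i++)
        splat[i] = lanes[SUM_LANE];
    return splat[0];              /* any element is correct after the splat */
}

static int lane_via_extract(const int32_t lanes[4])
{
    return lanes[SUM_LANE];
}

#endif

/* Both tails must return the same value for the same vector contents. */
void sad_tail_self_check(void)
{
    const int32_t lanes[4] = { 7, -3, 0, 4096 };

    assert(lane_via_splat_store(lanes) == lane_via_extract(lanes));
}
```

Calling sad_tail_self_check() on a target is a quick way to confirm that the
two tails agree before comparing timings.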
common/ppc/pixel.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/common/ppc/pixel.c b/common/ppc/pixel.c
index 11424e36..32968b42 100644
--- a/common/ppc/pixel.c
+++ b/common/ppc/pixel.c
@@ -37,8 +37,6 @@
static int name( uint8_t *pix1, intptr_t i_pix1, \
uint8_t *pix2, intptr_t i_pix2 ) \
{ \
- ALIGNED_16( int sum ); \
- \
LOAD_ZERO; \
vec_u8_t pix1v, pix2v; \
vec_s32_t sumv = zero_s32v; \
@@ -53,9 +51,7 @@ static int name( uint8_t *pix1, intptr_t i_pix1, \
pix2 += i_pix2; \
} \
sumv = vec_sum##a( sumv, zero_s32v ); \
- sumv = vec_splat( sumv, b ); \
- vec_ste( sumv, 0, &sum ); \
- return sum; \
+ return vec_extract( sumv, b ); \
}
PIXEL_SAD_ALTIVEC( pixel_sad_16x16_altivec, 16, 16, s, 3 )
--
2.17.1