[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8, sub16x16_dct8, pixel_sa8d_8x8, pixel_sa8d_16x16, *idct8*
David Wolstencroft
lordrpi at gmail.com
Tue Sep 19 11:27:11 CEST 2006
Sooo, OK, since I'm wired (don't know why I can't sleep...), here's what I think the two cases should look like:
if (dest is 16-byte aligned)
#define ALTIVEC_STORE_SUM_CLIP_ALIGN8_A(dest, idctv) {\
vec_u8_t dstv = vec_ld(0, dest);\
vec_s16_t idct_sh6 = vec_sra(idctv, sixv);\
vec_u16_t dst16h = vec_mergeh(zero_u8v, dstv);\
vec_u16_t dst16l = vec_mergel(zero_u8v, dstv);\
vec_s16_t sum16 = vec_adds(idct_sh6, (vec_s16_t)dst16h);\
vec_u8_t sum8 = vec_packsu(sum16, dst16l);\
vec_st(sum8, 0, dest);\
}
(About the vec_packsu line: I swear pengvado made a mistake here, if that's possible. The operands should be (sum16, dst16l), not (dst16l, sum16).)
else (8 byte aligned but not 16 byte aligned)
#define ALTIVEC_STORE_SUM_CLIP_ALIGN8_A(dest, idctv) {\
vec_u8_t dstv = vec_ld(0, dest);\
vec_s16_t idct_sh6 = vec_sra(idctv, sixv);\
vec_u16_t dst16h = vec_mergeh(zero_u8v, dstv);\
vec_u16_t dst16l = vec_mergel(zero_u8v, dstv);\
vec_s16_t sum16 = vec_adds(idct_sh6, (vec_s16_t)dst16l);\
vec_u8_t sum8 = vec_packsu(dst16h, sum16);\
vec_st(sum8, 0, dest);\
}
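For anyone checking the two cases by hand, here is a rough scalar sketch of what the macros above appear to be doing. It is only a model, not code from the patch: the names store_sum_clip_align8 and clip_u8 are made up for illustration, and it assumes that vec_ld/vec_st ignore the low four address bits (so both always touch the enclosing 16-byte block) and that >> on a negative value is an arithmetic shift, as vec_sra is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Saturate to the 0..255 pixel range, as vec_packsu does per lane. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar model of the 8-pixel store.
 * dest must be 8-byte aligned; idct holds 8 residuals before the >> 6. */
static void store_sum_clip_align8(uint8_t *dest, const int16_t idct[8])
{
    /* The 16-byte block that an aligned vector load/store would access. */
    uint8_t *block = (uint8_t *)((uintptr_t)dest & ~(uintptr_t)15);
    /* 0: dest is the first half (16-byte aligned case),
     * 8: dest is the second half (8- but not 16-byte aligned case). */
    size_t half = (size_t)(dest - block);
    uint8_t v[16];

    memcpy(v, block, 16);                 /* models vec_ld(0, dest) */
    for (int i = 0; i < 8; i++) {
        /* Assumes arithmetic right shift for negative values. */
        int sum = (idct[i] >> 6) + v[half + i];
        v[half + i] = clip_u8(sum);       /* only the target half changes */
    }
    memcpy(block, v, 16);                 /* models vec_st(sum8, 0, dest) */
}
```

The point of the model is that the other eight bytes of the block have to be written back unchanged, which is why the order of the vec_packsu operands differs between the two cases.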
On Sep 18, 2006, at 12:24 PM, Loren Merritt wrote:
> #define ALTIVEC_STORE_SUM_CLIP_ALIGN8_A(dest, idctv) {\
> vec_u8_t dstv = vec_ld(0, dest);\
> vec_s16_t idct_sh6 = vec_sra(idctv, sixv);\
> vec_u16_t dst16h = vec_mergeh(zero_u8v, dstv);\
> vec_u16_t dst16l = vec_mergel(zero_u8v, dstv);\
> vec_s16_t sum16 = vec_adds(idct_sh6, (vec_s16_t)dst16h);\
> vec_u8_t sum8 = vec_packsu(dst16l, sum16);\
> vec_st(sum8, 0, dest);\