[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8, sub16x16_dct8, pixel_sa8d_8x8, pixel_sa8d_16x16, *idct8*
Loren Merritt
lorenm at u.washington.edu
Mon Sep 18 21:24:12 CEST 2006
On Mon, 18 Sep 2006, Guillaume POIRIER wrote:
> The attached patch adds *idct8* routines to my whole patchset.
>
> Please test and review.
>
> IDCT8 can be made faster with less loads and store, but right now, I don't
> know exactly how to do it. Suggestions welcome.
x264_sub8x8_dct8_altivec could use VEC_DIFF_H_8BYTE_ALIGNED.
pixel_sa8d_8x8_core_altivec could use a VEC_DIFF variant that assumes one of
the pointers is 8-byte aligned.
The dest pointer in ALTIVEC_STORE_SUM_CLIP is 8-byte aligned, so the macro
could have two versions, one per 8-byte half of the 16-byte vector, like
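#define ALTIVEC_STORE_SUM_CLIP_ALIGN8_A(dest, idctv) {\
/* dest is 16-byte aligned: the 8 pixels to update are the high half */\
vec_u8_t dstv = vec_ld(0, dest);\
vec_s16_t idct_sh6 = vec_sra(idctv, sixv);\
vec_u16_t dst16h = (vec_u16_t)vec_mergeh(zero_u8v, dstv);\
vec_u16_t dst16l = (vec_u16_t)vec_mergel(zero_u8v, dstv);\
vec_s16_t sum16 = vec_adds(idct_sh6, (vec_s16_t)dst16h);\
/* summed pixels go back into the high half, the low half is unchanged */\
vec_u8_t sum8 = vec_packsu(sum16, (vec_s16_t)dst16l);\
vec_st(sum8, 0, dest);\
}
... and for the other parity, add idct_sh6 to dst16l instead and pack as
vec_packsu((vec_s16_t)dst16h, sum16).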
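
For anyone without a PPC box handy, the two-parity store can be modeled in plain scalar C. This is only a rough sketch of the idea, not code from the patch: store_sum_clip_align8, clip_u8 and adds_s16 are made-up names, the parity is picked at run time from the address rather than by two separate macros, and it assumes >> on a negative value is an arithmetic shift (true on the usual compilers, like vec_sra).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clamp to the 0..255 pixel range, as vec_packsu does per lane. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Saturating signed 16-bit add, as vec_adds does per lane. */
static int16_t adds_s16(int a, int b)
{
    int s = a + b;
    return s > 32767 ? 32767 : s < -32768 ? -32768 : (int16_t)s;
}

/* Hypothetical scalar model: add (idct[i] >> 6) to 8 pixels at an
 * 8-byte-aligned dest, using a single 16-byte aligned load and store. */
static void store_sum_clip_align8(uint8_t *dest, const int16_t idct[8])
{
    uintptr_t addr = (uintptr_t)dest;
    assert((addr & 7) == 0);              /* precondition: 8-byte aligned */

    /* vec_ld/vec_st ignore the low 4 address bits, so both halves of the
     * enclosing 16-byte block are read and written. */
    uint8_t *block = dest - (addr & 15);
    int half = (addr & 15) ? 8 : 0;       /* which half holds our 8 pixels */

    uint8_t out[16];
    memcpy(out, block, 16);               /* the one aligned load */
    for (int i = 0; i < 8; i++)
        out[half + i] = clip_u8(adds_s16(idct[i] >> 6, block[half + i]));
    memcpy(block, out, 16);               /* the one aligned store */
}
```

The other half of the block is written back unchanged, which is what packing the untouched dst16 half next to the sum achieves in the vector version.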
--Loren Merritt