[x264-devel] 8x8 and 16x16 Altivec implementation of variance

Loren Merritt lorenm at u.washington.edu
Thu Jan 22 22:48:26 CET 2009


On Thu, 22 Jan 2009, Guillaume POIRIER wrote:

>+ vec_u16_t mule = vec_mule(pix_v, pix_v);
>+ vec_u16_t mulo = vec_mulo(pix_v, pix_v);
>+ vec_u32_t mule_h = vec_u16_to_u32_h(mule);
>+ vec_u32_t mule_l = vec_u16_to_u32_l(mule);
>+ vec_u32_t mulo_h = vec_u16_to_u32_h(mulo);
>+ vec_u32_t mulo_l = vec_u16_to_u32_l(mulo);
>+ vec_u32_t mule_sqr = vec_add(mule_h, mule_l);
>+ vec_u32_t mulo_sqr = vec_add(mulo_h, mulo_l);
>+ vec_u32_t mul_sqr = vec_add(mule_sqr, mulo_sqr);
>+ sqr_v = vec_add(sqr_v, mul_sqr);

replace all that with:
sqr_v = vec_msum(pix_v, pix_v, sqr_v);
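
For anyone without the AltiVec manual at hand, here is a rough scalar sketch of what vec_msum does on unsigned bytes (the vmsumubm form): each of the four 32-bit lanes of the accumulator receives the sum of the four adjacent byte products. The names msum_u8_model and sum_sqr_16 are hypothetical, made up for this illustration, and are not part of x264 or altivec.h. The per-lane layout differs from the mule/mulo sequence in the quoted patch, but the total across the four lanes, which is all the variance code needs, should come out the same.

```c
#include <stdint.h>

/* Scalar model (assumption: illustrative only, not real AltiVec code) of
 * vec_msum(a, b, acc) for 16 unsigned bytes: lane i of acc accumulates
 * the four products a[4i+j] * b[4i+j], modulo 2^32. */
static void msum_u8_model(const uint8_t a[16], const uint8_t b[16],
                          uint32_t acc[4])
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            acc[i] += (uint32_t)a[4 * i + j] * b[4 * i + j];
}

/* Sum of squares of one 16-pixel row: multiply the row by itself,
 * then add the four lanes together. */
static uint32_t sum_sqr_16(const uint8_t pix[16])
{
    uint32_t acc[4] = { 0, 0, 0, 0 };
    msum_u8_model(pix, pix, acc);
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

In the real loop the accumulator stays in a vector register across rows and the lanes are only summed once at the end. A full 16x16 block of 255-valued pixels adds up to 256 * 65025, which fits comfortably in 32 bits.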

--Loren Merritt

