[x264-devel] [PATCH] AltiVec implementation of hadamard_ac routines

Loren Merritt lorenm at u.washington.edu
Mon Feb 2 18:58:58 CET 2009


On Sun, 1 Feb 2009, Guillaume POIRIER wrote:

> I'll try to find some time to understand the optimizations in x86's
> SIMD routines to use them in my AltiVec implementation of hadamard_ac.

Simple: you don't need an 8x8 transpose, you need a 2x2 transpose of 4x4 
blocks. So do only one vec_merge pass. That will leave the coefs in a 
different order within each vector than a 2x2 transpose would, but satd 
doesn't care, since it only sums their absolute values.
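
A minimal scalar sketch of why the order within each vector doesn't matter. 
This is not the AltiVec code, only an illustration of the property on a 
single 4x4 block. The names hadamard4, satd4x4_lanes and the lane[] 
argument are hypothetical, and the sum is left unnormalized. lane[] stands 
in for whatever order a cheaper shuffle leaves the elements in: as long as 
every vector uses the same lane order, the butterflies across vectors still 
combine elements of the same row, so the same coefficients come out, just in 
different lanes.

```c
#include <stdlib.h>

/* In-place 4-point Hadamard butterflies on a[0], a[s], a[2*s], a[3*s]. */
static void hadamard4(int *a, int s)
{
    int s01 = a[0] + a[s],         d01 = a[0] - a[s];
    int s23 = a[2 * s] + a[3 * s], d23 = a[2 * s] - a[3 * s];
    a[0]     = s01 + s23;
    a[s]     = d01 + d23;
    a[2 * s] = s01 - s23;
    a[3 * s] = d01 - d23;
}

/* Unnormalized sum of absolute 2D Hadamard coefficients of a 4x4 block.
 * lane[] is a hypothetical permutation: after the shuffle, lane k of each
 * vector holds an element of row lane[k] instead of row k. */
int satd4x4_lanes(const int diff[16], const int lane[4])
{
    int t[16], u[16], sum = 0;

    for (int i = 0; i < 16; i++)
        t[i] = diff[i];

    /* Vertical pass: butterflies across the four row vectors. */
    for (int c = 0; c < 4; c++)
        hadamard4(t + c, 4);

    /* Shuffle: vector j receives column j, in the lane order given. */
    for (int j = 0; j < 4; j++)
        for (int k = 0; k < 4; k++)
            u[4 * j + k] = t[4 * lane[k] + j];

    /* Horizontal pass, again as butterflies across vectors. */
    for (int k = 0; k < 4; k++)
        hadamard4(u + k, 4);

    /* Only absolute values are summed, so lane order is irrelevant. */
    for (int i = 0; i < 16; i++)
        sum += abs(u[i]);
    return sum;
}
```

Calling satd4x4_lanes with lane = {0,1,2,3} (a true transpose) and with, 
say, lane = {0,2,1,3} gives the same sum for any input block.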

And whenever Holger is ready to publish his satd optimization, some of 
that will be relevant to AltiVec too.

--Loren Merritt

