[x264-devel] 8x8 and 16x16 Altivec implementation of variance

Guillaume POIRIER gpoirier at mplayerhq.hu
Sat Jan 24 14:53:10 CET 2009

On Sat, Jan 24, 2009 at 2:47 AM, Holger Lubitz
<Holger.Lubitz at informatik.uni-oldenburg.de> wrote:
>> The 8x8 doesn't get such a big speed-up because the data is 8-byte
>> aligned, not 16-byte aligned, so it's necessary to permute it before
>> using it.
> I do not know much about altivec at all, but it seems the permute may be more
> expensive than a shift. Have you tried just shifting things into place?

vec_perm and vec_s(r|l)* have the same throughput and latencies.
That's what's cool about it ;-)
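
For readers who don't know AltiVec, here is a rough scalar C model of why
8-byte-aligned data needs a permute at all. This is a sketch, not the code
from the patch. vec_ld, vec_lvsl and vec_perm are real AltiVec intrinsics,
but the vec16 type, the model_* functions and the "memory as a byte array,
address as an index" setup are made up here so the example runs on any
machine. It models the usual idiom
vec_perm(vec_ld(0, p), vec_ld(15, p), vec_lvsl(0, p)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an AltiVec "vector unsigned char" register. */
typedef struct { uint8_t b[16]; } vec16;

/* Models vec_ld: the low 4 address bits are ignored, so the load always
 * returns the 16-byte aligned block that contains addr. */
static vec16 model_vec_ld(const uint8_t *mem, size_t addr)
{
    vec16 v;
    assert(mem != NULL);
    memcpy(v.b, mem + (addr & ~(size_t)15), 16);
    return v;
}

/* Models vec_lvsl: builds the mask {sh, sh+1, ..., sh+15},
 * where sh is the misalignment (addr & 15). */
static vec16 model_vec_lvsl(size_t addr)
{
    vec16 m;
    for (int i = 0; i < 16; i++)
        m.b[i] = (uint8_t)((addr & 15) + i);
    return m;
}

/* Models vec_perm: each mask byte (low 5 bits) picks one byte out of the
 * 32-byte concatenation a:b. */
static vec16 model_vec_perm(vec16 a, vec16 b, vec16 mask)
{
    vec16 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = mask.b[i] & 31u;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Unaligned 16-byte load: two aligned loads plus one permute. */
static vec16 model_load_unaligned(const uint8_t *mem, size_t addr)
{
    vec16 lo = model_vec_ld(mem, addr);       /* block containing addr      */
    vec16 hi = model_vec_ld(mem, addr + 15);  /* block containing addr + 15 */
    return model_vec_perm(lo, hi, model_vec_lvsl(addr));
}
```

With addr = 8 the mask is {8, ..., 23}, so the result is the upper half of
the first block followed by the lower half of the next one. A fixed shift
could move the bytes into place too, but the two aligned loads are still
needed, and the permute takes the misalignment from a mask at run time.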

Only a very small fraction of our DNA does anything; the rest is all
comments and ifdefs.

Marilyn Monroe  - "It's not true that I had nothing on. I had the radio on."
