[x264-devel] 8x8 and 16x16 Altivec implementation of variance
Guillaume POIRIER
gpoirier at mplayerhq.hu
Sat Jan 24 14:53:10 CET 2009
On Sat, Jan 24, 2009 at 2:47 AM, Holger Lubitz
<Holger.Lubitz at informatik.uni-oldenburg.de> wrote:
>> The 8x8 doesn't get such a big speed-up because the data is 8-byte
>> aligned, not 16-byte aligned, so it's necessary to permute it before
>> using it.
>
> I do not know much about altivec at all, but it seems the permute may be more
> expensive than a shift. Have you tried just shifting things into place?
vec_perm and vec_s(r|l)* have the same throughput and latency, so
shifting would not be any cheaper than permuting.
That's what's cool about it ;-)
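
To illustrate the permute-versus-shift point, here is a minimal scalar C
sketch, not the actual x264 code. It models the lane semantics of vec_ld,
vec_lvsl, vec_perm and vec_sld with plain byte arrays, so it builds without
AltiVec. The type vec_u8 and the model_* and load_row8_* helpers are
hypothetical names, and the single load assumes the 8 pixels never straddle
a 16-byte boundary, which holds for 8-byte-aligned rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar stand-in for "vector unsigned char" (hypothetical type). */
typedef struct { uint8_t b[16]; } vec_u8;

/* Model of vec_ld: the low 4 bits of the address are ignored, so the
 * load always covers one 16-byte-aligned block. 'base' is assumed to
 * be 16-byte aligned; 'offset' plays the role of the address. */
static vec_u8 model_vec_ld(const uint8_t *base, size_t offset)
{
    vec_u8 v;
    memcpy(v.b, base + (offset & ~(size_t)15), 16);
    return v;
}

/* Model of vec_lvsl: control vector {sh, sh+1, ..., sh+15},
 * where sh is the low 4 bits of the address. */
static vec_u8 model_vec_lvsl(size_t offset)
{
    vec_u8 c;
    for (int i = 0; i < 16; i++)
        c.b[i] = (uint8_t)((offset & 15) + (size_t)i);
    return c;
}

/* Model of vec_perm: the low 5 bits of each control byte select one
 * byte out of the 32-byte concatenation a:b. */
static vec_u8 model_vec_perm(vec_u8 a, vec_u8 b, vec_u8 c)
{
    vec_u8 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = c.b[i] & 31u;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Model of vec_sld: bytes n..n+15 of the concatenation a:b. On real
 * AltiVec, n must be a compile-time literal in 0..15. */
static vec_u8 model_vec_sld(vec_u8 a, vec_u8 b, unsigned n)
{
    vec_u8 r;
    for (unsigned i = 0; i < 16; i++) {
        unsigned idx = i + n;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Permute variant: move the 8 wanted bytes into lanes 0..7. */
static vec_u8 load_row8_perm(const uint8_t *base, size_t offset)
{
    assert(offset % 8 == 0);   /* rows are 8-byte aligned */
    vec_u8 v = model_vec_ld(base, offset);
    return model_vec_perm(v, v, model_vec_lvsl(offset));
}

/* Shift variant: same result, but needs a branch on the alignment
 * because the shift count is a literal. */
static vec_u8 load_row8_sld(const uint8_t *base, size_t offset)
{
    assert(offset % 8 == 0);   /* rows are 8-byte aligned */
    vec_u8 v = model_vec_ld(base, offset);
    return (offset & 8) ? model_vec_sld(v, v, 8) : v;
}
```

Both variants leave the row's 8 pixels in lanes 0..7; lanes 8..15 hold
leftover bytes from the same 16-byte block and would have to be ignored
or masked by whatever consumes the vector.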
Guillaume
--
Only a very small fraction of our DNA does anything; the rest is all
comments and ifdefs.
Marilyn Monroe - "It's not true that I had nothing on. I had the radio on."