[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8
Loren Merritt
lorenm at u.washington.edu
Fri Aug 25 09:33:36 CEST 2006
On Thu, 24 Aug 2006, Guillaume POIRIER wrote:
> I've also found out that even though my 4x4 DC also passes regression
> tests, it doesn't seem to be called very often.
> It's likely to be because I had to restrict its use to the cases where
> maxQdc < (1<<15).
> That can hopefully be improved if I manage to understand why I can't use
> my 4x4dc with maxQdc < (1<<16) or more.
Because you use signed multiplication? That's why the mmx1 version is
restricted to 15 bits.
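
To illustrate the point about signed multiplication, here is a minimal scalar sketch in C. It is not taken from the patch or from the mmx1 code; the helper names (as_signed16, mul_signed16, mul_unsigned16) are hypothetical and only model what a signed 16-bit multiply lane would see. Once the multiplier reaches 1<<15, its top bit is read as a sign bit and the product goes negative, while an unsigned multiply still gives the intended value.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 16-bit pattern the way a signed 16-bit lane would read it. */
static int32_t as_signed16(uint16_t bits)
{
    return bits >= 0x8000 ? (int32_t)bits - 0x10000 : (int32_t)bits;
}

/* Product as computed by a signed 16x16 -> 32 multiply (hypothetical model). */
static int32_t mul_signed16(uint16_t coef, uint16_t mf)
{
    return as_signed16(coef) * as_signed16(mf);
}

/* Product as computed by an unsigned 16x16 -> 32 multiply. */
static uint32_t mul_unsigned16(uint16_t coef, uint16_t mf)
{
    return (uint32_t)coef * (uint32_t)mf;
}

/* Both multiplies agree below 1<<15 and disagree from 1<<15 upward. */
static void self_check(void)
{
    const uint16_t coef = 3;

    /* Largest multiplier that still fits in 15 bits: results match. */
    assert(mul_signed16(coef, (1 << 15) - 1) == 98301);
    assert(mul_unsigned16(coef, (1 << 15) - 1) == 98301u);

    /* First multiplier that needs 16 bits: the signed product flips sign. */
    assert(mul_signed16(coef, 1 << 15) == -98304);
    assert(mul_unsigned16(coef, 1 << 15) == 98304u);
}
```

Calling self_check() from a test program exercises both cases. Whether the Altivec routine hits exactly this depends on which multiply it uses, which this sketch does not assume.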
> The attached patch now also features sub8x8_dct8 in Altivec in addition to
> previous optimized routines. This new routine still needs to be cleaned up
> a bit, and the code needs to be factored into a macro, but it works.
>
> I've introduced another transpose8x8 routine, shamelessly taken from
> FFmpeg's: the one in ppccommon.h didn't do what I wanted, but maybe it's just
> because I didn't know how to use it.
I can't see any difference between TRANSPOSE8 and VEC_TRANSPOSE_8 other
than that one replaces the operands in place while the other takes a
source and a destination.
But if you still want a second version, please remove all the unsightly
underscores. I thought that was done in ffmpeg, but I guess only the vc1
functions were cleaned.
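
For readers unfamiliar with the distinction drawn above between replacing the operands in place and taking a source and a destination, here is a scalar C model of the two calling styles. It is only a sketch: the function names are hypothetical, the matrices are flat 64-element arrays, and nothing here reproduces the actual TRANSPOSE8 or VEC_TRANSPOSE_8 macros, which operate on vector registers.

```c
#include <assert.h>
#include <stdint.h>

/* Source/destination style: reads src, writes dst. The two must not alias. */
static void transpose8x8_copy(const int16_t *src, int16_t *dst)
{
    assert(src != dst);
    for (int r = 0; r < 8; r++)
        for (int c = 0; c < 8; c++)
            dst[c * 8 + r] = src[r * 8 + c];
}

/* In-place style: swaps each element with its mirror across the diagonal. */
static void transpose8x8_inplace(int16_t *m)
{
    for (int r = 0; r < 8; r++)
        for (int c = r + 1; c < 8; c++) {
            int16_t t    = m[r * 8 + c];
            m[r * 8 + c] = m[c * 8 + r];
            m[c * 8 + r] = t;
        }
}
```

Both produce the same matrix; the only difference is whether the caller's input survives, which is why either style can usually stand in for the other with an extra copy or a renamed set of operands.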
--Loren Merritt