[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8

Fri Aug 25 09:33:36 CEST 2006

On Thu, 24 Aug 2006, Guillaume POIRIER wrote:
> I've also found that even though my 4x4 DC also passes regression
> tests, it doesn't seem to be called very often.
> It's likely to be because I had to restrict its use to the cases where 
> maxQdc < (1<<15).
> That can hopefully be improved if I manage to understand why I can't use 
> my 4x4dc with maxQdc < (1<<16) or more.

Because you use signed multiplication? That's why the mmx1 version is 
restricted to 15 bits.
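
To illustrate the signed-multiplication point, here is a minimal scalar
sketch in C. It is not x264 code and does not model any particular MMX or
Altivec instruction; the helper names as_signed16 and mul_signed16 are
hypothetical. It only shows what happens when a multiplier of 1<<15 or
more is read as a signed 16-bit value: the top bit becomes the sign bit
and the product flips sign.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret the low 16 bits of v as a signed
 * 16-bit lane, the way a signed 16-bit multiply would see it. */
static int32_t as_signed16(int32_t v)
{
    uint16_t u = (uint16_t)v;
    return (u & 0x8000) ? (int32_t)u - 65536 : (int32_t)u;
}

/* Hypothetical helper: multiply a coefficient by a quantizer multiplier
 * that has been stored in a signed 16-bit lane. */
static int32_t mul_signed16(int32_t coef, int32_t mf)
{
    assert(coef >= -32768 && coef <= 32767); /* coefficient fits in 16 bits */
    assert(mf >= 0 && mf < (1 << 16));       /* multiplier fits in 16 unsigned bits */
    return coef * as_signed16(mf);
}
```

With these helpers, mul_signed16(3, (1<<15) - 1) gives 98301 as expected,
while mul_signed16(3, 1<<15) gives -98304 instead of 98304. Multipliers
below 1<<15 are unaffected, which is consistent with a 15-bit limit.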

> The attached patch now also features sub8x8_dct8 in Altivec in addition to
> previous optimized routines. This new routine still needs to be cleaned up
> a bit, and the code needs to be factored into a macro, but it works.
>
> I've introduced another transpose8x8 routine, shamelessly taken from
> FFmpeg's: the one in ppccommon.h didn't do what I wanted, but maybe it's just
> because I didn't know how to use it.

I can't see any difference between TRANSPOSE8 and VEC_TRANSPOSE_8 other 
than that one replaces the operands in place while the other takes a 
source and a destination.
But if you still want a second version, please remove all the unsightly 
underscores. I thought that was done in ffmpeg, but I guess only the vc1 
functions were cleaned.
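
For reference, the in-place versus source/destination distinction can be
sketched in scalar C as below. This is only an illustration under my own
assumptions: the function names are hypothetical, the code is not the
Altivec macro from either ppccommon.h or FFmpeg, and it makes no claim
about which of the two macros follows which calling style.

```c
#include <stdint.h>

/* Hypothetical out-of-place form: reads src, writes dst, and leaves src
 * untouched. src and dst must not be the same array. */
static void transpose8x8_copy(int16_t dst[8][8], int16_t src[8][8])
{
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            dst[j][i] = src[i][j];
}

/* Hypothetical in-place form: swaps each element above the diagonal with
 * its mirror below it, overwriting the operand. */
static void transpose8x8_inplace(int16_t m[8][8])
{
    for (int i = 0; i < 8; i++) {
        for (int j = i + 1; j < 8; j++) {
            int16_t tmp = m[i][j];
            m[i][j] = m[j][i];
            m[j][i] = tmp;
        }
    }
}
```

Both forms produce the same matrix; the only difference is whether the
caller's input survives the call.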

--Loren Merritt
