[x264-devel] Re: [patch] quant_4x4 and quant_8x8 mmx versions

Wed Jul 27 16:55:43 CEST 2005

On Tue, 26 Jul 2005, Alexander Izvorski wrote:

> Here is a patch with mmxext versions of quant_4x4 and quant_8x8.  The
> subroutines themselves are ~4 times faster, resulting in a noticeable
> overall speedup, up to 15% depending on compression options used.
> Also attached is an excerpt from a profiling run (before and after).

Thanks.

> ;;; ebx is quant_mf[i_mf]
[...]
>     movq mm1, [ebx]
>     movq mm2, [ebx+8]
>     packssdw mm1, mm2

With a custom quant matrix whose entries are sufficiently low (<=4 in some 
cases), quant_mf won't fit in 16 bits. And since the values are unsigned 
while packssdw saturates as signed, this will break even at entries <=8. 
(The default entries are 16, but even the JVT preset matrices go down to 6.)
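
To illustrate the saturation problem, here is a scalar model of what a single 
packssdw lane does to a 32-bit value. The helper name pack_ss_lane is made up 
for this sketch and is not part of x264 or the patch.

```c
#include <stdint.h>

/* Scalar model of one packssdw lane: a signed 32-bit input is narrowed to
 * signed 16 bits, clamping to [INT16_MIN, INT16_MAX] instead of wrapping.
 * pack_ss_lane is a hypothetical name used only for this illustration. */
static int16_t pack_ss_lane(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}
```

Any multiplier above 32767 comes out as 32767, so the packed table silently 
differs from the 32-bit one even when the value would have fit in an unsigned 
16-bit word.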

Suggested solution:
Put quant_*_mmxext in function pointers. If the current cqm is compatible, 
pre-pack h->quant*_mf to 16 bits and use the mmx functions. Otherwise, 
don't pack and use C functions.
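
A rough sketch of that dispatch, under several assumptions: quant_ctx, 
quant_init, quant_c and quant_packed16 are hypothetical names, the table is 
fixed at 16 entries, and quant_packed16 is a plain C stand-in for where the 
mmxext routine would go. It keeps a separate 16-bit copy rather than packing 
in place, which is a simplification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature: mf is void* because its element width depends on
 * which implementation was selected at init time. */
typedef void (*quant_fn)(int16_t *dct, const void *mf, size_t n,
                         int bias, int shift);

/* Quantize one coefficient: scale the magnitude, then restore the sign. */
static int16_t quant_one(int16_t c, int32_t m, int bias, int shift)
{
    int64_t a = c < 0 ? -(int64_t)c : (int64_t)c;
    int32_t q = (int32_t)((a * m + bias) >> shift);
    return (int16_t)(c < 0 ? -q : q);
}

/* Reference path: reads 32-bit multipliers. */
static void quant_c(int16_t *dct, const void *mf, size_t n, int bias, int shift)
{
    const int32_t *m = mf;
    for (size_t i = 0; i < n; i++)
        dct[i] = quant_one(dct[i], m[i], bias, shift);
}

/* Stand-in for the SIMD path: reads pre-packed 16-bit multipliers. */
static void quant_packed16(int16_t *dct, const void *mf, size_t n,
                           int bias, int shift)
{
    const int16_t *m = mf;
    for (size_t i = 0; i < n; i++)
        dct[i] = quant_one(dct[i], m[i], bias, shift);
}

typedef struct {
    quant_fn    quant;    /* selected implementation */
    const void *mf;       /* points at mf16 or mf32 below */
    int32_t     mf32[16];
    int16_t     mf16[16];
} quant_ctx;

/* A table is compatible only if every entry survives signed 16-bit packing. */
static int fits_int16(const int32_t *mf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (mf[i] < 0 || mf[i] > INT16_MAX)
            return 0;
    return 1;
}

/* Choose the implementation once, when the quant matrix is set up. */
static void quant_init(quant_ctx *ctx, const int32_t mf[16])
{
    memcpy(ctx->mf32, mf, sizeof ctx->mf32);
    if (fits_int16(mf, 16)) {
        for (size_t i = 0; i < 16; i++)
            ctx->mf16[i] = (int16_t)mf[i];
        ctx->quant = quant_packed16;
        ctx->mf = ctx->mf16;
    } else {
        ctx->quant = quant_c;
        ctx->mf = ctx->mf32;
    }
}
```

Callers then always go through ctx->quant(dct, ctx->mf, ...), so the 
compatibility check is paid once per matrix rather than per block.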

--Loren Merritt
