[x264-devel] Re: [patch] quant_4x4 and quant_8x8 mmx versions
Loren Merritt
lorenm at u.washington.edu
Wed Jul 27 16:55:43 CEST 2005
On Tue, 26 Jul 2005, Alexander Izvorski wrote:
> Here is a patch with mmxext versions of quant_4x4 and quant_8x8. The
> subroutines themselves are ~4 times faster, resulting in a noticeable
> overall speedup, up to 15% depending on compression options used.
> Also attached is an excerpt from a profiling run (before and after).
Thanks.
> ;;; ebx is quant_mf[i_mf]
[...]
> movq mm1, [ebx]
> movq mm2, [ebx+8]
> packssdw mm1, mm2
If using a custom quant matrix with sufficiently low entries (<=4 in some
cases) then quant_mf won't fit in 16 bits. And since the values are unsigned
but packssdw saturates to signed 16 bits, this will break even at entries
<=8. (The default entries are 16, but even the JVT preset matrices go down
to 6.)
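To make the failure mode concrete, here is a minimal C sketch of what
packssdw does to each lane: it treats the 32-bit input as signed and
saturates it to [-32768, 32767]. The function names are hypothetical and
the inputs are illustrative, not taken from x264's tables.

```c
#include <stdint.h>

/* Emulates one lane of packssdw: signed saturation from 32 to 16 bits. */
static int16_t packssdw_lane(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical helper: returns 1 if every entry of mf survives the
 * pack unchanged, 0 if at least one entry would be saturated. */
static int fits_signed16(const int32_t *mf, int n)
{
    for (int i = 0; i < n; i++)
        if (packssdw_lane(mf[i]) != mf[i])
            return 0;
    return 1;
}
```

A value such as 40000 fits in an unsigned 16-bit word but comes out of
this pack as 32767, so the multiplier silently changes.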
Suggested solution:
Put quant_*_mmxext in function pointers. If the current cqm is compatible,
pre-pack h->quant*_mf to 16 bits and use the mmx functions. Otherwise,
don't pack and use C functions.
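
A rough sketch of that dispatch in C, under stated assumptions: the
struct, the function names and the quantization arithmetic are all
placeholders, not x264's actual code, and quant_packed16 is a plain C
stand-in for where the mmxext routine would go.

```c
#include <stdint.h>

#define N_COEF 16  /* one 4x4 block */

typedef struct quant_ctx quant_ctx;
typedef void (*quant_fn)(const quant_ctx *ctx, int16_t *coef);

struct quant_ctx {
    int32_t  mf32[N_COEF];  /* full-width table, always valid */
    int16_t  mf16[N_COEF];  /* packed table, valid only on the packed path */
    quant_fn quant;         /* chosen once per quant matrix */
};

/* Placeholder arithmetic: scale |coef| by mf / 2^15 and keep the sign.
 * Assumes the result fits in 16 bits. */
static void quant_c(const quant_ctx *ctx, int16_t *coef)
{
    for (int i = 0; i < N_COEF; i++) {
        int64_t a = coef[i] < 0 ? -(int64_t)coef[i] : coef[i];
        int64_t q = (a * ctx->mf32[i]) >> 15;
        coef[i] = (int16_t)(coef[i] < 0 ? -q : q);
    }
}

/* Stand-in for the SIMD routine: same arithmetic, 16-bit multipliers. */
static void quant_packed16(const quant_ctx *ctx, int16_t *coef)
{
    for (int i = 0; i < N_COEF; i++) {
        int32_t a = coef[i] < 0 ? -(int32_t)coef[i] : coef[i];
        int32_t q = (a * ctx->mf16[i]) >> 15;
        coef[i] = (int16_t)(coef[i] < 0 ? -q : q);
    }
}

/* Pick the implementation: pack only if every multiplier fits in a
 * signed 16-bit lane, otherwise fall back to the C version. */
static void quant_init(quant_ctx *ctx, const int32_t *mf)
{
    int fits = 1;
    for (int i = 0; i < N_COEF; i++) {
        ctx->mf32[i] = mf[i];
        if (mf[i] < 0 || mf[i] > INT16_MAX)
            fits = 0;
    }
    for (int i = 0; i < N_COEF; i++)
        ctx->mf16[i] = fits ? (int16_t)mf[i] : 0;
    ctx->quant = fits ? quant_packed16 : quant_c;
}
```

Callers would then go through ctx->quant and never need to know which
table layout is in use.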
--Loren Merritt
--
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html