[x264-devel] [PATCH] faster mc_chroma_altivec

maaanuuu at gmx.net maaanuuu at gmx.net
Mon Feb 2 22:47:30 CET 2009


Hello,

The attached patch improves mc_chroma_altivec:

Now VEC_LOAD is used instead of VEC_LOAD_G, vec_mladd is used more
efficiently, and the loop is unrolled 2x.
mc_chroma_w4_altivec now requires dst to be aligned to a 4-byte boundary;
is that OK?
Finally, I moved the width == 2 case into its own function, because the
code currently used for it is actually slower than plain C.
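
For readers without the attachment, here is a rough plain-C sketch of what
chroma motion compensation computes. It is not the code from the patch: it
is a scalar illustration of the standard H.264 1/8-pel bilinear chroma
interpolation, with the row loop unrolled 2x to mirror the unrolling
described above. The names mc_chroma_ref and is_aligned4 and the parameter
layout are made up for this example, and it assumes an even height and a
source buffer with one extra column and row available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a 4-byte boundary, the
 * alignment the patch asks of dst in the width-4 case. */
static int is_aligned4(const void *p)
{
    return ((uintptr_t)p & 3) == 0;
}

/* Scalar sketch of H.264 chroma MC (1/8-pel bilinear interpolation).
 * mvx/mvy are in 1/8-pel units. Assumes height is even, because two
 * output rows are produced per iteration (2x unrolled). */
static void mc_chroma_ref(uint8_t *dst, ptrdiff_t dst_stride,
                          const uint8_t *src, ptrdiff_t src_stride,
                          int mvx, int mvy, int width, int height)
{
    const int dx = mvx & 7;              /* fractional x offset, 0..7 */
    const int dy = mvy & 7;              /* fractional y offset, 0..7 */
    const int cA = (8 - dx) * (8 - dy);  /* weight: top-left     */
    const int cB = dx * (8 - dy);        /* weight: top-right    */
    const int cC = (8 - dx) * dy;        /* weight: bottom-left  */
    const int cD = dx * dy;              /* weight: bottom-right */

    /* Step to the integer-pel position (arithmetic shift assumed for
     * negative motion vectors). */
    src += (mvy >> 3) * src_stride + (mvx >> 3);

    for (int y = 0; y < height; y += 2) {
        const uint8_t *r0 = src;
        const uint8_t *r1 = r0 + src_stride;
        const uint8_t *r2 = r1 + src_stride;

        for (int x = 0; x < width; x++) {
            /* Row y uses source rows r0 and r1. */
            dst[x] = (uint8_t)((cA * r0[x] + cB * r0[x + 1] +
                                cC * r1[x] + cD * r1[x + 1] + 32) >> 6);
            /* Row y+1 reuses r1 and adds r2. */
            dst[x + dst_stride] =
                (uint8_t)((cA * r1[x] + cB * r1[x + 1] +
                           cC * r2[x] + cD * r2[x + 1] + 32) >> 6);
        }
        dst += 2 * dst_stride;
        src += 2 * src_stride;
    }
}
```

The four weights always sum to 64, which is why the result is rounded with
+32 and shifted right by 6. Sharing row r1 between the two output rows is
the main saving the 2x unrolling offers in this scalar form; how the
AltiVec version actually exploits it is in the attached diff, not here.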

The patch passes checkasm and leads to a 2-3% performance gain overall  
using the default settings. Please note that I have NOT done extensive  
regression tests.
Comments and suggestions are welcome :)


Manuel



-------------- next part --------------
A non-text attachment was scrubbed...
Name: faster_mc_chroma_altivec.diff
Type: application/octet-stream
Size: 10345 bytes
Desc: not available
Url : http://mailman.videolan.org/pipermail/x264-devel/attachments/20090202/bc0b8a30/attachment.obj 
-------------- next part --------------


