[x264-devel] [PATCH] faster mc_chroma_altivec
maaanuuu at gmx.net
Mon Feb 2 22:47:30 CET 2009
Hello,
the attached patch improves mc_chroma_altivec:
VEC_LOAD is now used instead of VEC_LOAD_G, vec_mladd is used more
efficiently, and the loop is unrolled 2x.
mc_chroma_w4_altivec now needs dst to be aligned to a 4 byte boundary,
is that OK?
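
To make the alignment requirement concrete, here is a minimal sketch of how a
caller could check it. The helper name is_aligned4 is hypothetical and is not
part of x264 or of the attached patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in x264): returns 1 if p sits on a 4-byte
 * boundary, i.e. the two lowest address bits are zero. */
static int is_aligned4(const void *p)
{
    return ((uintptr_t)p & 3) == 0;
}

/* Hypothetical debug wrapper: a caller could use this to assert the
 * precondition before calling a routine that requires an aligned dst. */
static void check_dst_alignment(const uint8_t *dst)
{
    assert(is_aligned4(dst));
}
```

A check like this would only be useful in debug builds; it says nothing about
whether every existing caller already satisfies the requirement.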
Finally, I put width == 2 into its own function because at the moment
the code that is used for it is actually slower than plain C.
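
For readers unfamiliar with the operation, the following is a rough plain-C
sketch of 1/8-pel bilinear chroma interpolation for a 2-pixel-wide block, with
the row loop unrolled 2x. It is an illustration only, not the code from the
attached patch: the function name mc_chroma_w2_sketch and its parameter list
are assumptions, and it assumes an even height and non-negative motion vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not the patch's code: bilinear chroma MC for
 * width == 2. mvx/mvy are in 1/8-pel units; the low 3 bits select the
 * interpolation weights, the remaining bits select the source pixel. */
static void mc_chroma_w2_sketch(uint8_t *dst, int dst_stride,
                                const uint8_t *src, int src_stride,
                                int mvx, int mvy, int height)
{
    const int dx = mvx & 7;
    const int dy = mvy & 7;
    /* The four weights always sum to 64. */
    const int cA = (8 - dx) * (8 - dy);
    const int cB = dx * (8 - dy);
    const int cC = (8 - dx) * dy;
    const int cD = dx * dy;

    assert(height % 2 == 0);            /* required by the 2x unrolling */
    assert(mvx >= 0 && mvy >= 0);       /* keeps the shifts below portable */

    src += (mvy >> 3) * src_stride + (mvx >> 3);

    /* Two output rows per iteration; each iteration reads three source
     * rows and three source columns. */
    for (int y = 0; y < height; y += 2) {
        const uint8_t *r0 = src;
        const uint8_t *r1 = src + src_stride;
        const uint8_t *r2 = src + 2 * src_stride;

        dst[0] = (uint8_t)((cA * r0[0] + cB * r0[1] + cC * r1[0] + cD * r1[1] + 32) >> 6);
        dst[1] = (uint8_t)((cA * r0[1] + cB * r0[2] + cC * r1[1] + cD * r1[2] + 32) >> 6);
        dst[dst_stride + 0] = (uint8_t)((cA * r1[0] + cB * r1[1] + cC * r2[0] + cD * r2[1] + 32) >> 6);
        dst[dst_stride + 1] = (uint8_t)((cA * r1[1] + cB * r1[2] + cC * r2[1] + cD * r2[2] + 32) >> 6);

        src += 2 * src_stride;
        dst += 2 * dst_stride;
    }
}
```

With only two output pixels per row there is very little work to spread across
a 16-byte vector, which is consistent with the vector path losing to scalar
code for this width.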
The patch passes checkasm and leads to a 2-3% performance gain overall
using the default settings. Please note that I have NOT done extensive
regression tests.
Comments and suggestions are welcome :)
Manuel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: faster_mc_chroma_altivec.diff
Type: application/octet-stream
Size: 10345 bytes
Desc: not available
Url : http://mailman.videolan.org/pipermail/x264-devel/attachments/20090202/bc0b8a30/attachment.obj