[x264-devel] [patch] Microoptimize assembler code (1/n)
Josef Zlomek
josef.zlomek at xeris.cz
Mon Apr 25 09:15:34 CEST 2005
Hi,
I have changed the x264_pixel_satd_* functions on AMD64 to precompute
the values 3*stride1 and 3*stride2 into registers
and to use these precomputed values in the LOAD_DIFF_INC_4x4 macro.
As a result, the base address no longer has to be increased twice
in the LOAD_DIFF_INC_4x4 macro, which removes several instructions and
dependencies between instructions.
Since the LOAD_DIFF_INC_4x4 sequence is used many times in hot functions,
the result is a (minor) speedup.
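
To illustrate the idea, here is a rough C analogy. It is not the patch itself: the real code is AMD64 assembly, and the function names, the use of a plain sum of absolute differences, and the exact place where the old code advanced the base pointer are all assumptions made for the sketch. The point is only that x86 addressing can scale an index register by 1, 2, 4 or 8 but not by 3, so reaching the fourth row from an unchanged base needs 3*stride in its own register.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: sum of absolute differences over one row of 4 pixels. */
static int row_diff4(const uint8_t *a, const uint8_t *b)
{
    int sum = 0;
    for (int x = 0; x < 4; x++)
        sum += abs(a[x] - b[x]);
    return sum;
}

/* "Before" (assumed shape): the base pointers are advanced in the middle of
 * the block, so the loads for rows 2 and 3 must wait for that update. */
static int diff_4x4_increment(const uint8_t *p1, int stride1,
                              const uint8_t *p2, int stride2)
{
    int sum = row_diff4(p1, p2);
    sum += row_diff4(p1 + stride1, p2 + stride2);
    p1 += 2 * stride1;              /* extra pointer update */
    p2 += 2 * stride2;
    sum += row_diff4(p1, p2);
    sum += row_diff4(p1 + stride1, p2 + stride2);
    return sum;
}

/* "After": the caller computes 3*stride once, and all four rows are
 * addressed from the same, unmodified base pointers. */
static int diff_4x4_precomputed(const uint8_t *p1, int stride1, int stride1x3,
                                const uint8_t *p2, int stride2, int stride2x3)
{
    int sum = row_diff4(p1, p2);
    sum += row_diff4(p1 + stride1, p2 + stride2);
    sum += row_diff4(p1 + 2 * stride1, p2 + 2 * stride2);
    sum += row_diff4(p1 + stride1x3, p2 + stride2x3);
    return sum;
}
```

A caller would compute stride1x3 = 3*stride1 and stride2x3 = 3*stride2 once per function call and reuse them for every 4x4 block, which is where the saving comes from when the sequence is repeated many times.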
The patch is located at http://zlomek.jikos.cz/x264/x264-stride.patch
Josef
--
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html