[vlc-devel] Phosphor timing - forgot to mention
Juha Jeronen
juha.jeronen at jyu.fi
Sat Mar 5 19:17:29 CET 2011
Hi again,
It says so in the code comments, but this was so curious that I thought
to mention it separately:
Even though the luma processing in DarkenField() is a trivial operation,
and both the C and MMX versions are vectorized, the MMX version is about
twice as fast.
Timing both, I got about 250 us per frame with MMX and 500 us without.
The only reason I can think of is that the MMX version preloads the shift
and bitmask values into registers, so that only the actual picture data
has to go over the memory bus (or even touch the L1 cache). This assumes
that the compiler doesn't automatically do a similar preload for the C
version.
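
For reference, here is a minimal sketch of the kind of shift-and-mask
darkening I mean, written as plain C that handles 8 luma bytes per
64-bit word. This is not the actual DarkenField() code: the name
darken_row() and its parameters are made up for illustration, and it
assumes 0 <= strength <= 7.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from VLC): divide every 8-bit luma sample in a
 * row by 2^strength. Assumes 0 <= strength <= 7. */
static void darken_row(uint8_t *row, size_t len, unsigned strength)
{
    /* Both constants are computed once, outside the loop, so the compiler
     * is free to keep them in registers. */
    const uint8_t  mask8  = (uint8_t)(0xFF >> strength);
    const uint64_t mask64 = mask8 * UINT64_C(0x0101010101010101);

    size_t i = 0;

    /* Main loop: 8 samples at a time. Shifting the whole word lets bits
     * from one byte leak into the top of its neighbor; the mask clears
     * exactly those leaked bits in every byte lane. */
    for (; i + 8 <= len; i += 8) {
        uint64_t v;
        memcpy(&v, row + i, sizeof v);   /* alignment-safe load */
        v = (v >> strength) & mask64;
        memcpy(row + i, &v, sizeof v);   /* alignment-safe store */
    }

    /* Tail: leftover samples, one byte at a time. */
    for (; i < len; i++)
        row[i] = (uint8_t)(row[i] >> strength);
}
```

Whether the compiler really keeps mask64 and the shift count in
registers for the whole loop depends on the compiler and flags, so
checking the generated assembly would be the way to confirm or rule out
my guess.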
-J