[vlc-devel] The "optimized" memory copies
Pierre d'Herbemont
pdherbemont at gmail.com
Wed Aug 19 22:21:59 CEST 2009
On Aug 19, 2009, at 9:01 PM, Rémi Denis-Courmont wrote:
> Hello,
>
> I've been benchmarking vlc_memset() and vlc_memcpy() against the built-in
> GCC-4.4 memset() and memcpy(). My system is swap-free (RAM size is
> preposterously large for my usage). Measurements were done on paged,
> page-aligned chunks of 256 megabytes each, against the thread time clock.
> CPU is a single Intel Pentium 4/HT.
>
> It turns out that vlc_memset() was about 2% slower than plain memset()...
> so I kinda wonder why we bother with implementing it.
> But then, vlc_memcpy() was an outrageous 35% slower than plain memcpy() on
> non-first pass, and 200% (three times, yes!) slower on the first pass.
> Sorry but WTF?
Rémi,
This is an interesting observation. I am also wondering what kind of
scenario vlc_memcpy() was optimized for. Can you provide your benchmark
sources, so that we can also run them on other systems?
AFAIK, vlc_memset() just jumps to memset(), which would explain the 2%:
it is the cost of the indirection.
Pierre.
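
For anyone who wants to try this before the real sources show up, here is a
minimal sketch of how such a measurement could be set up. It is not Rémi's
benchmark. It assumes a POSIX system with clock_gettime() and
posix_memalign(), and every name in it (wrapped_memset, thread_seconds,
alloc_page_aligned, time_copy) is made up for illustration. wrapped_memset()
only stands in for a wrapper that forwards through a function pointer; it is
not the actual vlc_memset() code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for an indirect wrapper: the call goes through
 * a function pointer, as a runtime-selected implementation would.
 * The pointer is volatile so the compiler cannot fold the call away. */
static void *(*volatile memset_impl)(void *, int, size_t) = memset;

static void *wrapped_memset(void *p, int c, size_t n)
{
    return memset_impl(p, c, n);
}

/* CPU time consumed by the calling thread, in seconds (-1.0 on error). */
static double thread_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Allocate a page-aligned buffer of n bytes; returns NULL on failure.
 * The caller releases it with free(). */
static void *alloc_page_aligned(size_t n)
{
    void *p = NULL;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || posix_memalign(&p, (size_t)page, n) != 0)
        return NULL;
    return p;
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Time a single copy of n bytes with the given memcpy-like function.
 * The caller should read dst afterwards so the copy is not dead code. */
static double time_copy(copy_fn fn, void *dst, const void *src, size_t n)
{
    double t0 = thread_seconds();
    fn(dst, src, n);
    return thread_seconds() - t0;
}
```

To approximate the setup described above, allocate two 256-megabyte buffers
with alloc_page_aligned(), fill both once with memset() so that they are
paged in, then call time_copy() several times with each copy function and
report the first pass separately from the later ones. The cost of the
indirection can be estimated the same way, by timing wrapped_memset()
against a direct memset() call.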