[vlc-devel] The "optimized" memory copies

Wed Aug 19 22:21:59 CEST 2009

On Aug 19, 2009, at 9:01 PM, Rémi Denis-Courmont wrote:

> 	Hello,
>
> I've been benchmarking vlc_memset() and vlc_memcpy() against the
> built-in GCC-4.4 memset() and memcpy(). My system is swap-free (RAM
> size is preposterously large for my usage). Measurements were done on
> paged page-aligned chunks of 256 megabytes each, against the thread
> time clock. CPU is a single Intel Pentium 4/HT.
>
> It turns out that vlc_memset() was about 2% slower than plain
> memset()... so I kinda wonder why we bother with implementing it.

> But then, vlc_memcpy() was an outrageous 35% slower than plain
> memcpy() on non-first passes, and 200% (three times, yes!) slower on
> the first pass. Sorry but WTF?

Rémi,

This is an interesting note. I am also wondering what kind of scenario
vlc_memcpy() was optimized for. Can you provide your benchmark sources,
so that we can also run them on other systems?
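
In the meantime, here is a minimal sketch of what such a benchmark
could look like, going only by the description quoted above. It is not
Rémi's code. The names (thread_seconds, alloc_page_aligned, bench_copy)
are made up for illustration, and touching both buffers before timing
is my reading of "paged", which may not match what was actually done.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Signature shared by memcpy() and any replacement under test. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* CPU time consumed by the calling thread, in seconds. */
static double thread_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Allocate `size` bytes aligned on a page boundary (hypothetical helper). */
static void *alloc_page_aligned(size_t size)
{
    void *p = NULL;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || posix_memalign(&p, (size_t)page, size) != 0)
        return NULL;
    return p;
}

/* Time `passes` copies of `size` bytes with `fn`; out[i] receives the
 * thread CPU time of pass i. Returns 0 on success, -1 if allocation fails. */
static int bench_copy(copy_fn fn, size_t size, int passes, double *out)
{
    void *src = alloc_page_aligned(size);
    void *dst = alloc_page_aligned(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }
    /* Assumption: touch every page first so page faults are not timed. */
    memset(src, 0xA5, size);
    memset(dst, 0, size);
    for (int i = 0; i < passes; i++) {
        double start = thread_seconds();
        fn(dst, src, size);
        out[i] = thread_seconds() - start;
        /* Reading the result back keeps the copy from being optimized out. */
        assert(memcmp(dst, src, size) == 0);
    }
    free(src);
    free(dst);
    return 0;
}
```

Calling bench_copy(memcpy, (size_t)256 << 20, 3, times) and then the
same with vlc_memcpy would give per-pass timings to compare, keeping
the first pass separate from the later ones.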

AFAIK, vlc_memset() just jumps to memset(), so the indirection
explains the 2%.
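
To illustrate what I mean by indirection, here is a rough sketch of a
wrapper that forwards through a function pointer. The names
(memset_impl, register_memset, wrapped_memset) are hypothetical and
this is not copied from the VLC tree. It only shows the general
pattern: every call pays for one extra indirect jump, and the compiler
can no longer inline or specialize the memset().

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memset_fn)(void *, int, size_t);

/* Hypothetical dispatch pointer: defaults to the libc memset(). */
static memset_fn memset_impl = memset;

/* Hypothetical hook letting an optimized module install its own version.
 * Passing NULL restores the libc default. */
void register_memset(memset_fn fn)
{
    memset_impl = (fn != NULL) ? fn : memset;
}

/* The wrapper callers use: one indirect call, then the real work. */
void *wrapped_memset(void *dst, int c, size_t n)
{
    return memset_impl(dst, c, n);
}
```

With no optimized version registered, wrapped_memset() does exactly
what memset() does, just one call further away.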

Pierre.