[vlc-devel] memcpy
Rafaël Carré
funman at videolan.org
Sat May 5 20:21:22 CEST 2012
Hello,
I benchmarked the memcpy() implementations we use on x86.
The attached file compares the libc memcpy() to our implementation in
modules/mmx/fastmemcpy.h (it includes fastmemcpy.h).
I tested an aligned case (source and destination are both aligned on
8 kB) and an unaligned case (source is at an 8 kB boundary + 1 byte and
destination at an 8 kB boundary + 2 bytes, so all transfers are unaligned).
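For readers without the attachment, here is a minimal sketch of how such a measurement could be set up. It is not the attached memcpy.c: the helper names (at_offset, time_copy, percent_faster), the iteration count and the formula behind "X% faster" are all assumptions made for illustration, and it relies on POSIX clock_gettime(), so it would need a different timer on Windows.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define ALIGNMENT 8192 /* 8 kB boundary, as in the description above */

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Return a pointer inside buf that lies `offset` bytes past an 8 kB boundary.
 * The caller must over-allocate buf by at least 2 * ALIGNMENT bytes. */
static unsigned char *at_offset(unsigned char *buf, size_t offset)
{
    assert(offset < ALIGNMENT);
    uintptr_t p = ((uintptr_t)buf + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1);
    return (unsigned char *)p + offset;
}

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Time `iters` copies of `size` bytes with the given misalignment.
 * Returns elapsed seconds, or -1.0 on allocation failure or a bad copy. */
static double time_copy(copy_fn fn, size_t size,
                        size_t src_off, size_t dst_off, int iters)
{
    unsigned char *sbuf = malloc(size + 2 * ALIGNMENT);
    unsigned char *dbuf = malloc(size + 2 * ALIGNMENT);
    if (sbuf == NULL || dbuf == NULL) {
        free(sbuf);
        free(dbuf);
        return -1.0;
    }
    unsigned char *src = at_offset(sbuf, src_off);
    unsigned char *dst = at_offset(dbuf, dst_off);
    memset(src, 0x5a, size);

    /* Calling through a volatile pointer keeps the compiler from
     * inlining or eliding the copies inside the timed loop. */
    copy_fn volatile f = fn;
    double t0 = now_sec();
    for (int i = 0; i < iters; i++)
        f(dst, src, size);
    double t1 = now_sec();

    int ok = memcmp(src, dst, size) == 0;
    free(sbuf);
    free(dbuf);
    return ok ? t1 - t0 : -1.0;
}

/* Assumed definition of "X% faster": how much longer the slow copy takes. */
static double percent_faster(double t_slow, double t_fast)
{
    return (t_slow / t_fast - 1.0) * 100.0;
}
```

Under these assumptions, the aligned case would be time_copy(fn, size, 0, 0, iters) and the unaligned case time_copy(fn, size, 1, 2, iters), run once with memcpy and once with fast_memcpy, for size 1920*1080*2 (4147200) and 3840.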
Linux:
block size 4147200
unaligned: libc 27.87% faster than vlc.
aligned: libc 21.22% faster than vlc.
block size 4147200
unaligned: libc 7.24% faster than vlc.
aligned: libc 9.08% faster than vlc.
block size 4147200
unaligned: libc 10.52% faster than vlc.
aligned: libc 9.12% faster than vlc.
block size 3840
unaligned: libc 280.86% faster than vlc.
aligned: libc 381.33% faster than vlc.
block size 3840
unaligned: libc 210.77% faster than vlc.
aligned: libc 403.61% faster than vlc.
block size 3840
unaligned: libc 337.14% faster than vlc.
aligned: libc 388.02% faster than vlc.
OSX:
block size 4147200
unaligned: libc 12.21% faster than vlc.
aligned: libc 7.07% faster than vlc.
block size 4147200
unaligned: libc 11.54% faster than vlc.
aligned: libc 6.34% faster than vlc.
block size 4147200
unaligned: libc 11.78% faster than vlc.
aligned: libc 6.41% faster than vlc.
block size 3840
unaligned: libc 380.73% faster than vlc.
aligned: libc 372.67% faster than vlc.
block size 3840
unaligned: libc 325.50% faster than vlc.
aligned: libc 397.35% faster than vlc.
block size 3840
unaligned: libc 355.08% faster than vlc.
aligned: libc 425.87% faster than vlc.
Win64:
For 1920*1080*2, libc memcpy() is between 5 and 10% slower than the vlc
implementation, and for smaller sizes like 1080*2, I couldn't get precise
enough measurements.
Therefore it would make sense to use our fast memcpy only on Windows, and
maybe import glibc's memcpy() since it seems to be better.
Since I don't have an old MMX-only CPU, perhaps it would make sense to
keep fast_memcpy for i386 (and Windows), and always use memcpy() on
x86_64?
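One way to read this proposal is as a compile-time switch. The sketch below is only an illustration: copy_impl and COPY_IMPL_IS_LIBC are hypothetical names, fast_memcpy_stub is a plain byte loop standing in for the real fast_memcpy() from fastmemcpy.h, and treating Win64 as "keep fast_memcpy" is my interpretation of "(and Windows)".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for fast_memcpy() from modules/mmx/fastmemcpy.h.
 * The real function uses MMX/SSE; this byte loop only keeps the sketch
 * self-contained. */
static void *fast_memcpy_stub(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Assumed policy: libc memcpy() on x86_64 except Windows,
 * the custom copy everywhere else (i386, Windows). */
#if defined(__x86_64__) && !defined(_WIN32)
# define copy_impl(d, s, n) memcpy((d), (s), (n))
# define COPY_IMPL_IS_LIBC 1
#else
# define copy_impl(d, s, n) fast_memcpy_stub((d), (s), (n))
# define COPY_IMPL_IS_LIBC 0
#endif

/* Sanity check: both the selected copy and the stub must copy correctly. */
static int copy_roundtrip_ok(void)
{
    const char src[] = "1920*1080*2";
    char a[sizeof src] = { 0 };
    char b[sizeof src] = { 0 };
    copy_impl(a, src, sizeof src);
    fast_memcpy_stub(b, src, sizeof src);
    return memcmp(a, src, sizeof src) == 0 && memcmp(b, src, sizeof src) == 0;
}
```

Keeping the choice behind a single macro means call sites would not change if later measurements on i386 or Windows suggested a different split.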
Specs:
Linux Ubuntu 12.04 x86_64 / Windows 7 x64
CPU: model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
OSX 10.8
CPU: core2duo 2.26GHz (macbook late 2009 model)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memcpy.c
Type: text/x-csrc
Size: 2292 bytes
Desc: not available
URL: <http://mailman.videolan.org/pipermail/vlc-devel/attachments/20120505/7917c2e7/attachment.c>