[vlc-devel] memcpy

Rafaël Carré funman at videolan.org
Sat May 5 20:21:22 CEST 2012


Hello,

I benchmarked the memcpy() functions we use on x86.

The attached file compares the normal libc memcpy() to our implementation
in modules/mmx/fastmemcpy.h (it includes fastmemcpy.h).

I ran an aligned test (source and dest are both aligned on 8kB) and an
unaligned test (source is at an 8kB boundary + 1 and dest at an 8kB
boundary + 2, so all the transfers are unaligned).
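For readers without the attachment, here is a rough sketch of how such an aligned/unaligned timing could be set up. It is not the attached memcpy.c: the names copy_fn, align_up and bench_copy are made up for illustration, and clock() is only an assumed timing source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by the copy routines under test. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

#define ALIGNMENT 8192 /* 8kB, as in the aligned test case */

/* Return the first address at or after p that is a multiple of ALIGNMENT. */
static unsigned char *align_up(unsigned char *p)
{
    uintptr_t a = (uintptr_t)p;
    return p + ((ALIGNMENT - a % ALIGNMENT) % ALIGNMENT);
}

/* Time `loops` copies of `size` bytes. Source and destination start
 * src_off and dst_off bytes past an 8kB boundary: 0/0 gives the aligned
 * case, 1/2 the unaligned one. Returns elapsed CPU seconds, or a
 * negative value if allocation fails. */
static double bench_copy(copy_fn fn, size_t size,
                         size_t src_off, size_t dst_off, unsigned loops)
{
    /* Over-allocate so an aligned start plus the offset always fits. */
    unsigned char *src_raw = malloc(size + ALIGNMENT + src_off);
    unsigned char *dst_raw = malloc(size + ALIGNMENT + dst_off);
    if (src_raw == NULL || dst_raw == NULL) {
        free(src_raw);
        free(dst_raw);
        return -1.0;
    }

    unsigned char *src = align_up(src_raw) + src_off;
    unsigned char *dst = align_up(dst_raw) + dst_off;
    memset(src, 0xA5, size);

    clock_t start = clock();
    for (unsigned i = 0; i < loops; i++)
        fn(dst, src, size);
    clock_t end = clock();

    /* Sanity check: the routine must actually have copied the data. */
    assert(memcmp(dst, src, size) == 0);

    free(src_raw);
    free(dst_raw);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling bench_copy(memcpy, 4147200, 0, 0, loops) and bench_copy(memcpy, 4147200, 1, 2, loops) would cover the two cases for libc; the same calls with the fastmemcpy.h routine give the other side of the comparison. A real run would also need enough loops for clock() to resolve the small block size, and some care that the compiler does not drop repeated copies.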

Linux:

	block size 4147200
unaligned: libc  27.87% faster than vlc.
  aligned: libc  21.22% faster than vlc.
	block size 4147200
unaligned: libc   7.24% faster than vlc.
  aligned: libc   9.08% faster than vlc.
	block size 4147200
unaligned: libc  10.52% faster than vlc.
  aligned: libc   9.12% faster than vlc.

	block size 3840
unaligned: libc 280.86% faster than vlc.
  aligned: libc 381.33% faster than vlc.
	block size 3840
unaligned: libc 210.77% faster than vlc.
  aligned: libc 403.61% faster than vlc.
	block size 3840
unaligned: libc 337.14% faster than vlc.
  aligned: libc 388.02% faster than vlc.


OSX:

	block size 4147200
unaligned: libc  12.21% faster than vlc.
  aligned: libc   7.07% faster than vlc.
	block size 4147200
unaligned: libc  11.54% faster than vlc.
  aligned: libc   6.34% faster than vlc.
	block size 4147200
unaligned: libc  11.78% faster than vlc.
  aligned: libc   6.41% faster than vlc.

	block size 3840
unaligned: libc 380.73% faster than vlc.
  aligned: libc 372.67% faster than vlc.
	block size 3840
unaligned: libc 325.50% faster than vlc.
  aligned: libc 397.35% faster than vlc.
	block size 3840
unaligned: libc 355.08% faster than vlc.
  aligned: libc 425.87% faster than vlc.


Win64:
For 1920*1080*2, memcpy is between 5 and 10% slower than vlc, and for
smaller sizes like 1080*2, I couldn't get precise enough measurements.


Therefore it would make sense to use our fast memcpy only on Windows, and
maybe import glibc's memcpy() there, since it seems to be better.

Since I don't have an old MMX-only CPU to test on, perhaps it would make
sense to keep fast_memcpy for i386 (and Windows), and always use memcpy()
on x86_64?
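As a rough illustration of that split, a compile-time selection could look like the sketch below. The names vlc_copy, VLC_COPY_IS_FAST and copy_block are hypothetical, and fast_memcpy is only a placeholder here that forwards to memcpy(); the real routine lives in modules/mmx/fastmemcpy.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder for the routine in modules/mmx/fastmemcpy.h. The real one
 * uses MMX; this stand-in just forwards to memcpy() so the sketch
 * compiles on its own. */
static inline void *fast_memcpy(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Keep fast_memcpy on Windows and 32-bit x86, use libc elsewhere. */
#if defined(_WIN32) || defined(__i386__) || defined(_M_IX86)
# define vlc_copy fast_memcpy   /* hypothetical name */
# define VLC_COPY_IS_FAST 1
#else
# define vlc_copy memcpy
# define VLC_COPY_IS_FAST 0
#endif

/* Copy one block through whichever routine was selected above. */
static inline void *copy_block(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return vlc_copy(dst, src, n);
}
```

With this, callers only see copy_block() and the choice between the two routines is made once, per target, at build time.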


Specs:
Linux Ubuntu 12.04 x86_64 / Windows 7 x64
CPU: Intel(R) Core(TM) i7 950 @ 3.07GHz

OSX 10.8
CPU: Core 2 Duo 2.26GHz (MacBook, late 2009 model)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memcpy.c
Type: text/x-csrc
Size: 2292 bytes
Desc: not available
URL: <http://mailman.videolan.org/pipermail/vlc-devel/attachments/20120505/7917c2e7/attachment.c>

