[vlc-devel] [PATCH 00/16] libdvbcsa: improve performance
glenvt18
glenvt18 at gmail.com
Fri Jun 26 13:19:54 CEST 2015
Hi folks.
I'm posting this here as there are no libdvbcsa or videolan-devel
mailing lists.
This patch series considerably (up to 3 times) improves libdvbcsa
performance on both x86 and ARM platforms. It also introduces NEON
support on ARM.
Here are some benchmarks.
x86_64 Intel Celeron 847 @ 1100 MHz:
uint32 81/185
uint64 96/276
sse2 129/348
ssse3 -/386
ffdecsa(sse2) 325
Best new vs. best old: 2.99x faster; 1.19x faster than ffdecsa
x86_64 Intel Atom D425 @ 1800 MHz:
uint32 76/134
uint64 109/183
sse2 117/228
ssse3 -/218
ffdecsa(sse2) 180
Best new vs. best old: 1.95x faster; 1.27x faster than ffdecsa
ARM Cortex-A7 @ 912 MHz:
uint32 32/48
uint64 29/33
neon -/84
Best new vs. best old: 2.63x faster
Notes:
line format: bs_word old/new [Mbit/s]
"old" means the latest svn snapshot.
Only decryption benchmarks are shown. Encryption figures are nearly
the same.
Tests were compiled with gcc 4.8.
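The summary ratio under each benchmark appears to be the best new figure divided by the best old figure for that platform (or by the ffdecsa figure), rounded to two decimals. A minimal sketch of that arithmetic follows; the helper names `speedup_x100` and `print_speedup` are made up for illustration and are not part of libdvbcsa:

```c
#include <stdio.h>

/* Hypothetical helper (not part of libdvbcsa): ratio new/old scaled by
 * 100 and rounded to nearest, so 299 means "2.99x". Inputs are
 * throughput figures in Mbit/s; old_mbps must be non-zero. */
unsigned speedup_x100(unsigned old_mbps, unsigned new_mbps)
{
    return (new_mbps * 100u + old_mbps / 2u) / old_mbps;
}

/* Hypothetical helper: print one summary line for a platform. */
void print_speedup(const char *label, unsigned old_mbps, unsigned new_mbps)
{
    unsigned r = speedup_x100(old_mbps, new_mbps);
    printf("%s: %u.%02ux faster\n", label, r / 100u, r % 100u);
}
```

For the Celeron 847, `speedup_x100(129, 386)` (old sse2 vs. new ssse3) gives 299, i.e. 2.99x, and `speedup_x100(325, 386)` gives 119 for the comparison with ffdecsa.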
I also tested 32-bit and 64-bit versions on powerpc (big-endian) with qemu.
The Altivec build is broken. I could fix it, but I have no ppc hardware to
make sure the performance won't suffer.
Please review.
glenvt18 (16):
  block cipher: improve performance
  stream cipher: refactoring
  stream cipher: optimizations
  Add ARM NEON support
  bitslice transform: rewrite
  neon: add matrix transpose macros
  block cipher: use one lookup table for sbox and permutation
  neon: add deinterleaving macro
  ssse3: add deinterleaving macro and SSSE3 option
  Add deinterleaving test case.
  stream cipher: use the same buffer for input and output
  Add matrix transpose ops tests
  Fix automake version check
  Fix C++ compilation
  Change compiler options
  Remove unused attribute

 bootstrap                          |   4 +-
 configure.ac                       |  26 ++-
 src/Makefile.am                    |  26 ++-
 src/dvbcsa_algo.c                  |   2 +-
 src/dvbcsa_bs.h                    |  12 +-
 src/dvbcsa_bs_algo.c               |   2 +-
 src/dvbcsa_bs_block.c              | 267 ++++++++++++++++++------
 src/dvbcsa_bs_neon.h               |  95 +++++++++
 src/dvbcsa_bs_sse.h                |  19 ++
 src/dvbcsa_bs_stream.c             | 410 +++----------------------------------
 src/dvbcsa_bs_stream_kernel.h      |  22 ++
 src/dvbcsa_bs_stream_kernel.inc    | 261 +++++++++++++++++++++++
 src/dvbcsa_bs_transpose.c          | 112 ----------
 src/dvbcsa_bs_transpose.h          |  71 +++++++
 src/dvbcsa_bs_transpose128.c       | 209 -------------------
 src/dvbcsa_bs_transpose32.c        | 185 -----------------
 src/dvbcsa_bs_transpose64.c        | 186 -----------------
 src/dvbcsa_bs_transpose_block.c    |  97 +++++++++
 src/dvbcsa_bs_transpose_stream.c   | 231 +++++++++++++++++++++
 src/dvbcsa_bs_transpose_stream32.c | 150 ++++++++++++++
 src/dvbcsa_pv.h                    |  78 ++++++-
 test/testbitslice.c                |   2 +-
 test/testbsops.c                   | 132 ++++++++++++
 test/testdec.c                     |   2 +-
 test/testenc.c                     |   2 +-
 25 files changed, 1421 insertions(+), 1182 deletions(-)
 create mode 100644 src/dvbcsa_bs_neon.h
 create mode 100644 src/dvbcsa_bs_stream_kernel.h
 create mode 100644 src/dvbcsa_bs_stream_kernel.inc
 delete mode 100644 src/dvbcsa_bs_transpose.c
 create mode 100644 src/dvbcsa_bs_transpose.h
 delete mode 100644 src/dvbcsa_bs_transpose128.c
 delete mode 100644 src/dvbcsa_bs_transpose32.c
 delete mode 100644 src/dvbcsa_bs_transpose64.c
 create mode 100644 src/dvbcsa_bs_transpose_block.c
 create mode 100644 src/dvbcsa_bs_transpose_stream.c
 create mode 100644 src/dvbcsa_bs_transpose_stream32.c
--
1.9.1