[vlc-devel] [PATCH] add ARM/NEON version of simple channel mixer

Måns Rullgård mans at mansr.com
Fri Oct 5 00:29:07 CEST 2012


Jean-Baptiste Kempf <jb at videolan.org> writes:

> On Thu, Oct 04, 2012 at 11:06:42PM +0300, Rémi Denis-Courmont wrote:
>> On Thursday 4 October 2012 22:56:08, Jean-Baptiste Kempf wrote:
>> > On Thu, Oct 04, 2012 at 10:46:44PM +0300, Rémi Denis-Courmont wrote:
>> > > This seems to lack any sort of unrolling, so the speed will be much worse
>> > > than it could be. If we just want lame optimizations, I would argue for
>> > > intrinsics rather than ASM.
>> > > 
>> > > No hard objections though.
>> > 
>> > Can we commit it with a warning about the missing unrolling, for now?
>> > Or is it not good enough?
>> 
>> I don't know. Is it significantly faster than GCC code?
>
> As far as I recall, when I benchmarked the functions the speedup was
> between 3x and 8x depending on the mode. But I may be misremembering.

Almost anything will be much faster than GCC-generated code on a
Cortex-A8, due to the lame scalar floating-point on that core.
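
To make the unrolling point in the quoted thread concrete, here is a minimal
scalar C sketch. It is not taken from the patch under discussion: the
stereo-to-mono downmix, the 0.5 gain, and both function names are assumptions
chosen purely for illustration. It shows a plain loop next to a variant
unrolled by four, which is the shape a NEON version would typically mirror by
handling several samples per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not from the patch: downmix interleaved stereo
 * float samples to mono, producing one output sample per iteration. */
static void downmix_2_to_1(float *dst, const float *src, size_t frames)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < frames; i++)
        dst[i] = 0.5f * (src[2 * i] + src[2 * i + 1]);
}

/* Same operation, manually unrolled by four. The main loop handles four
 * frames per iteration; the tail loop handles the remaining 0..3 frames. */
static void downmix_2_to_1_unrolled(float *dst, const float *src,
                                    size_t frames)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;

    for (; i + 4 <= frames; i += 4) {
        dst[i + 0] = 0.5f * (src[2 * i + 0] + src[2 * i + 1]);
        dst[i + 1] = 0.5f * (src[2 * i + 2] + src[2 * i + 3]);
        dst[i + 2] = 0.5f * (src[2 * i + 4] + src[2 * i + 5]);
        dst[i + 3] = 0.5f * (src[2 * i + 6] + src[2 * i + 7]);
    }

    /* Leftover frames when the count is not a multiple of four. */
    for (; i < frames; i++)
        dst[i] = 0.5f * (src[2 * i] + src[2 * i + 1]);
}
```

Both functions produce identical output. Whether the unrolled form is actually
faster depends on the compiler and the core, so it would need benchmarking on
the target, as the thread suggests.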

-- 
Måns Rullgård
mans at mansr.com



