[RFC] New audio output architecture

Christophe Massiot massiot at via.ecp.fr
Fri May 3 00:42:05 CEST 2002

At 22:46 +0200 2/05/2002, Gildas Bazin wrote:

>I don't agree with the "Mixing several input streams" stage.
>Two things bother me here:
>1) Doing this so early in the aout3 pipeline might save us some processing
>power but also removes a lot of flexibility because we lose precious
>information too early in the process.
>Take the example of different streams having different audio output levels
>(or gains). The user might want to adjust the gain of each of these streams
>with the help of an AOUT_FILTER (which could be a graphical mixer).

Yeah, that's right: we should allow filter plug-ins both before and after mixing.
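
To make that concrete, here is a minimal sketch (all names are mine, nothing from the actual aout3 code) of a pipeline with per-stream filter slots applied before the mixer and a second filter slot on the mixed stream:

```python
# Hypothetical sketch of an aout pipeline with filter slots both
# before and after the mixing stage.  All names are illustrative.

def run_pipeline(streams, pre_mix_filters, post_mix_filters):
    """streams: list of sample lists; each filter maps a list to a list."""
    # Per-stream filters (e.g. a per-stream gain from a graphical mixer).
    filtered = []
    for samples in streams:
        for f in pre_mix_filters:
            samples = f(samples)
        filtered.append(samples)
    # Naive mixing: sum the streams sample by sample.
    mixed = [sum(s) for s in zip(*filtered)]
    # Filters that only make sense on the mixed stream.
    for f in post_mix_filters:
        mixed = f(mixed)
    return mixed

def gain(factor):
    return lambda samples: [x * factor for x in samples]

out = run_pipeline([[1.0, 2.0], [3.0, 4.0]],
                   pre_mix_filters=[gain(0.5)],
                   post_mix_filters=[gain(2.0)])
print(out)  # [4.0, 6.0]
```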

>2) IMHO the "Mixing" and "Downmixing" stage should be merged into just one
>"Mixing" stage. After all, they are pretty close to each other, no?

I'll make the same objection you made at #1. One might want to adjust
the gain of individual channels _before_ downmixing to stereo output,
for instance when the rear channels in a movie are too loud.
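
A toy illustration of that case: attenuating the rear channels before a 5.1-to-stereo downmix. The 0.707 centre/surround coefficients are the conventional ITU-style values, an assumption on my part, not anything fixed by aout3:

```python
import math

# One 5.1 frame as a dict; channel names are illustrative.
frame = {"L": 0.2, "R": 0.2, "C": 0.1, "LFE": 0.0, "Ls": 0.8, "Rs": 0.8}

# Per-channel gain applied _before_ downmixing, e.g. to tame loud rears.
rear_gain = 0.5
frame["Ls"] *= rear_gain
frame["Rs"] *= rear_gain

# Conventional ITU-style stereo downmix (coefficients are an assumption;
# the LFE channel is conventionally dropped).
k = 1 / math.sqrt(2)  # ~0.707
left  = frame["L"] + k * frame["C"] + k * frame["Ls"]
right = frame["R"] + k * frame["C"] + k * frame["Rs"]
print(left, right)
```

Done the other way round (downmix first), the per-channel rear gain would be impossible to express, which is the whole point of keeping the stages separate.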

>We could maybe classify the audio channels in a stream into different types:

Well I think that's already the idea behind the so-called "5.1" format :-).

>And when we would have several input streams, we would also have several
>channels of the same type but belonging to different streams (eg. several
>CENTER channels).
>The job of the "Mixing" plugin would then be to mix channels of identical
>type together and then downmix the whole thing for the audio output plugin.
>That would give something like:
>-------------------------------------------------+------------------------
>Conversion from the adec format to float32       | AOUT_CONVERTER plug-in
>                                                 +------------------------
>Resampling [optional but highly recommended]     | AOUT_FILTER plug-in #1
>                                                 +------------------------
>Sound effects [optional]                         | AOUT_FILTER plug-in #2
>                                                 +------------------------
>Mixing of input streams and downmixing if the    | AOUT_FILTER plug-in #3
>output plug-in doesn't support as many channels  |
>[optional but highly recommended if you want to  |
>  hear all the channels from the streams]        |
>                                                 +------------------------
>Conversion from float32 to native output format  | AOUT_CONVERTER plug-in

I'd do resampling and mixing at the same time, for performance
reasons. And there is no need to do resampling before the sound
effects; it can be done afterwards.
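
For the resampling half, a toy sketch with plain linear interpolation (a real implementation would use a proper filter, and would fold the mixing sum into the same loop so each sample is touched only once):

```python
def resample_linear(samples, ratio):
    """Resample by `ratio` (out_rate / in_rate) with linear interpolation.
    i input samples -> roughly i * ratio output samples."""
    n_out = int(len(samples) * ratio)
    out = []
    for j in range(n_out):
        pos = j / ratio                 # position in the input signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1 - frac) + frac * nxt)
    return out

print(resample_linear([0.0, 1.0, 2.0, 3.0], 2.0))
# -> [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
```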

>As you might have noticed, the resampling stage has become a plugin. Even
>though it is crucial to implement resampling if you want to maintain the
>sound quality and synchronisation, I don't see any reason why it should be
>handled differently than an AOUT_FILTER. For me it has all the
>characteristics of an AOUT_FILTER. This also has the advantage of
>allowing different resampling implementations.

Well, we have plenty of operations which do not have exactly the
same properties:
- filter:     1 stream  -> 1 stream,  n channels -> n channels,
              i samples -> i samples
- resampling: 1 stream  -> 1 stream,  n channels -> n channels,
              i samples -> j samples (NOT THE SAME DURATION)
- mixing:     p streams -> 1 stream,  n channels -> n channels,
              i samples -> i samples
- downmixing: 1 stream  -> 1 stream,  n channels -> m channels,
              i samples -> i samples

I have no problem calling all these « filters », though we'll 
probably need different « module capabilities ». We just need to know 
what we're talking about.
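
The four shapes above could be captured as capability descriptions along these lines (a sketch; the names are mine, not proposed API):

```python
from dataclasses import dataclass

@dataclass
class Capability:
    """How an operation transforms (streams, channels, samples)."""
    name: str
    streams_in: str          # "1" or "p" input streams
    changes_channels: bool   # n channels -> m channels?
    changes_duration: bool   # i samples  -> j samples?

CAPABILITIES = [
    Capability("filter",     "1", changes_channels=False, changes_duration=False),
    Capability("resampling", "1", changes_channels=False, changes_duration=True),
    Capability("mixing",     "p", changes_channels=False, changes_duration=False),
    Capability("downmixing", "1", changes_channels=True,  changes_duration=False),
]

# Only resampling changes the duration of the signal, and only mixing
# takes several input streams -- the properties that force a distinct
# « module capability » for each kind of « filter ».
duration_changers = [c.name for c in CAPABILITIES if c.changes_duration]
print(duration_changers)  # ['resampling']
```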

>A last idea about this aout3 pipeline: the user should be allowed to change
>the order of the plugins. If you want to save some processing power, why
>not do the mixing stage right at the beginning, for instance?

Does « gstreamer » ring a bell? :+)

>I think we should support both float32 and fixed24 but not using #ifdefs.
>We should make the float32 converter used by default on CPUs with FPUs and
>the fixed24 converter used by default otherwise.
>That means we'll also have to program AOUT_FILTERs for both formats, but I
>think it's wise not to forget the embedded "market".
>One good point about having both the float32 and fixed24 AOUT_CONVERTERs at
>the same time is for example when a required AOUT_FILTER is only available
>for one of the formats (sometimes it's better to have a slow filter than
>nothing at all).

Let's be clear: embedded systems (such as the iPAQ) will never have
more than stereo, and do not decode A52 sound (for the simple reason
that all A52 decoders are float-based, and in this discussion
« embedded system » means « does not have an FPU »). So I do not think
we'd need any plug-in more elaborate than resampling and downmixing
(i.e. pretty much what we already have).

In my view, we do not « support » two native formats; we just
provide embedded users with a « light » audio output. Massive aout
filters such as equalizers or downmixing will always be done in
float32, and we'll never need them in the fixed-point world. So I
think we can avoid the extra burden of building run-time bridges
between the two worlds: they won't be used (and we already have a lot
of work :-).
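
For reference, converting float32 samples in [-1.0, 1.0] to a 24-bit fixed-point representation might look like this. The exact fixed24 layout is not settled anywhere in this thread; I assume a signed s.23 format with clipping:

```python
def float_to_fixed24(x):
    """Convert a float in [-1.0, 1.0] to signed 24-bit fixed point (s.23).
    Out-of-range values are clipped.  The s.23 layout is an assumption."""
    SCALE = 1 << 23                     # 8388608
    n = int(x * SCALE)
    return max(-SCALE, min(SCALE - 1, n))

print(float_to_fixed24(0.5))    # 4194304
print(float_to_fixed24(1.5))    # clipped to 8388607
print(float_to_fixed24(-1.0))   # -8388608
```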

As a side note, maybe what we'll be writing could eventually become a library.

Christophe Massiot.

This is the vlc-devel mailing-list, see http://www.videolan.org/vlc/
To unsubscribe, please read http://www.videolan.org/lists.html
If you are in trouble, please contact <postmaster at videolan.org>
