[RFC] New audio output architecture

Gildas Bazin gbazin at netcourrier.com
Thu May 2 22:46:09 CEST 2002

On Wednesday 01 May 2002 23:44, Christophe Massiot wrote:
> In this document I would like to propose ideas for a new audio output 
> architecture, code-named aout3.

Yeah... I'm a big fan of your proposal :)
This idea was also floating in my head, but I must say you put much more 
thinking into it than I did ;-) This document is really a nice piece of 
work.

> Since all internal calculations will be done on float32, we need a 
> bunch of converters to and from float32. The flow of operations on 
> data is as follows :
> -------------------------------------------------+------------------------
>   Conversion from the adec format to float32     | AOUT_CONVERTER plug-in
>                                                  +------------------------
>   Mixing several input streams & resampling      | main audio mixer module
>                                                  +------------------------
>   Sound effects [optional]                       | AOUT_FILTER plug-in
>                                                  +------------------------
>   Downmixing if the output plug-in doesn't       | AOUT_FILTER plug-in #2
>   support as many channels                       |
>                                                  +------------------------
>   Conversion from float32 to native output       | AOUT_CONVERTER plug-in
>   format                                         |
> -------------------------------------------------+------------------------

I'm a big fan of the AOUT_CONVERTER and AOUT_FILTER plugins, it reminds me 
of the chroma and filter plugins from the video output :)
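
Just to make the converter idea concrete, here is what the core of an 
s16 -> float32 AOUT_CONVERTER could look like (the function name and 
signature are mine, not an actual aout3 entry point, just a sketch):

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of an AOUT_CONVERTER core: signed 16-bit PCM samples are
 * scaled into the [-1.0, 1.0] float32 range the rest of the pipeline
 * works on. Illustrative only, not an actual aout3 interface. */
static void s16_to_float32( const int16_t *p_in, float *p_out,
                            size_t i_samples )
{
    for( size_t i = 0; i < i_samples; i++ )
        p_out[i] = (float)p_in[i] / 32768.f;
}
```

The float32 -> native converter at the other end of the pipeline would 
just be the mirror image of this.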

I've got a few objections though ;-)

I don't agree with the "Mixing several input streams" stage.
Two things bother me here:

1) Doing this so early in the aout3 pipeline might save us some processing 
power, but it also removes a lot of flexibility because we lose precious 
information too early in the process.
Take the example of different streams having different audio output levels 
(or gains). The user might want to adjust the gain of each of these streams 
with the help of an AOUT_FILTER (which could be a graphical mixer).
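
Such a per-stream gain filter would be trivial, precisely because the 
streams haven't been merged yet. Something like this (names are mine, 
f_gain would come from the graphical mixer, just a sketch):

```c
#include <stddef.h>

/* Sketch of a per-stream gain AOUT_FILTER: as long as the streams are
 * still separate, each one can be scaled independently in float32.
 * Illustrative only, not an actual aout3 interface. */
static void apply_gain( float *p_samples, size_t i_samples, float f_gain )
{
    for( size_t i = 0; i < i_samples; i++ )
        p_samples[i] *= f_gain;
}
```

Once the streams are mixed together, this per-stream information is gone 
for good — which is exactly the flexibility we'd be throwing away.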

2) IMHO the "Mixing" and "Downmixing" stages should be merged into just one 
"Mixing" stage. After all, they are pretty close to each other, no?
We could maybe classify the audio channels in a stream into different types 
(eg. LEFT, RIGHT, CENTER, REARLEFT, REARRIGHT, LFE).

And with several input streams, we would also have several channels of the 
same type belonging to different streams (eg. several CENTER channels).
The job of the "Mixing" plugin would then be to mix channels of identical 
type together and then downmix the whole thing for the audio output plugin.
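
For the CENTER channels, for instance, the merged stage could look roughly 
like this (the 0.7071 factor is the usual -3 dB center-to-stereo downmix 
coefficient; names and interface are mine, just a sketch):

```c
#include <stddef.h>

/* Sketch of the merged "Mixing" stage: first sum the channels of
 * identical type across streams (here, the CENTER channels of each
 * stream), then fold the result into the stereo output the plug-in
 * supports. Illustrative only, not an actual aout3 interface. */
static void mix_and_downmix_center( const float **pp_center,
                                    size_t i_streams, size_t i_samples,
                                    float *p_left, float *p_right )
{
    for( size_t i = 0; i < i_samples; i++ )
    {
        float f_sum = 0.f;
        for( size_t j = 0; j < i_streams; j++ )
            f_sum += pp_center[j][i];       /* mix identical types */
        p_left[i]  += 0.7071f * f_sum;      /* then downmix to stereo */
        p_right[i] += 0.7071f * f_sum;
    }
}
```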

That would give something like:

  Conversion from the adec format to float32       | AOUT_CONVERTER plug-in
  Resampling [optional but highly recommended]     | AOUT_FILTER plug-in #1
  Sound effects [optional]                         | AOUT_FILTER plug-in #2
  Mixing of input streams and Downmixing if the    | AOUT_FILTER plug-in #3
  output plug-in doesn't support as many channels  |
  [optional but highly recommended if you want to  |
   hear all the channels from the streams]         |
  Conversion from float32 to native output format  | AOUT_CONVERTER plug-in

As you might have noticed, the resampling stage has become a plugin. Even 
though it is crucial to implement resampling if you want to maintain the 
sound quality and synchronisation, I don't see any reason why it should be 
handled differently than an AOUT_FILTER. For me it has all the 
characteristics of an AOUT_FILTER. This also has the advantage of allowing 
different resampling implementations.
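
And indeed a resampler has exactly the AOUT_FILTER shape: float32 in, 
float32 out. A naive linear interpolator would do for a first 
implementation (a real module would of course use a proper filter; names 
are mine and it assumes i_out > 1, just a sketch of the interface):

```c
#include <stddef.h>

/* Sketch of resampling as an AOUT_FILTER: converts i_in float32
 * samples at one rate into i_out samples at another via linear
 * interpolation. Assumes i_in > 1 and i_out > 1. Illustrative only,
 * not an actual aout3 interface. */
static void resample_linear( const float *p_in, size_t i_in,
                             float *p_out, size_t i_out )
{
    for( size_t i = 0; i < i_out; i++ )
    {
        float f_pos = (float)i * (i_in - 1) / (i_out - 1);
        size_t i_idx = (size_t)f_pos;
        float f_frac = f_pos - (float)i_idx;
        float f_next = ( i_idx + 1 < i_in ) ? p_in[i_idx + 1]
                                            : p_in[i_idx];
        p_out[i] = p_in[i_idx] * (1.f - f_frac) + f_next * f_frac;
    }
}
```

Swapping this out for a higher-quality band-limited resampler would then 
be a simple matter of loading a different plugin.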

A last idea about this aout3 pipeline: the user should be allowed to change 
the order of the plugins. If you want to save some processing power, why 
not do the mixing stage right at the beginning, for instance?

> Notes
> [*] I'm not completely sure about that. The consequences of float32 
> on embedded systems without hardware FPU support need to be 
> evaluated. For these systems, fixed24 (the native format of libmad) 
> may be more clever. Perhaps it would be a good idea to have a version 
> of the audio mixer using fixed24 for embedded systems. Caution, I'm 
> not saying that we should support two native formats at the same 
> time, I'm just saying that there could be a #define AOUT_FORMAT 
> fixed24 for some architectures. This implies adapting the sound 
> effects and downmixing modules too, but embedded systems probably do 
> not need such complicated things...

I think we should support both float32 and fixed24, but not using #ifdefs. 
The float32 converter would be the default on CPUs with an FPU, and the 
fixed24 converter the default otherwise.
That means we'll also have to program AOUT_FILTERs for both formats, but I 
think it's wise not to forget the embedded "market".
One good point about having both the float32 and fixed24 AOUT_CONVERTERs 
available at the same time is, for example, when a required AOUT_FILTER 
only exists for one of the formats (sometimes it's better to have a slow 
filter than nothing at all).
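
To illustrate what the fixed24 path buys us: with samples stored as 
32-bit integers carrying 24 fractional bits, a gain multiply becomes an 
integer multiply plus a shift instead of an FPU operation. The typedef 
and macro below are my own sketch of the idea, not libmad's exact 
representation:

```c
#include <stdint.h>

/* Sketch of fixed24 arithmetic: Q8.24 samples in an int32_t, so 1.0 is
 * 1 << 24. The multiply widens to 64 bits to keep the intermediate
 * precision, then shifts back down. Illustrative only. */
typedef int32_t fixed24_t;
#define FIXED24_ONE ( 1 << 24 )

static fixed24_t fixed24_mul( fixed24_t a, fixed24_t b )
{
    return (fixed24_t)( ( (int64_t)a * b ) >> 24 );
}
```

An AOUT_FILTER written for this format would use fixed24_mul wherever the 
float32 version uses a plain `*`, which is why both filter flavours have 
to exist side by side.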



This is the vlc-devel mailing-list, see http://www.videolan.org/vlc/
To unsubscribe, please read http://www.videolan.org/lists.html
If you are in trouble, please contact <postmaster at videolan.org>
