[vlc-devel] [PATCH] Add Scaletempo audio filter

Wed Jun 18 23:26:54 CEST 2008

Hi,

On Mon, Jun 16, 2008, Rov Juvano wrote:
> Scaletempo maintains the audio pitch when playback rate != 1.0 (i.e.
> no chipmunk effect).  This fixes the pitch scaling caused by using the
> resampler to handle playback rate.

> Ported from GStreamer.  Inspired by SoundTouch library by Olli Parviainen.

> +++ b/modules/audio_filter/scaletempo.c
> @@ -0,0 +1,567 @@
> +/*****************************************************************************
> + * scaletempo.c: Scale audio tempo while maintaining pitch
> + *****************************************************************************
> + * Copyright Â© 2008 the VideoLAN team
> + * $Id$
> + *
> + * Authors: Rov Juvano <rovjuvano at users.sourceforge.net>
 As asked by Remi, are you the only author of all the code of scaletempo.c ? (as
you have said  it is "Ported from GStreamer"). If not you should add back the original
authors.

> +static int StrideCallback( vlc_object_t *p_this, char const *psz_cmd,
> +                           vlc_value_t oldval, vlc_value_t newval, void *p_data )
> +{
> +    VLC_UNUSED(p_this); VLC_UNUSED(psz_cmd); VLC_UNUSED(oldval);
> +    aout_filter_sys_t *p = (aout_filter_sys_t *)p_data;
> +    uint new_value = newval.i_int;
> +    if( p->ms_stride != new_value ) {
> +        p->ms_stride = new_value;
> +        p->reinit_buffers = true;
> +    }
> +    return VLC_SUCCESS;
> +}
> +
> +static int OverlapCallback( vlc_object_t *p_this, char const *psz_cmd,
> +                            vlc_value_t oldval, vlc_value_t newval, void *p_data )
> +{
> +    VLC_UNUSED(p_this); VLC_UNUSED(psz_cmd); VLC_UNUSED(oldval);
> +    aout_filter_sys_t *p = (aout_filter_sys_t *)p_data;
> +    float new_value = newval.f_float;
> +    if( p->percent_overlap != new_value ) {
> +        p->percent_overlap = new_value;
> +        p->reinit_buffers = true;
> +    }
> +    return VLC_SUCCESS;
> +}
> +
> +static int SearchCallback( vlc_object_t *p_this, char const *psz_cmd,
> +                           vlc_value_t oldval, vlc_value_t newval, void *p_data )
> +{
> +    VLC_UNUSED(p_this); VLC_UNUSED(psz_cmd); VLC_UNUSED(oldval);
> +    aout_filter_sys_t *p = (aout_filter_sys_t *)p_data;
> +    uint new_value = newval.i_int;
> +    if( p->ms_search != new_value ) {
> +        p->ms_search = new_value;
> +        p->reinit_buffers = true;
> +    }
> +    return VLC_SUCCESS;
> +}
 Those callbacks are not thread safe. They can be called while you are in
DoWork(). You need to add a proper locking scheme. Reading your code, it
seems you just need to lock inside these callbacks and around
 if( p->reinit_buffers ) { reinit_buffers() }
in DoWork.

 You probably want to check thoses value against sane bounds (here or in
reinit_buffers). What happens if ms_stride is 0 for example ?

> +    if( p_filter->input.i_format != VLC_FOURCC('f','l','3','2') )
> +    {
> +        b_fit = false;
> +        p_filter->input.i_format = p_filter->output.i_format = VLC_FOURCC('f','l','3','2');
> +        msg_Warn( p_filter, "bad input or output format" );
> +    }
> +    if( ! AOUT_FMTS_SIMILAR( &p_filter->input, &p_filter->output ) )
> +    {
> +        b_fit = false;
> +        memcpy( &p_filter->output, &p_filter->input, sizeof(audio_sample_format_t) );
> +        msg_Warn( p_filter, "input and output formats are not similar" );
> +    }
> +
> +    if( ! b_fit )
> +    {
> +        return VLC_EGENERIC;
> +    }
 I am not sure you can change p_filter->input/p_filter->output if you
return VLC_EGENERIC.

> +    p_sys = p_filter->p_sys = malloc( sizeof(aout_filter_sys_t) );
> +    if( !p_sys )
> +    {
> +        msg_Err( p_this, "Out of memory" );
> +        return VLC_ENOMEM;
> +    }
 No need to print a message when out of memory. (msg_Err needs memory too ;)

> +    aout_instance_t *p_aout = (aout_instance_t *)p_filter->p_parent;
> +    p_sys->ms_stride       = var_CreateGetInteger( p_aout, "scaletempo-stride" );
> +    p_sys->percent_overlap = var_CreateGetFloat( p_aout,   "scaletempo-overlap" );
> +    p_sys->ms_search       = var_CreateGetInteger( p_aout, "scaletempo-search" );
 I find it weird to create those variables in aout but I don't know if it is
the current policy.

> +    size_t i_outsize = calculate_output_buffer_size ( p_filter, p_in_buf->i_nb_bytes );
> +    if( i_outsize > p_out_buf->i_size ) {
> +        p->p_buffers[ p->i_buf ] = realloc( p->p_buffers[ p->i_buf ], i_outsize );
> +        if( ! p->p_buffers[ p->i_buf ] )
> +        {
> +            msg_Err( VLC_OBJECT(p_filter), "Out of memory" );
> +            return;
> +        }
 No need for "out of mem" message.
 You also have a memory leak when realloc fails (the old buffer is not
released).

> diff --git a/src/audio_output/input.c b/src/audio_output/input.c
> index 184e8f7..fa62463 100644
> --- a/src/audio_output/input.c
> +++ b/src/audio_output/input.c
> @@ -401,6 +401,18 @@ int aout_InputNew( aout_instance_t * p_aout, aout_input_t * p_input )
>      }
>      p_input->i_resampling_type = AOUT_RESAMPLING_NONE;
>  
> +    p_input->p_playback_rate_filter = NULL;
> +    for (int i=0; i < p_input->i_nb_filters; i++) {
> +        aout_filter_t * p_filter = p_input->pp_filters[i];
> +        if (strcmp ("scaletempo", p_filter->psz_object_name) == 0) {
> +          p_input->p_playback_rate_filter = p_filter;
> +          break;
> +        }
> +    }
> +    if (!p_input->p_playback_rate_filter && p_input->i_nb_resamplers > 0) {
> +        p_input->p_playback_rate_filter = p_input->pp_resamplers[0];
> +    }
 Inside src/audio_output I would prefer you to keep current VLC code format
style.
  I think it would be more userfriendly if scaletempo was selected by
default (instead of specifying by audio-filter). The code loading filters
(audio-filter) could be made a function and then reused. Of course, an option
should be added to disable this behavior.

> -        if( p_input->pp_resamplers[0]->input.i_rate == 1000 * p_input->input.i_rate / i_input_rate )
> +        uint i_nominal_rate =
 Use unsigned int instead.

> +/*
> + * Scaletempo works by producing audio in constant sized chunks (a "stride") but
> + * consuming chunks proportional to the playback rate.
> + *
> + * Scaletempo then smooths the output by blending the end of one stride with
> + * the next ("overlap").
> + *
> + * Scaletempo smooths the overlap further by searching within the input buffer
> + * for the best overlap position.  Scaletempo uses a statistical cross correlation
> + * (roughly a dot-product).  Scaletempo consumes most of its CPU cycles here.
> + *
> + * NOTE:
> + * sample: a single audio sample for one channel
> + * frame: a single set of samples, one for each channel
> + * VLC uses these terms differently
> + */
  About the algo, I did find the quality a bit low (especially at high speed).
I have played with audio time-stretching before and I had a better quality if
I did not use "constant sized chunks".
  For that I estimate the period using autocorellation and then I duplicated or
overlapped the audio by chunk equal to the local period. The duplication or
overlapping amount was controled by the scale factor I wanted. It is a bit
more complicated though.

 Anyway the quality should not stop the integration of scaletempo, it is a
nice feature :)

Thank you for your work.

Regards,

-- 
fenrir