[vlc-devel] [PATCH] Add Scaletempo audio filter

Thu Jun 19 14:09:19 CEST 2008

On Wed, Jun 18, 2008 at 11:26:54PM +0200, Laurent Aimar wrote:
>  As asked by Remi, are you the only author of all the code of scaletempo.c ? (as
> you have said  it is "Ported from GStreamer"). If not you should add back the original
> authors.

Sole author.  As well as for the MPlayer version under a
different name, which is stuck in svn due to a code freeze
just before submitting the patch.  Darn my timing.

> 
> > +static int StrideCallback( vlc_object_t *p_this, char const *psz_cmd,
> > +static int OverlapCallback( vlc_object_t *p_this, char const *psz_cmd,
> > +static int SearchCallback( vlc_object_t *p_this, char const *psz_cmd,
> > +                           vlc_value_t oldval, vlc_value_t newval, void *p_data )
> > +{
> > +    VLC_UNUSED(p_this); VLC_UNUSED(psz_cmd); VLC_UNUSED(oldval);
> > +    aout_filter_sys_t *p = (aout_filter_sys_t *)p_data;
> > +    uint new_value = newval.i_int;
> > +    if( p->ms_search != new_value ) {
> > +        p->ms_search = new_value;
> > +        p->reinit_buffers = true;
> > +    }
> > +    return VLC_SUCCESS;
> > +}
>  Those callbacks are not thread safe. They can be called while you are in
> DoWork(). You need to add a proper locking scheme. Reading your code, it
> seems you just need to lock inside these callbacks and around
>  if( p->reinit_buffers ) { reinit_buffers() }
> in DoWork.

Because these values are independent of each other, are used
only within reinit_buffers, and then used only once, the worse
case scenario would be a delay for them taking effect, right?

> 
>  You probably want to check thoses value against sane bounds (here or in
> reinit_buffers). What happens if ms_stride is 0 for example ?

I thought they wouldn't be called if the values were out
bounds.  Are the min/max of add_*_with_range just informative?

> 
> > +    if( p_filter->input.i_format != VLC_FOURCC('f','l','3','2') )
> > +    {
> > +        b_fit = false;
> > +        p_filter->input.i_format = p_filter->output.i_format = VLC_FOURCC('f','l','3','2');
> > +        msg_Warn( p_filter, "bad input or output format" );
> > +    }
> > +    if( ! AOUT_FMTS_SIMILAR( &p_filter->input, &p_filter->output ) )
> > +    {
> > +        b_fit = false;
> > +        memcpy( &p_filter->output, &p_filter->input, sizeof(audio_sample_format_t) );
> > +        msg_Warn( p_filter, "input and output formats are not similar" );
> > +    }
> > +
> > +    if( ! b_fit )
> > +    {
> > +        return VLC_EGENERIC;
> > +    }
>  I am not sure you can change p_filter->input/p_filter->output if you
> return VLC_EGENERIC.

The input and output formats should be the same.  Did I do
something to change them?

> 
> > +    p_sys = p_filter->p_sys = malloc( sizeof(aout_filter_sys_t) );
> > +    if( !p_sys )
> > +    {
> > +        msg_Err( p_this, "Out of memory" );
> > +        return VLC_ENOMEM;
> > +    }
>  No need to print a message when out of memory. (msg_Err needs memory too ;)

Duh.  I saw this other places but should have known better.

> 
> > +    aout_instance_t *p_aout = (aout_instance_t *)p_filter->p_parent;
> > +    p_sys->ms_stride       = var_CreateGetInteger( p_aout, "scaletempo-stride" );
> > +    p_sys->percent_overlap = var_CreateGetFloat( p_aout,   "scaletempo-overlap" );
> > +    p_sys->ms_search       = var_CreateGetInteger( p_aout, "scaletempo-search" );
>  I find it weird to create those variables in aout but I don't know if it is
> the current policy.

I saw other filters do it.  This allows dynamically changing
the parameters.  It's unnecessary, so I can change them to
config_Get...

> 
> > +        p->p_buffers[ p->i_buf ] = realloc( p->p_buffers[ p->i_buf ], i_outsize );
>  You also have a memory leak when realloc fails (the old buffer is not
> released).

Good catch.  I did this a number of times.

> 
> > diff --git a/src/audio_output/input.c b/src/audio_output/input.c
> > index 184e8f7..fa62463 100644
> > --- a/src/audio_output/input.c
> > +++ b/src/audio_output/input.c
> > @@ -401,6 +401,18 @@ int aout_InputNew( aout_instance_t * p_aout, aout_input_t * p_input )
> >      }
> >      p_input->i_resampling_type = AOUT_RESAMPLING_NONE;
> >  
> > +    p_input->p_playback_rate_filter = NULL;
> > +    for (int i=0; i < p_input->i_nb_filters; i++) {
> > +        aout_filter_t * p_filter = p_input->pp_filters[i];
> > +        if (strcmp ("scaletempo", p_filter->psz_object_name) == 0) {
> > +          p_input->p_playback_rate_filter = p_filter;
> > +          break;
> > +        }
> > +    }
> > +    if (!p_input->p_playback_rate_filter && p_input->i_nb_resamplers > 0) {
> > +        p_input->p_playback_rate_filter = p_input->pp_resamplers[0];
> > +    }
>  Inside src/audio_output I would prefer you to keep current VLC code format
> style.

I meant to do that.

>   I think it would be more userfriendly if scaletempo was selected by
> default (instead of specifying by audio-filter). The code loading filters
> (audio-filter) could be made a function and then reused. Of course, an option
> should be added to disable this behavior.

If there's consensus then, yeah.  I don't know how to make
it the default.

> 
> > -        if( p_input->pp_resamplers[0]->input.i_rate == 1000 * p_input->input.i_rate / i_input_rate )
> > +        uint i_nominal_rate =
>  Use unsigned int instead.

Instead of what?  Isn't it an unsigned int?

> 
> > +/*
> > + * Scaletempo works by producing audio in constant sized chunks (a "stride") but
> > + * consuming chunks proportional to the playback rate.
> > + *
> > + * Scaletempo then smooths the output by blending the end of one stride with
> > + * the next ("overlap").
> > + *
> > + * Scaletempo smooths the overlap further by searching within the input buffer
> > + * for the best overlap position.  Scaletempo uses a statistical cross correlation
> > + * (roughly a dot-product).  Scaletempo consumes most of its CPU cycles here.
> > + *
> > + * NOTE:
> > + * sample: a single audio sample for one channel
> > + * frame: a single set of samples, one for each channel
> > + * VLC uses these terms differently
> > + */
>   About the algo, I did find the quality a bit low (especially at high speed).

I don't doubt that, if you have good ears.

I find the VLC version produces the worse sound because
aout_InputPlay doesn't seem to like filters buffering/skipping,
which can get relatively large at high speeds.  msg_Dbg
reports a lot of dropped buffers and drift correction.  That's
not something I can fix.  Maybe there is some way to let
aout_InputPlay know about the buffering, but I didn't figure
it out.

You can see if it sounds better with one of the versions at
http://scaletempo.sourceforge.net/

> I have played with audio time-stretching before and I had a better quality if
> I did not use "constant sized chunks".
>   For that I estimate the period using autocorellation and then I duplicated or
> overlapped the audio by chunk equal to the local period. The duplication or
> overlapping amount was controled by the scale factor I wanted. It is a bit
> more complicated though.

I'd like to see more, code would be nice.

> 
>  Anyway the quality should not stop the integration of scaletempo, it is a
> nice feature :)

Scratching an itch.

> 
> Thank you for your work.

Right back at you.  You laid the groundwork.

--
rovjuvano