[vlc-devel] Need help developing text-to-speech-module in VLC - offering £5.000

Thu Apr 7 15:28:12 CEST 2011

Hi and thanks for all your replies!

> Cost is not an issue. However GNU GPL compatibility could be an issue. To
> "talk" to a closed-source or other "GPL-incompatible" speech engine, some
> extra care must be taken to cleanly delineate the VLC open-source software
> parts from the proprietary parts.
>
>
> I am not aware of a standard programming interface for speech synthesis.
> If there is none, then a separate "glue" plug-in needs to be written for
> each and every engine that is to be supported. So the more engines, the
> more work.

> I guess an audio_filter module would be written against Festival APIs,
> or something of the like and depending of the presence (or not) of the
> right dll|so|dylib libraries, the module would be possible to load or
> not.
>
>
OK, so it sounds to me like the best approach is to have the module
written against one free speech synthesis engine that supports Swedish, at
least to begin with. I guess support for other specific engines could be
added later. I looked more closely at Festival and can't seem to find
any proof that it supports Swedish. Does anyone know this for sure, or
know of an engine that does? I took a look at MARY, which Adam suggested,
and it looks like a good project. However, developing a Swedish voice isn't
within the scope of our project, so we need to find an engine that already
supports Swedish (although I must admit there was a really great guide on
how to develop a voice, which is something worth trying out).

> Balancing or remixing channels is feasible, except that VLC has no user
> interface concept for this at the moment.
>
> However, sound effects and voices are on the same channels in the original
> medium. So I meant it's difficult to remove the original voices while
> keeping the original sound effects.

I see what you mean here. I think the best option would be to filter the
original sound, voices and sound effects alike, so that it doesn't get in
the way during dialogue. During parts of the movie where nobody speaks, you
would still hear background noise, music and sound effects. During the
development/evaluation stages, users could give feedback on different
settings, even if the developer has to actually apply them. Thanks for
all the tips and features you have suggested in this regard.
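
As a rough illustration of the idea above, here is a small Python sketch of
that kind of volume "ducking". It is not VLC code, only a sketch under my own
assumptions: the function names, the list of (start, end) dialogue times
(imagined to come from the subtitle timings), and the duck/fade values are all
hypothetical.

```python
def duck_gain(t, cues, duck=0.2, fade=0.25):
    """Gain for the original audio at time t (seconds).

    cues: list of (start, end) dialogue intervals in seconds (hypothetical,
          e.g. taken from subtitle timings).
    duck: gain applied while someone is speaking.
    fade: seconds used to ramp between full volume and the ducked level.
    """
    gain = 1.0
    for start, end in cues:
        if start <= t <= end:
            return duck  # inside a dialogue interval: fully ducked
        # distance in seconds to the nearest edge of this interval
        dist = start - t if t < start else t - end
        if dist < fade:
            # linear ramp: duck at the edge, 1.0 at `fade` seconds away
            gain = min(gain, duck + (1.0 - duck) * dist / fade)
    return gain


def apply_ducking(samples, rate, cues, duck=0.2, fade=0.25):
    """Scale each sample of the original track by the ducking gain."""
    return [s * duck_gain(i / rate, cues, duck, fade)
            for i, s in enumerate(samples)]
```

The duck and fade values are exactly the kind of settings that users could
comment on during evaluation, with the developer adjusting them afterwards.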

> As long as OCR is not involved, it seems reasonable.
>

That's great. Do you have an estimate of the time that would need to be put
into it? A month of programming? More? Less?

The reason I ask on this list is that we'd like to incorporate it directly
into VLC. My assumption was that it would be easier to get the module
accepted and used if it was developed by one of you guys. But maybe I am
wrong? Would there be a difference if we let any regular programmer do
this?