[vlc-devel] Need help developing text-to-speech-module in VLC - offering £5.000

Sandra Derbring sandra.derbring at gmail.com
Thu Apr 7 11:59:14 CEST 2011

Hi Rémi,

Thank you so much for replying. I'll try to address your points as best I
can.

> Voice can be synthesized from plain text directly, and from rich text at
> the cost of losing the "enrichment". But to read bitmaps, you would need
> optical character recognition. If that is required (especially for DVD
> subtitles playback), there may be a very hard problem. As for "burnt"
> subtitles, ouch, I guess OCR would be very hard and unreliable.

We have already ruled out bitmap and burnt-in subtitles, because they would
require much more work and would probably still be unreliable in the end.
Our idea is to let the user download subtitles from the web as plain text in
*.srt format. The video files in our planned evaluation study, though, will
come from DVDs rather than being downloaded.
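
To make the *.srt input concrete, here is a minimal Python sketch of turning
such a file into timed plain-text cues that could be handed to a speech
engine. It is an illustration only, not VLC code: the names parse_srt, to_ms
and SAMPLE are hypothetical, and it assumes well-formed cues separated by
blank lines.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)
# Simple inline markup such as <i>...</i>, which should not be read aloud.
TAG = re.compile(r"<[^>]+>")


def to_ms(h, m, s, ms):
    """Convert hours, minutes, seconds and milliseconds to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, plain_text) cues."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Locate the timing line; the numeric counter above it is ignored.
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                break
        else:
            continue  # no timing line found: skip the malformed block
        g = match.groups()
        # Join multi-line subtitles into one sentence and drop markup.
        spoken = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
        spoken = TAG.sub("", spoken)
        if spoken:
            cues.append((to_ms(*g[:4]), to_ms(*g[4:]), spoken))
    return cues


# Hypothetical sample data, made up for this sketch.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hej, <i>hur mår du?</i>

2
00:00:04,000 --> 00:00:06,000
Jag mår bra,
tack.
"""

for start, end, spoken in parse_srt(SAMPLE):
    print(start, end, spoken)
```

The start and end times are kept so that each cue can be spoken at the right
moment during playback; character encoding detection is left out here.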

> Then you need a speech synthesis engine. There are quite a few open-source
> ones. But I don't know which ones, if any, support Swedish phonetics and
> have a GPL-compatible copyright license. Did you already sort out that part
> of the equation?

Since we work with users who will probably already have a license for a
speech synthesis engine, we were hoping that the module could support a few
of the most common engines, both licensed and free, and include a function
for detecting which one is installed on the user's system (or,
alternatively, let the user state this). Will the mix of free and licensed
engines be a problem? Off the top of my head, I can think of eSpeak (not
very good) and Festival, which support Swedish phonetics, but I am sure
there are more. The basic idea is that the engine would not ship with the
module; the module would work with whichever engine is already in place on
the user's system (if applicable).
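
As a rough sketch of the "check what is installed, or let the user say"
idea, the Python below looks for engine executables on the PATH. The table
of executable names and the functions detect_engines and choose_engine are
assumptions made for illustration; licensed engines are often reached
through a system speech API rather than a command, which this sketch does
not cover.

```python
import shutil

# Candidate engines and the executable names they are commonly installed
# under. This table is an assumption for illustration, not a complete list.
CANDIDATE_ENGINES = {
    "espeak": ["espeak"],
    "festival": ["festival"],
}


def detect_engines(candidates=CANDIDATE_ENGINES, which=shutil.which):
    """Return {engine: path} for every candidate engine found on PATH."""
    found = {}
    for engine, executables in candidates.items():
        for exe in executables:
            path = which(exe)  # None when the executable is not on PATH
            if path:
                found[engine] = path
                break
    return found


def choose_engine(found, user_choice=None):
    """Prefer the engine the user stated, else the first one detected."""
    if user_choice is not None:
        if user_choice not in found:
            raise LookupError(f"engine not found: {user_choice}")
        return user_choice
    return next(iter(found), None)  # None when nothing was detected


installed = detect_engines()
print("detected:", installed)
print("using:", choose_engine(installed))
```

Passing the lookup function as a parameter keeps the detection logic easy to
test without any engine actually being installed.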

> And last, you need to filter out the original voice from the original
> audio track. Or do you not mind losing the original audio sound effects? I
> don't suppose you will always have a clear/speech-less audio channel
> available in the original media.

We were hoping to preserve the original sound. To evaluate the finished
module, we would like to let users set their own volume preferences for the
different channels. Do you think this would be possible? If not, we would
like it filtered out but not muted.
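
To illustrate what per-channel volume preferences could mean, here is a
minimal Python sketch that mixes an original track with a synthesized speech
track using two user-set gains. It assumes mono tracks of float samples in
the range -1.0 to 1.0 at the same sample rate; the function mix and its
default gains are hypothetical, and the sketch only turns the original track
down rather than separating the original voice from the sound effects.

```python
def mix(original, speech, original_gain=0.3, speech_gain=1.0):
    """Mix two mono tracks of float samples in [-1.0, 1.0].

    Each track is scaled by its own gain, the shorter track is padded with
    silence, and the sum is hard-clipped to the valid sample range.
    """
    length = max(len(original), len(speech))
    mixed = []
    for i in range(length):
        a = original[i] if i < len(original) else 0.0
        b = speech[i] if i < len(speech) else 0.0
        sample = original_gain * a + speech_gain * b
        mixed.append(max(-1.0, min(1.0, sample)))  # hard clip
    return mixed


# Hypothetical samples: the original track at half volume, speech at full.
print(mix([0.5, 0.5, 0.5], [0.5], original_gain=0.5, speech_gain=1.0))
```

Setting original_gain to 0.0 would mute the original track entirely, while a
small positive value keeps it audible underneath the synthesized speech.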

> It depends a lot on the requirements, and what existing components could
> be sourced or would be provided by you.

I would be happy to receive more questions or pointers that would help us
understand what resources we need and who could provide them. Given these
premises, do you have a clearer picture of how much effort we could be
talking about?

All the best,
