[vlc-devel] Need help developing text-to-speech-module in VLC - offering £5.000

Juha Jeronen juha.jeronen at jyu.fi
Thu Apr 7 12:58:43 CEST 2011


Hi,

A brief remark here...

On Apr 7, 2011, at 13:34 , Rémi Denis-Courmont wrote:

>>> And last, you need to filter out the original voice from the original
>>> audio track. Or do you not mind losing the original audio sound
>>> effects? I don't suppose you will always have a clear/speech-less audio
>>> channel available in the original media.
> 
>> We were hoping to be able to preserve the original sound, but to evaluate
>> the finished module, we are interested in letting the users set their own
>> preferences in terms of volume for the different channels. Do you think
>> this would be possible? If not, we'd like it filtered out but not muted.
> 
> Balancing or remixing channels is feasible, except that VLC has no user
> interface concept for this at the moment.
> 
> However, sound effects and voices are on the same channels in the original
> medium. So I meant it's difficult to remove the original voices while
> keeping the original sound effects.

With all stereo sources, yes, exactly.

With 5.1 sources, on the other hand, it may be possible to isolate the speech, since it usually sits in the center channel. But this applies only to a small minority of videos (Hollywood movies, mainly?). In my personal experience, almost all Japanese animation is stereo only.
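
As a rough illustration of the 5.1 case, the sketch below scales only the center channel and leaves the other channels untouched, which is about what "filtered out but not mute" would amount to. The channel order (L, R, C, LFE, Ls, Rs), the frame layout and the function name are assumptions made for the example, not anything taken from VLC. It also presumes the center channel carries nothing but dialogue, which is not guaranteed.

```python
# Assumed channel order for one 5.1 frame: L, R, C, LFE, Ls, Rs.
CENTER = 2


def scale_center(frames, gain):
    """Return a copy of frames with the center channel multiplied by gain.

    frames: list of 6-element sequences of float samples, one per channel.
    gain:   0.0 mutes the center channel, 1.0 leaves it unchanged.
    """
    out = []
    for frame in frames:
        if len(frame) != 6:
            raise ValueError("expected 6 channels per frame")
        scaled = list(frame)
        scaled[CENTER] = frame[CENTER] * gain
        out.append(scaled)
    return out


if __name__ == "__main__":
    decoded = [[0.1, 0.2, 0.8, 0.0, 0.05, 0.05]]
    # Keep a quarter of the original dialogue level.
    print(scale_center(decoded, 0.25))
```

The gain would be the user-set volume preference for the original speech.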

Even for 5.1 sources, there is at least one complication. Many computers can output 5.1 only through a digital audio connector, so S/PDIF passthrough is usually used for watching DVDs with 5.1 audio on such setups. But S/PDIF requires multichannel audio to be encoded in A52 or DTS format, as it is on the DVD.
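
To make the S/PDIF constraint a bit more concrete, here is some back-of-the-envelope arithmetic. S/PDIF is built around two-channel PCM; the 48 kHz sample rate and 16-bit sample size below are assumptions picked for illustration. The point is only that uncompressed 5.1 needs three times the rate of uncompressed stereo, so it has to cross the link in compressed form.

```python
# Assumed figures for illustration: 48 kHz sample rate, 16-bit samples.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16


def pcm_bitrate(channels):
    """Bit rate in bit/s of uncompressed PCM with the given channel count."""
    return channels * SAMPLE_RATE_HZ * BITS_PER_SAMPLE


stereo = pcm_bitrate(2)    # what a two-channel PCM link carries
surround = pcm_bitrate(6)  # what uncompressed 5.1 would need

print(stereo, surround, surround / stereo)
```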

Thus, one would need to 1) decode the original sound, 2) rewrite the speech track on the fly, and 3) immediately re-encode. I don't know the status of A52 encoders, or whether an open source implementation is available. The encoder would also need to be fast enough for real-time use for this approach to work.

Downmixing a 5.1 source to stereo may be easier. In this case, one needs only to do steps 1) and 2) above, and then apply the downmix (and handle the resulting stereo audio as usual).
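
A minimal sketch of the downmix route, operating on already-decoded samples (so step 1 is assumed done). The channel order, the 1/sqrt(2) weight for center and surrounds, dropping the LFE channel, and all names here are assumptions for illustration; a real player may well use different coefficients.

```python
import math

# Assumed channel order for one 5.1 frame: L, R, C, LFE, Ls, Rs.
# Assumed downmix weight of 1/sqrt(2) (about -3 dB) for center and surrounds.
G = 1.0 / math.sqrt(2.0)


def clamp(x):
    """Limit a sample to the range [-1.0, 1.0]."""
    return max(-1.0, min(1.0, x))


def replace_center_and_downmix(frames, speech, original_gain=0.0):
    """Rewrite the center channel with new speech, then downmix to stereo.

    frames:        list of 6-sample frames of decoded 5.1 audio.
    speech:        list of mono samples, one per frame (e.g. synthesized).
    original_gain: how much of the original center channel to keep.
    """
    if len(frames) != len(speech):
        raise ValueError("frames and speech must have the same length")
    stereo = []
    for (l, r, c, _lfe, ls, rs), s in zip(frames, speech):
        center = original_gain * c + s   # step 2: rewrite the speech track
        left = l + G * center + G * ls   # downmix; the LFE channel is dropped
        right = r + G * center + G * rs
        stereo.append((clamp(left), clamp(right)))
    return stereo


if __name__ == "__main__":
    decoded = [[0.0, 0.0, 1.0, 0.5, 0.0, 0.0]]
    print(replace_center_and_downmix(decoded, [0.5]))
```

The result is ordinary stereo, so no re-encoding step is needed before output.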

 -J



