[vlc-devel] [VLC] #3054: Default charset for EPG should be ISO-6937

Marian Ďurkovič md at bts.sk
Wed Sep 2 08:43:46 CEST 2009


On Tue, Aug 18, 2009 at 04:38:21PM +0200, Rémi Denis-Courmont wrote:
> 
> On Mon, 17 Aug 2009 16:32:23 +0200, Rémi Denis-Courmont <remi at remlab.net>
> wrote:
> >>  Indeed, all slovak and czech TV stations use ISO-6937 and don't work
> >>  correctly now.
> > 
> > IIRC, many Western TV channels are broken, as in they use Latin-1. I
> > currently lack any DVB source, so I cannot check.
> > What should we do?
> 
> How about trying ISO_6937 and falling back to ISO_8859-1 if that fails? It
> seems that most accented Latin-1 characters are invalid in ISO_6937,
> which is a multi-byte encoding.

I just tested this on one broken channel. Unfortunately, Latin-1 text
almost never fails in iconv when ISO-6937 is tried first. Here is
an example of what we might get:

'Un florilŁge d'extraits musicaux, une faĿon diffØrente et extrŒmement variØe
de dØcouvrir la musique, les interprŁtes et les orchestres.'
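
To illustrate why the fallback never triggers, here is a minimal Python
sketch. It is not VLC code: the table and the function name are
hypothetical, and the table covers only the four byte values visible in
the example above. It also assumes bytes below 0x80 map to the same
characters as ASCII. Each of these Latin-1 accented letters is a valid
single-byte character in ISO-6937, so decoding succeeds and yields the
wrong text instead of an error.

```python
# Partial ISO-6937 table (assumption: only the bytes seen in the example).
ISO6937_SUBSET = {
    0xE7: "\u013F",  # L with middle dot; Latin-1 reads this byte as c-cedilla
    0xE8: "\u0141",  # L with stroke;     Latin-1 reads this byte as e-grave
    0xE9: "\u00D8",  # O with stroke;     Latin-1 reads this byte as e-acute
    0xEA: "\u0152",  # OE ligature;       Latin-1 reads this byte as e-circumflex
}


def decode_iso6937_subset(data: bytes) -> str:
    """Decode bytes with the partial table above (hypothetical helper)."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))  # treated as plain ASCII in this sketch
        elif b in ISO6937_SUBSET:
            out.append(ISO6937_SUBSET[b])
        else:
            raise ValueError(f"byte 0x{b:02X} is not covered by this sketch")
    return "".join(out)


# What a broadcaster sending Latin-1 actually puts on the wire.
latin1_bytes = "Un floril\u00e8ge d'extraits musicaux, une fa\u00e7on diff\u00e9rente".encode("latin-1")

# No exception is raised, so a "fall back on failure" strategy never kicks in.
garbled = decode_iso6937_subset(latin1_bytes)
print(garbled)
```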

Thus we probably need a config option named "Ignore the DVB standard and
use xxx charset for EPG". This would be similar to the VDR approach:
http://www.linuxtv.org/pipermail/vdr/2008-March/016277.html
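
A rough sketch of how such an option might behave, in Python rather than
VLC's C and with hypothetical names throughout. It assumes, per my reading
of EN 300 468 Annex A, that a first byte below 0x20 is an explicit
character table selector and that strings without one default to ISO-6937.
Whether the override should also apply to strings that carry a selector
is a design choice; this sketch leaves those alone.

```python
from typing import Optional

# Default table for DVB text without a selector byte.
DVB_DEFAULT_CHARSET = "ISO-6937"


def epg_charset(text: bytes, override: Optional[str] = None) -> Optional[str]:
    """Pick a charset for an EPG string (hypothetical helper).

    Returns None when the string starts with a selector byte, meaning the
    caller should use its normal selector-based lookup instead.
    """
    if text and text[0] < 0x20:
        return None  # explicit selector present, do not override
    # No selector: use the user's override if set, else the DVB default.
    return override or DVB_DEFAULT_CHARSET


# Standard behaviour, no option set.
standard = epg_charset(b"Un floril\xe8ge")
# User knows the channel is broken and sends Latin-1.
forced = epg_charset(b"Un floril\xe8ge", override="ISO-8859-1")
# String with an explicit selector byte is left to the normal path.
selected = epg_charset(b"\x05abc", override="ISO-8859-1")
```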

BTW, my other patch (Provide charset detection also for SDT fields)
has still not been committed to git. It is needed to properly decode
station names if they use accented characters.


    Thanks & kind regards,

         M.

    


