[vlc-devel] Re: libiconv problem on internationalized text

Rémi Denis-Courmont courmisch at via.ecp.fr
Fri Aug 26 22:25:25 CEST 2005

On Friday 26 August 2005 21:33, Christophe Massiot wrote:
> A problem with charsets has been reported to me. It occurs when the
> interface uses a charset other than UTF-8, for instance the HTTP
> interface with --http-charset=ISO-8859-1, and I believe text
> interfaces such as rc are also in trouble if the local charset isn't
> UTF-8.

I still don't really see the point of customizing the charset in HTTP. 
If you support HTML, you have to support UTF-8, and even IE does 
support it.

The HTTP interface has gone through three stages as regards encoding:
1/ Do nothing about it. The user agent had to use the same encoding as 
the instance of vlc running the httpd. That was pretty short-sighted.
Later, Bigben added Latin-1 as a hard-coded charset in the XML headers 
so that the pages would validate.
IMHO, that was actually worse because it also broke the HTTP interface 
when the vlc locale was not Latin-1. In particular, on Windows with the 
French locale, given gettext was configured to output UTF-8, it broke 
100% of the time.

2/ Pass the VLC locale charset as the HTTP charset. That solved the 
problem of user agents not having the same charset as the VLC httpd. 
But with Latin-1 still hard-coded in the XML headers, it broke 
validation. Also, it failed to solve the Windows problem.

3/ Put everything into UTF-8. That just worked fine, AFAIK. The XML 
headers were updated, and everything went fine on every OS (unless 
you had a browser even more broken than IE).
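
A minimal sketch of why stage 3 is the least fragile: if the HTTP 
Content-Type charset and the XML declaration are both derived from one 
constant, they cannot disagree the way they did in stage 2. This is 
Python rather than VLC's C, and the names CHARSET and build_page are 
hypothetical; nothing here is taken from the actual VLC httpd.

```python
from html import escape

# Hypothetical illustration only; not VLC's httpd code.
CHARSET = "UTF-8"  # single source of truth for header, XML prolog and body


def build_page(title):
    """Return (headers, body) with the same charset used everywhere."""
    xml = (
        f'<?xml version="1.0" encoding="{CHARSET}"?>\n'
        f"<html><head><title>{escape(title)}</title></head><body/></html>\n"
    )
    body = xml.encode(CHARSET)  # the bytes really are in the declared charset
    headers = {
        "Content-Type": f"text/html; charset={CHARSET}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_page("Liste de lecture « VLC »")
```

With a per-user charset option, the same three places (header, prolog, 
body encoding) would all have to follow the option, which is where the 
earlier stages went wrong.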

So, I'm still missing the point of allowing another charset in HTTP, 
though I assume there is one, since you did implement it.

As for rc and ncurses, I would assume the problem is that these 
interfaces have not been ported to the new UTF-8-only core yet, so they 
don't translate their I/O although they should.
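
To illustrate what "translating their I/O" means for a text interface, 
here is a rough Python sketch of a conversion layer at the terminal 
boundary. The helpers to_locale and from_locale are hypothetical names 
chosen for this example; they are not VLC's functions, and the real 
core is written in C.

```python
import locale


def to_locale(text, charset=None):
    """Encode core (Unicode) text for output on a locale-charset terminal.

    charset defaults to the locale's preferred encoding. Characters the
    target charset cannot represent are replaced instead of raising.
    """
    if charset is None:
        charset = locale.getpreferredencoding(False)
    return text.encode(charset, errors="replace")


def from_locale(data, charset=None):
    """Decode bytes typed by the user back into core (Unicode) text."""
    if charset is None:
        charset = locale.getpreferredencoding(False)
    return data.decode(charset, errors="replace")


# Example with an explicit Latin-1 terminal:
out = to_locale("été", "ISO-8859-1")
back = from_locale(out, "ISO-8859-1")
```

The design point is that the core only ever sees one encoding, and each 
interface converts exactly once on the way in and once on the way out.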

> In our po files, written in UTF-8, the character for the single quote
> isn't "'", but a Unicode quote ’. When asking for a conversion to
> Latin-1, iconv fails and says there is no equivalent for the
> character. Thus UTF-8 is output and it looks ugly.

I assume you are referring to the French localization. I'd rather keep 
the nice French quotes. By the way, when I tell iconv to translate 
"«something»" (UTF-8) to Latin-9, I simply get "something", not something 
with ugly stuff around... so that again points to rc and ncurses lacking 
some ToLocale() calls.
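
The difference between the two kinds of quotes can be checked with a 
small sketch. It uses Python's built-in codecs rather than libiconv, 
and it assumes the single quote in the po files is U+2019 (RIGHT SINGLE 
QUOTATION MARK). The names can_encode, FALLBACKS and to_charset are 
hypothetical, and the fallback table is only one possible policy.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Assumed ASCII look-alikes for characters missing from the target charset.
FALLBACKS = {"\u2018": "'", "\u2019": "'"}


def to_charset(text, charset):
    """Encode text, substituting a look-alike only where it is needed."""
    out = []
    for ch in text:
        if can_encode(ch, charset):
            out.append(ch)
        else:
            out.append(FALLBACKS.get(ch, "?"))
    return "".join(out).encode(charset)


guillemets_ok = can_encode("«something»", "iso8859-15")   # Latin-9
curly_ok = can_encode("l\u2019interface", "latin-1")      # Latin-1
converted = to_charset("l\u2019interface", "latin-1")
```

The guillemets have code points in Latin-1 and Latin-9, so they survive 
the conversion, whereas U+2019 has none and needs either a fallback 
like the one above or a transliterating converter.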

> We have several options :
>  * Use "'" in all our po files

That's ugly.

>  * Fix the libiconv problem
> Libiconv doesn't seem maintained and sam says that switching to
> librecode would have many advantages, including fixing this. Any
> other idea ?

I don't know about librecode. But how would it be integrated with 
gettext/libintl?

Rémi Denis-Courmont