[vlc-devel] Re: libiconv problem on internationalized text
Rémi Denis-Courmont
courmisch at via.ecp.fr
Fri Aug 26 22:25:25 CEST 2005
On Friday 26 August 2005 21:33, Christophe Massiot wrote:
> A problem with charsets has been reported to me. It occurs when the
> interface uses another charset than UTF-8, for instance the HTTP
> interface with --http-charset=ISO-8859-1, and I believe text
> interfaces such as rc are also in trouble if the local charset isn't
> UTF-8.
I still don't really see the point of customizing the charset in HTTP.
If you support HTML, you have to support UTF-8, and even IE does
support it.
The HTTP interface has gone through three stages as regards encoding:
1/ Do nothing about it. The user agent had to use the same encoding as
the instance of vlc running the httpd. That was pretty short-sighted.
Later, Bigben added Latin-1 as a hard-coded charset in the XML headers
so that the pages would validate.
IMHO, that was actually worse because it also broke the HTTP interface
when the vlc locale was not Latin-1. In particular, on Windows with the
French locale, given gettext was configured to output UTF-8, it broke
100% of the time.
2/ Pass the VLC locale charset as the charset. That solved the problem
of User Agents not having the same charset as the VLC httpd. But with
Latin-1 still hard-coded in the XML stuff, it broke validation. Also,
it failed to solve the Windows problem.
3/ Put everything into UTF-8. That just worked fine, AFAIK. The XML
headers were updated, and everything went fine on every OS (unless
you had a browser even more broken than IE).
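
To illustrate why the first two stages broke, here is a minimal Python
sketch (not VLC code; the render() helper and the sample string are made
up for the example). It assumes the only thing that matters is whether
the charset the server encodes with matches the charset the user agent
decodes with:

```python
def render(text, body_charset, declared_charset):
    """Return what a user agent would display if the server encodes
    the page with body_charset but the page declares declared_charset."""
    raw = text.encode(body_charset)
    # errors="replace" mimics a browser showing a placeholder glyph
    # instead of refusing to display the page.
    return raw.decode(declared_charset, errors="replace")


title = "Ao\u00fbt"  # "Août"

# Stage 3: body and declaration are both UTF-8, the text survives.
print(render(title, "utf-8", "utf-8"))

# Hard-coded Latin-1 header while gettext outputs UTF-8:
# the two UTF-8 bytes of "û" show up as two Latin-1 characters.
print(render(title, "utf-8", "latin-1"))

# Stage 1 style mismatch the other way round: Latin-1 bytes read as UTF-8.
print(render(title, "latin-1", "utf-8"))
```

Only the first call returns the original string; the other two show the
kind of garbage users were reporting.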
So, I'm still missing the point of allowing another charset in HTTP,
though I assume there is one, since you did implement it.
As for rc and ncurses, I would assume the problem is that these
interfaces have not been ported to the new UTF-8-only core yet, so they
don't translate their I/O while they should.
> In our po files, written in UTF-8, the character for the single quote
> isn't "'", but a Unicode quote ’. When asking for a conversion to
> Latin-1, iconv fails and says there is no equivalent for the
> character. Thus UTF-8 is output and it looks ugly.
I assume you are referring to the French localization. I'd rather keep
the nice French quotes. By the way, when I tell iconv to translate
"«something»" (UTF8) to Latin9, I simply get "something", not something
with ugly stuff around... so that again points to rc and ncurses lacking
some ToLocale() calls.
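
The difference between the two cases can be reproduced outside of VLC.
The sketch below uses Python's built-in codecs rather than libiconv, so
it only shows which characters exist in which charset, not what iconv
does on failure. The helper names are hypothetical, and the fallback
table is an assumption about what an acceptable substitute would be
(similar in spirit to GNU iconv's //TRANSLIT suffix):

```python
def encode_or_none(text, charset):
    """Encode text to charset, or return None when some character
    has no equivalent in that charset."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


# Assumed substitution table: typographic quotes to plain ASCII ones.
FALLBACK = {0x2018: "'", 0x2019: "'"}


def encode_with_fallback(text, charset):
    """Try a direct conversion first, then retry after substituting
    the characters listed in FALLBACK."""
    raw = encode_or_none(text, charset)
    if raw is None:
        raw = encode_or_none(text.translate(FALLBACK), charset)
    return raw


# U+2019 has no Latin-1 equivalent, so the direct conversion fails.
print(encode_or_none("l\u2019interface", "latin-1"))

# The guillemets exist in Latin-9 (bytes 0xAB and 0xBB), so this works.
print(encode_or_none("\u00absomething\u00bb", "iso-8859-15"))

# With the substitution table the string converts cleanly.
print(encode_with_fallback("l\u2019interface", "latin-1"))
```

So the guillemets are safe in Latin-1 and Latin-9, while the typographic
apostrophe needs either a substitution step or a UTF-8 capable output.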
> We have several options :
> * Use "'" in all our po files
That's ugly.
> * Fix the libiconv problem
>
> Libiconv doesn't seem maintained and sam says that switching to
> librecode would have many advantages, including fixing this. Any
> other idea ?
I don't know about librecode. But how is it to be integrated with
gettext/libintl ?
--
Rémi Denis-Courmont
http://www.simphalempin.com/home/