[vlc-devel] Re: Non-western character encoding
Rémi Denis-Courmont
rem at videolan.org
Sun Mar 12 16:45:04 CET 2006
On Sunday 12 March 2006 15:49, Måns Rullgård wrote:
> How is "the local character encoding" determined?
It comes from LC_ALL, LC_CTYPE or LANG. The mapping is
in /usr/share/i18n/SUPPORTED.
> If LC_ALL, LC_CTYPE or LANG (checked in that order) specifies an
> encoding, that should be used. If none is specified, the best that
> can be done is to choose a default for each locale. The user should
> always have an option to override the default should s/he wish to.
I have to disagree here. I don't believe Japanese subtitles
automagically change from Shift-JIS to EUC-JP as they are downloaded on
a Linux system. Japanese Windows users use the CP932 variant of
Shift-JIS, so Japanese subtitles are in Shift-JIS/CP932. There is no
point in trying to decode these as EUC-JP, even if that is the encoding
for the ja_JP C library locale on Linux.
And I *know* that French subtitles don't automagically get converted
from Latin-1/CP1252 to UTF-8 just because my system's LC_CTYPE is
fr_FR.UTF-8 instead of fr_FR.
What I do believe is that we get a much higher rate of matching
encodings by looking at the local system language (i.e. the first part
of LC_ALL or LANG), rather than by using the local system charset. In
fact, it makes almost all subtitles work, while they would otherwise
almost always fail. If you don't believe me, just try to use subtitles
from some western language that has lots of accents (French, German,
Swedish...) with a pre-[14724] VLC on a Linux system using a UTF-8 (as
in LANG=??_??.UTF-8) locale variant for said language. And feel the
pain.
The current approach is just utterly broken (except on Windows). The
proposed approach brings VLC subtitle decoding on Linux & company to
the same, much higher, “success” rate as VLC on Windows.
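To make the proposal concrete, here is a rough sketch in Python (not VLC code) of guessing a default subtitle encoding from the language part of the locale rather than from its charset. The table contents, the function names and the CP1252 fallback are assumptions made for illustration only; the sketch also consults LC_CTYPE, in the usual LC_ALL, LC_CTYPE, LANG order.

```python
import os

# Hypothetical table: language code -> encoding that subtitle files in that
# language typically use (i.e. what the Windows authors' systems produce).
# The entries are illustrative, not an exhaustive or authoritative list.
DEFAULT_SUBTITLE_ENCODING = {
    "fr": "cp1252",
    "de": "cp1252",
    "sv": "cp1252",
    "ja": "cp932",
}


def locale_language(env=None):
    """Return the language part of the first non-empty variable among
    LC_ALL, LC_CTYPE and LANG, or None if none of them is set."""
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            # "fr_FR.UTF-8@euro" -> "fr": drop charset, modifier, territory.
            return value.split(".")[0].split("@")[0].split("_")[0].lower()
    return None


def guess_subtitle_encoding(env=None, fallback="cp1252"):
    """Pick a default from the language; the locale charset is ignored.
    The fallback value is an assumption of this sketch."""
    return DEFAULT_SUBTITLE_ENCODING.get(locale_language(env), fallback)
```

With LANG=fr_FR.UTF-8 this returns cp1252 rather than UTF-8, and with LC_ALL=ja_JP.eucJP it returns cp932 rather than EUC-JP, which is the behaviour argued for above. A user-visible override option would still sit on top of this default.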
And then, we might also consider UTF-8 autodetection (à la
irssi>=0.8.10), though I have yet to find any UTF-8 subtitle file.
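One simple way such autodetection could work, sketched in Python: try a strict UTF-8 decode first and fall back to the legacy encoding only if the bytes are not valid UTF-8. This is an assumption about a reasonable approach, not a description of what irssi or VLC actually does, and the function name and default are hypothetical.

```python
def decode_subtitles(data: bytes, legacy_encoding: str = "cp1252") -> str:
    """Decode subtitle bytes, preferring UTF-8 when the data is valid UTF-8.

    legacy_encoding is the per-language default to fall back to; its
    default value here is only an illustrative assumption.
    """
    try:
        # Strict decoding: any byte sequence that is not well-formed UTF-8
        # raises UnicodeDecodeError.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8, so assume the legacy encoding; undecodable bytes are
        # replaced rather than aborting the whole file.
        return data.decode(legacy_encoding, errors="replace")
```

This works because accented text in Latin-1/CP1252 almost never happens to form valid UTF-8 sequences, so a successful strict decode is a strong hint that the file really is UTF-8. Pure ASCII files decode identically either way.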
--
Rémi Denis-Courmont
http://www.simphalempin.com/home/