[vlc-devel] [RFC] EIT character sets conversion

Rémi Denis-Courmont rem at videolan.org
Thu Aug 30 23:44:21 CEST 2007


I have a few doubts concerning EITConvertToUTF8 (from 
modules/demux/ts.c). I have no access to the relevant specifications, 
nor to real-life streams that use these encodings.

First, if the "string" starts with \x10\x00, it appears we assume the 
third byte encodes the number of an ISO_8859 character set. Is there any 
reason why this is limited to the range 1-15? As of now, there is also 
ISO_8859-16 (a.k.a. "Latin-10"), and who knows whether more will be added.
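
To make the question concrete, here is a minimal sketch of a wider range 
check. It is not the current EITConvertToUTF8 code: the helper name 
eit_iso8859_name and the upper bound of 16 are assumptions for 
illustration, and whether the specification actually allows 16 is exactly 
what I do not know. Part 12 is skipped because ISO 8859-12 was never 
published.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not the actual ts.c code.
 * Given a string starting with \x10\x00 <part>, writes a charset name
 * such as "ISO_8859-15" into name. Returns 0 on success, -1 otherwise. */
static int eit_iso8859_name(const unsigned char *s, size_t len,
                            char *name, size_t name_size)
{
    if (len < 3 || s[0] != 0x10 || s[1] != 0x00)
        return -1;

    unsigned part = s[2];

    /* Assumption: accept parts 1..16 rather than 1..15.
     * ISO 8859-12 was never published, so it is rejected. */
    if (part < 1 || part > 16 || part == 12)
        return -1;

    int n = snprintf(name, name_size, "ISO_8859-%u", part);
    return (n > 0 && (size_t)n < name_size) ? 0 : -1;
}
```

Whether a given iconv accepts the resulting name (for example 
"ISO_8859-16") depends on the implementation, so the caller would still 
have to handle a failing iconv_open.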

Second, if the string starts with \x11, we assume the rest is a sequence 
of UTF-16 code units. That said, iconv recognizes three different kinds 
of UTF-16. I am not sure, but I believe "UTF-16" needs a Byte-Order-Mark 
at the beginning; otherwise "UTF-16LE" or "UTF-16BE" must be used, with 
the byte endianness specified by other means.
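
One possible way to handle this, sketched under assumptions: look for a 
Byte-Order-Mark in the payload that follows the \x11 prefix, and only 
pass plain "UTF-16" to iconv when one is present. The helper name 
eit_utf16_charset is hypothetical, and the big-endian fallback is a guess 
on my part, not something I could confirm from a specification.

```c
#include <stddef.h>

/* Hypothetical helper, not the actual ts.c code.
 * s points at the bytes after the leading \x11; len is their count.
 * Returns the charset name to hand to iconv_open() as the source. */
static const char *eit_utf16_charset(const unsigned char *s, size_t len)
{
    if (len >= 2) {
        int be_bom = (s[0] == 0xFE && s[1] == 0xFF);
        int le_bom = (s[0] == 0xFF && s[1] == 0xFE);

        /* BOM present: plain "UTF-16" lets iconv pick the byte order. */
        if (be_bom || le_bom)
            return "UTF-16";
    }

    /* Assumption: without a BOM, the data is big-endian. */
    return "UTF-16BE";
}
```

Naming the byte order explicitly in the fallback avoids relying on what a 
particular iconv does with BOM-less "UTF-16" input, which differs between 
implementations.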

Help wanted.

Rémi Denis-Courmont