[vlc-devel] [PATCH 2/2] taglib: detect charset when ID3v2 Latin-1 parser finds invalid character

Rémi Denis-Courmont remi at remlab.net
Sat Oct 24 11:43:45 CEST 2020


On Friday, 23 October 2020, 13:54:40 EEST, Francois Cartegnie wrote:
> On 23/10/2020 at 12:46, sojulibra at gmail.com wrote:
> > From: Souju TANAKA <sojulibra at gmail.com>
> > 
> > Changed the TagLib Latin-1 parser to check whether an ISO 8859-1 encoded
> > ID3v2 tag is a valid byte sequence. If an invalid Latin-1 character is
> > found, try to detect the charset and convert the tag to UTF-8 to avoid
> > mojibake.
> > 
> > Some encoders embed ID3v2 text in an unexpected charset, even though this
> > is against the spec. TagLib allows overriding
> > TagLib::ID3v2::Latin1StringHandler::parse() to deal with this practical
> > situation.
> 
> We already provide a dedicated function to fix UTF-8 encodings.

The input is expected in Latin-1, not UTF-8, so I don't understand how the
existing UTF-8 validation functions would help here.
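
To make the distinction concrete, here is a minimal, self-contained sketch.
It is not code from the patch, from VLC, or from TagLib: both function names
are hypothetical, and treating bytes in the 0x80-0x9F range as "invalid
Latin-1" is an assumption about what the patch means, since printable
ISO 8859-1 text does not use that range.

```cpp
#include <string>

// Hypothetical helper: returns true if any byte lies in 0x80-0x9F.
// Printable ISO 8859-1 text does not use this range, so such a byte hints
// that the tag was actually written in another charset (assumed heuristic).
bool has_c1_bytes(const std::string &bytes)
{
    for (unsigned char c : bytes)
        if (c >= 0x80 && c <= 0x9F)
            return true;
    return false;
}

// Hypothetical helper: converts ISO 8859-1 bytes to UTF-8.
// Every Latin-1 byte maps to the Unicode code point of the same value,
// so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string &bytes)
{
    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

In this sketch the check runs on the raw tag bytes before any conversion.
Once latin1_to_utf8() has run, its output is always well-formed UTF-8, so
validating that output cannot reveal a mislabelled tag.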

-- 
レミ・デニ-クールモン
http://www.remlab.net/




