[vlc-commits] input: do not override subtitles encoding if BOM is found (fixes #5239)

Rémi Denis-Courmont git at videolan.org
Wed Jun 27 15:03:01 CEST 2012


vlc | branch: master | Rémi Denis-Courmont <remi at remlab.net> | Wed Jun 27 15:58:24 2012 +0300| [df908387e20e71d92a9f9877d1fa69c9c95ba28c] | committer: Rémi Denis-Courmont

input: do not override subtitles encoding if BOM is found (fixes #5239)

The removed hack forced the input-wide "subsdec-encoding" variable to
UTF-8 as soon as one of the opened subtitle files (not necessarily all
of them) started with a UTF-8 or UTF-16 Byte Order Mark. All subtitles
of that input were then decoded as UTF-8, so any other subtitle file
in a different encoding was decoded incorrectly.
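
For illustration only, the per-stream check that remains after this
change can be sketched as a standalone function. This is not the VLC
code: detect_utf16_bom() is a hypothetical name, and the sketch leaves
out the iconv converter setup and the stream state that
stream_ReadLine() updates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of VLC.
 * Returns the iconv-style name of the UTF-16 variant announced by a BOM
 * at the start of `data`, or NULL if there is no UTF-16 BOM. */
const char *detect_utf16_bom(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    if (len < 2)
        return NULL;                 /* too short to hold a UTF-16 BOM */
    if (memcmp(data, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";
    if (memcmp(data, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";

    /* Everything else, including a UTF-8 BOM (EF BB BF), is left alone:
     * no conversion is set up and no global setting is overridden. */
    return NULL;
}
```

A caller would open a UTF-16 to UTF-8 converter only when the result is
non-NULL, and only for the stream being read, so one file's BOM no
longer affects how other subtitle files are decoded.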

> http://git.videolan.org/gitweb.cgi/vlc.git/?a=commit;h=df908387e20e71d92a9f9877d1fa69c9c95ba28c
---

 src/input/stream.c |   32 +++++++-------------------------
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/src/input/stream.c b/src/input/stream.c
index 1a3471c..67c0297 100644
--- a/src/input/stream.c
+++ b/src/input/stream.c
@@ -1485,46 +1485,28 @@ char *stream_ReadLine( stream_t *s )
 
         /* BOM detection */
         i_pos = stream_Tell( s );
-        if( i_pos == 0 && i_data >= 3 )
+        if( i_pos == 0 && i_data >= 2 )
         {
             const char *psz_encoding = NULL;
 
-            if( !memcmp( p_data, "\xEF\xBB\xBF", 3 ) )
-            {
-                psz_encoding = "UTF-8";
-            }
-            else if( !memcmp( p_data, "\xFF\xFE", 2 ) )
+            if( !memcmp( p_data, "\xFF\xFE", 2 ) )
             {
                 psz_encoding = "UTF-16LE";
                 s->p_text->b_little_endian = true;
-                s->p_text->i_char_width = 2;
             }
             else if( !memcmp( p_data, "\xFE\xFF", 2 ) )
             {
                 psz_encoding = "UTF-16BE";
-                s->p_text->i_char_width = 2;
             }
 
             /* Open the converter if we need it */
             if( psz_encoding != NULL )
             {
-                msg_Dbg( s, "%s BOM detected", psz_encoding );
-                if( s->p_text->i_char_width > 1 )
-                {
-                    s->p_text->conv = vlc_iconv_open( "UTF-8", psz_encoding );
-                    if( s->p_text->conv == (vlc_iconv_t)-1 )
-                    {
-                        msg_Err( s, "iconv_open failed" );
-                    }
-                }
-
-                /* FIXME that's UGLY */
-                input_thread_t *p_input = s->p_input;
-                if( p_input != NULL)
-                {
-                    var_Create( p_input, "subsdec-encoding", VLC_VAR_STRING | VLC_VAR_DOINHERIT );
-                    var_SetString( p_input, "subsdec-encoding", "UTF-8" );
-                }
+                msg_Dbg( s, "UTF-16 BOM detected" );
+                s->p_text->i_char_width = 2;
+                s->p_text->conv = vlc_iconv_open( "UTF-8", psz_encoding );
+                if( s->p_text->conv == (vlc_iconv_t)-1 )
+                    msg_Err( s, "iconv_open failed" );
             }
         }
 


