[vlc-commits] stream_ReadLine: properly discard incomplete UTF-16 sequences at EOF

Pierre Ynard git at videolan.org
Tue Oct 27 09:10:14 CET 2020


vlc | branch: master | Pierre Ynard <linkfanel at yahoo.fr> | Tue Oct 27 08:54:24 2020 +0100| [65f6b1b719fffcdc345cf45d8bce63675cccf3a4] | committer: Pierre Ynard

stream_ReadLine: properly discard incomplete UTF-16 sequences at EOF

A lone byte forming an incomplete UTF-16 sequence before EOF, in some
cases such as a final line consisting only of that byte, would never
actually get consumed from the stream, preventing the stream from ever
properly reaching EOF.

This also avoids flooding the logs with one warning per stream line
towards the end of the stream, and then printing an unspecific
conversion error: those are replaced by one clear and explicit error
message.
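
For illustration only, the peek-then-read pattern described above can be
sketched outside of VLC. The sketch below is a minimal model and makes
several assumptions: mem_stream, mem_peek, mem_read and drain_units are
hypothetical names invented for this example and are not part of the VLC
stream API, and the 64-byte probe size is arbitrary. It shows the core
idea of the fix: when fewer bytes than one code unit remain, they are
still consumed so that the read position actually reaches the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stream, standing in for a real stream object. */
typedef struct {
    const uint8_t *buf;
    size_t len;
    size_t pos;
} mem_stream;

/* Look at up to max pending bytes without consuming them. */
static size_t mem_peek(const mem_stream *s, const uint8_t **out, size_t max)
{
    size_t avail = s->len - s->pos;
    if (avail > max)
        avail = max;
    *out = s->buf + s->pos;
    return avail;
}

/* Consume up to n bytes; dst may be NULL to simply skip them. */
static size_t mem_read(mem_stream *s, uint8_t *dst, size_t n)
{
    size_t avail = s->len - s->pos;
    if (n > avail)
        n = avail;
    if (dst != NULL)
        memcpy(dst, s->buf + s->pos, n);
    s->pos += n;
    return n;
}

/* Consume all whole code units of char_width bytes. A trailing partial
 * unit is consumed as well and counted in *discarded, so the stream
 * position always ends at EOF. Returns the number of whole units read. */
static size_t drain_units(mem_stream *s, size_t char_width, size_t *discarded)
{
    assert(char_width > 0);
    size_t units = 0;
    *discarded = 0;

    for (;;) {
        const uint8_t *p;
        size_t n = mem_peek(s, &p, 64); /* arbitrary probe size */

        if (n < char_width) {
            /* EOF, or an incomplete sequence that can never be decoded:
             * read it anyway instead of leaving it in the stream. */
            *discarded = mem_read(s, NULL, n);
            break;
        }

        /* Keep to the character width boundary, then read exactly
         * the amount that was peeked and validated. */
        n -= n % char_width;
        units += mem_read(s, NULL, n) / char_width;
    }
    return units;
}
```

With a 5-byte input and a character width of 2, this sketch reads two
whole code units and discards the final byte. Rounding the peeked size
down without also consuming the leftover byte would leave that byte
pending forever, which is the situation the commit message describes.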

> http://git.videolan.org/gitweb.cgi/vlc.git/?a=commit;h=65f6b1b719fffcdc345cf45d8bce63675cccf3a4
---

 src/input/stream.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/input/stream.c b/src/input/stream.c
index 90278ea59d..9cba64697f 100644
--- a/src/input/stream.c
+++ b/src/input/stream.c
@@ -245,15 +245,22 @@ char *vlc_stream_ReadLine( stream_t *s )
             }
         }
 
-        if( i_data % priv->text.char_width )
+        /* Deal here with lone-byte incomplete UTF-16 sequences at EOF
+           that we won't be able to process anyway */
+        if( i_data < priv->text.char_width )
         {
-            /* keep i_char_width boundary */
-            i_data = i_data - ( i_data % priv->text.char_width );
-            msg_Warn( s, "the read is not i_char_width compatible");
+            assert( priv->text.char_width == 2 );
+            uint8_t inc;
+            ssize_t i_inc = vlc_stream_Read( s, &inc, priv->text.char_width );
+            assert( i_inc == i_data );
+            if( i_inc > 0 )
+                msg_Err( s, "discarding incomplete UTF-16 sequence at EOF: 0x%02x", inc );
+            break;
         }
 
-        if( i_data == 0 )
-            break;
+        /* Keep to text encoding character width boundary */
+        if( i_data % priv->text.char_width )
+            i_data = i_data - ( i_data % priv->text.char_width );
 
         /* Check if there is an EOL */
         if( priv->text.char_width == 1 )
@@ -313,10 +320,10 @@ char *vlc_stream_ReadLine( stream_t *s )
 
         /* Read data (+1 for easy \0 append) */
         p_line = realloc_or_free( p_line,
-                          i_line + STREAM_PROBE_LINE + priv->text.char_width );
+                        i_line + i_data + priv->text.char_width );
         if( !p_line )
             goto error;
-        i_data = vlc_stream_Read( s, &p_line[i_line], STREAM_PROBE_LINE );
+        i_data = vlc_stream_Read( s, &p_line[i_line], i_data );
         if( i_data <= 0 ) break; /* Hmmm */
         i_line += i_data;
         i_read += i_data;


