[vlc-devel] [patch] modules/codec/subsdec.c: bugfix for stripping HTML tags in subtitles

Christian Hammers ch at lathspell.de
Mon Jan 24 02:57:41 CET 2005


Hello

The StripTags() function in modules/codec/subsdec.c from 0.8.1 (and 
subversion trunk) is broken.

I have a .srt subtitle file with HTML tags in it. The StripTags() function
is called on these lines, but it strips only the second and any later tags,
not the first one. For example,
	<font color=#ff0000>hello world</font>
becomes
	<font color=#ff0000>hello world
(only the closing </font> is removed) instead of
	hello world

My version also does not end a tag when it encounters a space or newline;
a tag ends only at '>'. That was necessary for these HTML-formatted
subtitles, where a tag may contain spaces or span several lines.

Storing the HTML information to generate coloured, bold and italic
subtitles would be cool too, but that means a bit more work, I fear.

bye,

-christian-


static void StripTags( char *psz_text )
{
    vlc_bool_t b_inside_tag = VLC_FALSE;
    int i_read = 0;
    int i_write = 0;

    /* Everything between '<' and '>' is stripped. Newlines, tabs and spaces
     * are allowed inside tags to allow multiline HTML input. */
    while( psz_text[ i_read ] )
    {
        if( b_inside_tag )
        {
            if( psz_text[ i_read ] == '>' )
            {
                b_inside_tag = VLC_FALSE;
            }
        }
        else
        {
            if( psz_text[ i_read ] == '<' )
            {
                b_inside_tag = VLC_TRUE;
            }
            else
            {
                psz_text[ i_write ] = psz_text[ i_read ];
                i_write++;
            }
        }
        i_read++;
    }

    psz_text[ i_write ] = '\0';
}
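
To try the function outside the VLC tree, a minimal standalone sketch could
look like the one below. It is not part of the patch: strip_tags() is the same
loop with plain bool and size_t standing in for vlc_bool_t and int, and
strips_to() is a hypothetical helper added only for checking results.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the patched StripTags(), rewritten with standard C types
 * (an assumption made so that it builds without the VLC headers). */
static void strip_tags( char *psz_text )
{
    bool b_inside_tag = false;
    size_t i_read = 0;
    size_t i_write = 0;

    while( psz_text[ i_read ] != '\0' )
    {
        if( b_inside_tag )
        {
            /* Only '>' closes a tag; spaces and newlines do not. */
            if( psz_text[ i_read ] == '>' )
                b_inside_tag = false;
        }
        else if( psz_text[ i_read ] == '<' )
        {
            b_inside_tag = true;
        }
        else
        {
            /* Outside a tag: keep the character, compacting in place. */
            psz_text[ i_write++ ] = psz_text[ i_read ];
        }
        i_read++;
    }
    psz_text[ i_write ] = '\0';
}

/* Hypothetical test helper: strip a copy of psz_input and compare it
 * with psz_expected. Inputs must be shorter than 256 bytes. */
static bool strips_to( const char *psz_input, const char *psz_expected )
{
    char buf[ 256 ];

    assert( strlen( psz_input ) < sizeof( buf ) );
    strcpy( buf, psz_input );
    strip_tags( buf );
    return strcmp( buf, psz_expected ) == 0;
}
```

With this, strips_to( "<font color=#ff0000>hello world</font>", "hello world" )
holds, and so does the same check with a newline inside the opening tag. One
consequence of the approach is that a stray '<' with no matching '>' causes
everything after it to be dropped, so "a < b" becomes "a ".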



