<div dir="ltr"><div>I have investigated using Pango in VLC, and it seems simpler than I expected.</div><div>Pango can use FreeType2 as a backend, so we can use it directly in our freetype module. Here is a first attempt at that.</div><div><br></div><div>I have done my best to modify as little as possible and to keep the modifications optional, which is probably sub-optimal. We're using Pango's high-level API (PangoLayout) for shaping and line layout only, although Pango can handle the rendering for us too. I think using PangoCairo (probably in a separate module) would prove more efficient and less buggy, since after setting the text for the layout we could render everything as a whole to a Cairo surface instead of iterating through lines, runs and glyphs.</div><div><br></div><div>Anyway, there's still work to be done here: error handling and cleanup, some styles like underline/strikethrough, karaoke, and shadow and outline. I still have to see how to modify contribs, and I'm not even sure I have edited <a href="http://configure.ac">configure.ac</a> properly. I'm sorry if I've messed anything up. I'm really not much of a programmer.</div><div><br></div><div>I also haven't figured out yet how to use font attachments with Pango. The good thing is that Pango now performs font fallback for us. For example, try setting the font to Arial Black, which doesn't have any Arabic glyphs, and Arabic subtitles will still show correctly.</div><div><br></div><div>It also fixes another issue I noticed recently, regarding RTL text in general. We're currently rendering RTL text from left to right, which is fine since the strings are reversed (by fribidi, I suppose). The problem occurs when a line wraps: because the string is already reversed, it's the first part of the text that gets moved to the next line. 
If there are 2 physical lines and both of them wrap, the reading order becomes something like:</div><div><br></div><div>2</div><div>1</div><div>4</div><div>3</div><div><br></div><div>Pango handles that for us, as well as glyph positioning (base glyphs and diacritics). Here's a little patch to render diacritics in red and 2 screenshots for comparison:</div><div><br></div><div><a href="https://docs.google.com/file/d/0B36ioujDBJZsR28ydHBRWjc5Nm8/edit">https://docs.google.com/file/d/0B36ioujDBJZsR28ydHBRWjc5Nm8/edit</a></div><div><a href="https://docs.google.com/file/d/0B36ioujDBJZsNTVLTDVnaFpNSjA/edit">https://docs.google.com/file/d/0B36ioujDBJZsNTVLTDVnaFpNSjA/edit</a></div><div><a href="https://docs.google.com/file/d/0B36ioujDBJZsLXFVWG0xa0pKRVk/edit">https://docs.google.com/file/d/0B36ioujDBJZsLXFVWG0xa0pKRVk/edit</a></div><div><br></div><div>The red arrows show an example of glyph substitution: Pango replaces 2 diacritic glyphs with a single glyph according to the OpenType tables. The first rendering is still readable, but the second is the official way to render this combination of diacritics. The blue arrow shows an example of wrong wrapping of RTL text.</div><div><br></div><div>So what do you guys think? Should we go with PangoFT2 or PangoCairo? In the same module or in a separate one?</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 21, 2015 at 3:26 AM, Salah-Eddin Shaban <span dir="ltr">&lt;<a href="mailto:salshaaban@gmail.com" target="_blank">salshaaban@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">---<br>
 <a href="http://configure.ac" target="_blank">configure.ac</a>                     |  11 ++<br>
 modules/text_renderer/freetype.c | 297 +++++++++++++++++++++++++++++++++++++++<br>
 2 files changed, 308 insertions(+)<br>
<br>
diff --git a/<a href="http://configure.ac" target="_blank">configure.ac</a> b/<a href="http://configure.ac" target="_blank">configure.ac</a><br>
index 516e9c7..214633f 100644<br>
--- a/<a href="http://configure.ac" target="_blank">configure.ac</a><br>
+++ b/<a href="http://configure.ac" target="_blank">configure.ac</a><br>
@@ -3117,6 +3117,8 @@ AC_ARG_ENABLE(fribidi,<br>
   [  --enable-fribidi        fribidi support    (default auto)])<br>
 AC_ARG_ENABLE(fontconfig,<br>
   [  --enable-fontconfig     fontconfig support (default auto)])<br>
+AC_ARG_ENABLE(pangoft,<br>
+  [  --enable-pangoft        pangoft support    (default auto)])<br>
<br>
 AC_ARG_WITH([default-font],<br>
     AS_HELP_STRING([--with-default-font=PATH],<br>
@@ -3175,6 +3177,15 @@ if test "${enable_freetype}" != "no"; then<br>
         ],[AC_MSG_WARN([${FRIBIDI_PKG_ERRORS}. Bidirectional support will be disabled in FreeType.])])<br>
       fi<br>
<br>
+      dnl pangoft support<br>
+      if test "${enable_pangoft}" != "no"; then<br>
+        PKG_CHECK_MODULES(PANGOFT, pangoft2, [<br>
+          VLC_ADD_CPPFLAGS([freetype], [${PANGOFT_CFLAGS} -DHAVE_PANGOFT])<br>
+          VLC_ADD_LIBS([freetype], [${PANGOFT_LIBS}])<br>
+        ],[AC_MSG_WARN([${PANGOFT_PKG_ERRORS}. Support for complex scripts (Arabic, Farsi, Thai, etc.) will be limited in FreeType.])])<br>
+      fi<br>
+<br>
+<br>
   ],[<br>
   have_freetype=no<br>
   AS_IF([test -n "${enable_freetype}"],[<br>
diff --git a/modules/text_renderer/freetype.c b/modules/text_renderer/freetype.c<br>
index 050ace6..cde3a37 100644<br>
--- a/modules/text_renderer/freetype.c<br>
+++ b/modules/text_renderer/freetype.c<br>
@@ -61,6 +61,13 @@<br>
 # include <fribidi/fribidi.h><br>
 #endif<br>
<br>
+/* Complex Scripts */<br>
+#if defined(HAVE_PANGOFT)<br>
+# include <pango/pango.h><br>
+# include <pango/pangoft2.h><br>
+# include <glib.h><br>
+#endif<br>
+<br>
 /* apple stuff */<br>
 #ifdef __APPLE__<br>
 # include <TargetConditionals.h><br>
@@ -138,6 +145,10 @@ static const char *const ppsz_sizes_text[] = {<br>
 #define YUVP_LONGTEXT N_("This renders the font using \"paletized YUV\". " \<br>
   "This option is only needed if you want to encode into DVB subtitles" )<br>
<br>
+#define PANGO_TEXT N_("Use Pango for text layout")<br>
+#define PANGO_LONGTEXT N_("Use Pango to lay out the text. Required for " \<br>
+  "complex scripts (Arabic, Thai, etc.)")<br>
+<br>
 static const int pi_color_values[] = {<br>
   0x00000000, 0x00808080, 0x00C0C0C0, 0x00FFFFFF, 0x00800000,<br>
   0x00FF0000, 0x00FF00FF, 0x00FFFF00, 0x00808000, 0x00008000, 0x00008080,<br>
@@ -231,6 +242,10 @@ vlc_module_begin ()<br>
<br>
     add_bool( "freetype-yuvp", false, YUVP_TEXT,<br>
               YUVP_LONGTEXT, true )<br>
+#ifdef HAVE_PANGOFT<br>
+    add_bool( "freetype-pango", false, PANGO_TEXT,<br>
+              PANGO_LONGTEXT, true )<br>
+#endif<br>
     set_capability( "text renderer", 100 )<br>
     add_shortcut( "text" )<br>
     set_callbacks( Create, Destroy )<br>
@@ -271,6 +286,9 @@ struct line_desc_t<br>
  *****************************************************************************/<br>
 struct filter_sys_t<br>
 {<br>
+#ifdef HAVE_PANGOFT<br>
+    PangoFontMap  *p_fontmap;<br>
+#endif<br>
     FT_Library     p_library;   /* handle to library     */<br>
     FT_Face        p_face;      /* handle to face object */<br>
     FT_Stroker     p_stroker;   /* handle to path stroker object */<br>
@@ -1148,6 +1166,268 @@ static void BBoxEnlarge( FT_BBox *p_max, const FT_BBox *p )<br>
     p_max->yMax = __MAX(p_max->yMax, p->yMax);<br>
 }<br>
<br>
+#ifdef HAVE_PANGOFT<br>
+static int ProcessLinesPangoFT2( filter_t *p_filter,<br>
+                                 line_desc_t **pp_lines,<br>
+                                 FT_BBox *p_bbox,<br>
+                                 int *pi_max_face_height,<br>
+                                 uni_char_t *psz_text,<br>
+                                 text_style_t **pp_styles,<br>
+                                 uint32_t *pi_k_dates,<br>
+                                 int i_len )<br>
+{<br>
+    filter_sys_t   *p_sys = p_filter->p_sys;<br>
+    *pi_max_face_height = 0;<br>
+    *pp_lines = NULL;<br>
+    line_desc_t **pp_line_next = pp_lines;<br>
+<br>
+    FT_BBox bbox = { .xMin = INT_MAX, .yMin = INT_MAX,<br>
+                     .xMax = INT_MIN, .yMax = INT_MIN };<br>
+<br>
+    int i_face_height_previous = 0, i_base_line = 0;<br>
+    //const text_style_t *p_previous_style = NULL;<br>
+    //FT_Face p_face = NULL;<br>
+<br>
+    /*<br>
+     * Convert to UTF-8 for Pango. The "UCS-4" converter expects big-endian<br>
+     * input, so psz_text is byte-swapped in place first (little-endian host).<br>
+     */<br>
+    for( int i = 0; i < i_len; ++i )<br>
+    {<br>
+        guchar *p = ( guchar * ) psz_text + i * sizeof( *psz_text );<br>
+        guchar ch0 = *( p ); guchar ch1 = *( p + 1 );<br>
+        guchar ch2 = *( p + 2 ); guchar ch3 = *( p + 3 );<br>
+<br>
+        *( p + 0 ) = ch3; *( p + 1 ) = ch2;<br>
+        *( p + 2 ) = ch1; *( p + 3 ) = ch0;<br>
+    }<br>
+<br>
+    gsize l_bytes_read = 0, l_bytes_written = 0;<br>
+    gchar *psz_utf8 = g_convert( ( gchar * ) psz_text, i_len * sizeof( *psz_text ),<br>
+                                 "UTF-8", "UCS-4", &l_bytes_read, &l_bytes_written, 0 );<br>
+    if( !psz_utf8 ) {<br>
+        msg_Err( p_filter, "Failed to convert text to UTF-8" );<br>
+        return VLC_EGENERIC;<br>
+    }<br>
+<br>
+    PangoFontMap *p_fm;<br>
+    PangoContext *p_context;<br>
+    PangoFontDescription *p_font_desc;<br>
+    PangoLayout *p_layout;<br>
+<br>
+    p_fm = p_sys->p_fontmap;<br>
+    p_context = pango_font_map_create_context(p_fm);<br>
+    if( !p_context )<br>
+    {<br>
+        msg_Err( p_filter, "Failed to create a Pango context" );<br>
+        g_free( psz_utf8 );<br>
+        return VLC_EGENERIC;<br>
+    }<br>
+    pango_context_set_base_dir( p_context, PANGO_DIRECTION_LTR );<br>
+    p_font_desc = pango_font_description_new();<br>
+    pango_font_description_set_family( p_font_desc, p_sys->style.psz_fontname );<br>
+    pango_font_description_set_absolute_size( p_font_desc, p_sys->style.i_font_size * PANGO_SCALE );<br>
+    pango_context_set_font_description( p_context, p_font_desc );<br>
+    p_layout = pango_layout_new( p_context );<br>
+    pango_layout_set_width( p_layout, ( int ) p_filter->fmt_out.video.i_visible_width<br>
+                            * PANGO_SCALE );<br>
+    pango_layout_set_height( p_layout, ( int ) p_filter->fmt_out.video.i_visible_height<br>
+                             * PANGO_SCALE );<br>
+    pango_layout_set_auto_dir( p_layout, false );<br>
+<br>
+    /* Set attributes that affect text shaping */<br>
+    PangoAttrList *p_list = pango_attr_list_new();<br>
+    text_style_t *p_style = pp_styles[0];<br>
+    gchar *p0 = psz_utf8, *p1 = psz_utf8;<br>
+    for( int i = 0; i < i_len; ++i ) {<br>
+        if( !FaceStyleEquals( p_style, pp_styles[i] ) ||<br>
+            p_style->i_font_size != pp_styles[i]->i_font_size ||<br>
+            i == i_len - 1 )<br>
+        {<br>
+            p_font_desc = pango_font_description_new();<br>
+            pango_font_description_set_family( p_font_desc, p_style->psz_fontname );<br>
+            pango_font_description_set_absolute_size( p_font_desc,<br>
+                                                     p_style->i_font_size * PANGO_SCALE );<br>
+<br>
+            PangoAttribute *p_attr = pango_attr_font_desc_new( p_font_desc );<br>
+            p_attr->start_index = p0 - psz_utf8;<br>
+            p_attr->end_index = p1 - psz_utf8;<br>
+            pango_attr_list_insert( p_list, p_attr );<br>
+            p_style = pp_styles[i];<br>
+            p0 = p1;<br>
+        }<br>
+        p1 = g_utf8_next_char(p1);<br>
+    }<br>
+<br>
+    /*    Perform text shaping    */<br>
+    pango_layout_set_attributes( p_layout, p_list );<br>
+    pango_layout_set_text( p_layout, psz_utf8, l_bytes_written);<br>
+<br>
+    /*  Now the text has been laid out, fill our pp_lines  */<br>
+    GSList *p_layout_lines = pango_layout_get_lines( p_layout );<br>
+    while( p_layout_lines ) {<br>
+        PangoLayoutLine *p_layout_line = ( PangoLayoutLine * ) p_layout_lines->data;<br>
+        int i_glyph_count = 0;<br>
+        GSList *p_runs = p_layout_line->runs;<br>
+        while( p_runs ) {<br>
+            PangoLayoutRun *p_run = ( PangoLayoutRun * ) p_runs->data;<br>
+            PangoGlyphString *p_glyphs = p_run->glyphs;<br>
+            i_glyph_count += p_glyphs->num_glyphs;<br>
+            p_runs = g_slist_next( p_runs );<br>
+        }<br>
+<br>
+        line_desc_t *p_line = i_glyph_count > 0 ? NewLine( i_glyph_count ) : NULL;<br>
+        if( p_line ) p_line->i_character_count = i_glyph_count;<br>
+        FT_Vector pen = { .x = 0, .y = 0 };<br>
+        int i_font_width = p_sys->style.i_font_size;<br>
+        int i_face_height = 0;<br>
+        FT_BBox line_bbox = { .xMin = INT_MAX, .yMin = INT_MAX,<br>
+                              .xMax = INT_MIN, .yMax = INT_MIN };<br>
+<br>
+        /*<br>
+         * A cluster denotes a base glyph + its diacritics (accent glyphs). To apply<br>
+         * remaining styles to glyphs we need to iterate through the clusters. That<br>
+         * seems to be the only way for us to know which source text characters correspond<br>
+         * to which glyphs. There's no 1:1 mapping here since Pango may have performed<br>
+         * glyph substitutions according to OpenType tables.<br>
+         */<br>
+        p_runs = p_layout_line->runs;<br>
+        int i_line_index = 0;<br>
+        while( p_runs ) {<br>
+            PangoLayoutRun *p_run = ( PangoLayoutRun * ) p_runs->data;<br>
+            PangoGlyphString *p_glyphs = p_run->glyphs;<br>
+            PangoGlyphItemIter cluster_iter;<br>
+            gboolean b_rtl = p_run->item->analysis.level % 2;<br>
+            gboolean b_have_cluster = b_rtl ?<br>
+                pango_glyph_item_iter_init_end( &cluster_iter, p_run, psz_utf8 ) :<br>
+                pango_glyph_item_iter_init_start( &cluster_iter, p_run, psz_utf8 );<br>
+<br>
+            for (             ;<br>
+                  b_have_cluster;<br>
+                  b_have_cluster = b_rtl ?<br>
+                    pango_glyph_item_iter_prev_cluster ( &cluster_iter ) :<br>
+                    pango_glyph_item_iter_next_cluster ( &cluster_iter ) )<br>
+            {<br>
+                int i_glyph_index = b_rtl ? cluster_iter.end_glyph + 1 :<br>
+                                              cluster_iter.start_glyph;<br>
+                while( b_rtl ? ( i_glyph_index != cluster_iter.start_glyph + 1) :<br>
+                               ( i_glyph_index != cluster_iter.end_glyph ) )<br>
+                {<br>
+                    FT_Glyph glyph;<br>
+                    FT_BBox  glyph_bbox;<br>
+                    FT_Glyph outline;<br>
+                    FT_BBox  outline_bbox;<br>
+                    FT_Glyph shadow;<br>
+                    FT_BBox  shadow_bbox;<br>
+<br>
+                    FT_Face p_face = pango_fc_font_lock_face(<br>
+                        PANGO_FC_FONT( p_run->item->analysis.font ) );<br>
+                    i_face_height = __MAX( i_face_height,<br>
+                        FT_CEIL(FT_MulFix(p_face->height, p_face->size->metrics.y_scale)));<br>
+<br>
+                    FT_Vector pen_new;<br>
+                    /* Pango units are 1/1024 px, FreeType 26.6 is 1/64 px: divide by 16 */<br>
+                    pen_new.x = pen.x + p_glyphs->glyphs[i_glyph_index].geometry.x_offset / 16;<br>
+                    pen_new.y = pen.y - p_glyphs->glyphs[i_glyph_index].geometry.y_offset / 16;<br>
+<br>
+                    FT_Vector pen_shadow = {<br>
+                        .x = pen_new.x + p_sys->f_shadow_vector_x * (i_font_width << 6),<br>
+                        .y = pen_new.y + p_sys->f_shadow_vector_y * (i_font_width << 6),<br>
+                    };<br>
+<br>
+                    if( GetGlyph( p_filter,<br>
+                            &glyph, &glyph_bbox,<br>
+                            &outline, &outline_bbox,<br>
+                            &shadow, &shadow_bbox,<br>
+                            p_face, p_glyphs->glyphs[i_glyph_index].glyph, 0,<br>
+                            &pen_new, &pen_shadow) )<br>
+                    {<br>
+                        p_line->p_character[i_line_index++] = (line_character_t) {0};<br>
+                        pango_fc_font_unlock_face(<br>
+                            PANGO_FC_FONT( p_run->item->analysis.font ) );<br>
+                        ++i_glyph_index;<br>
+                        continue;<br>
+                    }<br>
+<br>
+                    FixGlyph( glyph, &glyph_bbox, p_face, &pen_new );<br>
+                    if( outline )<br>
+                        FixGlyph( outline, &outline_bbox, p_face, &pen_new );<br>
+                    if( shadow )<br>
+                        FixGlyph( shadow, &shadow_bbox, p_face, &pen_shadow );<br>
+<br>
+                    pango_fc_font_unlock_face(<br>
+                        PANGO_FC_FONT( p_run->item->analysis.font ) );<br>
+<br>
+                    FT_BBox line_bbox_new = line_bbox;<br>
+                    BBoxEnlarge( &line_bbox_new, &glyph_bbox );<br>
+                    if( outline )<br>
+                        BBoxEnlarge( &line_bbox_new, &outline_bbox );<br>
+                    if( shadow )<br>
+                        BBoxEnlarge( &line_bbox_new, &shadow_bbox );<br>
+<br>
+                    gchar *p = &psz_utf8[ cluster_iter.start_index ];<br>
+                    int i_index_in_chars = g_utf8_strlen( psz_utf8, p - psz_utf8 );<br>
+                    p_style = pp_styles[ i_index_in_chars ];<br>
+                    p_line->p_character[ i_line_index++ ] = (line_character_t) {<br>
+                        .p_glyph = (FT_BitmapGlyph) glyph,<br>
+                        .p_outline = (FT_BitmapGlyph) outline,<br>
+                        .p_shadow = (FT_BitmapGlyph) shadow,<br>
+                        .i_color = p_style->i_font_color | p_style->i_font_alpha << 24,<br>
+                        .i_line_offset = 0,<br>
+                        .i_line_thickness = 0,<br>
+                    };<br>
+<br>
+                    /*<br>
+                     * We're now using Pango for glyph positioning. If 2 glyphs should<br>
+                     * be rendered on top of one another, the width of the first will<br>
+                     * be 0. So we no longer have to check for diacritics or zero-width<br>
+                     * spaces, etc.<br>
+                     */<br>
+                    pen.x = pen.x + p_glyphs->glyphs[i_glyph_index].geometry.width / 16;<br>
+                    line_bbox = line_bbox_new;<br>
+                    ++i_glyph_index;<br>
+                }<br>
+            }<br>
+            p_runs = g_slist_next(p_runs);<br>
+        }<br>
+<br>
+        if( i_face_height_previous > 0 )<br>
+            i_base_line += __MAX(i_face_height, i_face_height_previous);<br>
+        if( i_face_height > 0 )<br>
+            i_face_height_previous = i_face_height;<br>
+<br>
+<br>
+        /* Update the line bbox with the actual base line */<br>
+        if (line_bbox.yMax > line_bbox.yMin) {<br>
+            line_bbox.yMin -= i_base_line;<br>
+            line_bbox.yMax -= i_base_line;<br>
+        }<br>
+        BBoxEnlarge( &bbox, &line_bbox );<br>
+<br>
+        if( p_line )<br>
+        {<br>
+            p_line->i_width  = __MAX(line_bbox.xMax - line_bbox.xMin, 0);<br>
+            p_line->i_base_line = i_base_line;<br>
+            p_line->i_height = __MAX(i_face_height, i_face_height_previous);<br>
+            *pp_line_next = p_line;<br>
+            pp_line_next = &p_line->p_next;<br>
+<br>
+        }<br>
+        *pi_max_face_height = __MAX( *pi_max_face_height, i_face_height );<br>
+<br>
+        p_layout_lines = g_slist_next(p_layout_lines);<br>
+    }<br>
+<br>
+    g_object_unref( p_layout );<br>
+    pango_font_description_free( p_font_desc );<br>
+    g_object_unref( p_context );<br>
+    g_free( psz_utf8 );<br>
+<br>
+    *p_bbox = bbox;<br>
+    return VLC_SUCCESS;<br>
+}<br>
+#endif<br>
+<br>
 static int ProcessLines( filter_t *p_filter,<br>
                          line_desc_t **pp_lines,<br>
                          FT_BBox     *p_bbox,<br>
@@ -1767,9 +2047,20 @@ static int RenderCommon( filter_t *p_filter, subpicture_region_t *p_region_out,<br>
<br>
     if( !rv && i_text_length > 0 )<br>
     {<br>
+#ifdef HAVE_PANGOFT<br>
+        if( var_InheritBool( p_filter, "freetype-pango" ) )<br>
+            rv = ProcessLinesPangoFT2( p_filter,<br>
+                                       &p_lines, &bbox, &i_max_face_height,<br>
+                                       psz_text, pp_styles, pi_k_durations, i_text_length );<br>
+        else<br>
+            rv = ProcessLines( p_filter,<br>
+                               &p_lines, &bbox, &i_max_face_height,<br>
+                               psz_text, pp_styles, pi_k_durations, i_text_length );<br>
+#else<br>
         rv = ProcessLines( p_filter,<br>
                            &p_lines, &bbox, &i_max_face_height,<br>
                            psz_text, pp_styles, pi_k_durations, i_text_length );<br>
+#endif<br>
     }<br>
<br>
     p_region_out->i_x = p_region_in->i_x;<br>
@@ -1865,6 +2156,9 @@ static int Init_FT( vlc_object_t *p_this,<br>
     filter_sys_t  *p_sys = p_filter->p_sys;<br>
<br>
     /* */<br>
+#ifdef HAVE_PANGOFT<br>
+    p_sys->p_fontmap = pango_ft2_font_map_new();<br>
+#endif<br>
     int i_error = FT_Init_FreeType( &p_sys->p_library );<br>
     if( i_error )<br>
     {<br>
@@ -2058,6 +2352,9 @@ static void Destroy_FT( vlc_object_t *p_this )<br>
         FT_Stroker_Done( p_sys->p_stroker );<br>
     FT_Done_Face( p_sys->p_face );<br>
     FT_Done_FreeType( p_sys->p_library );<br>
+#ifdef HAVE_PANGOFT<br>
+    g_object_unref( p_sys->p_fontmap );<br>
+#endif<br>
 }<br>
<br>
 /*****************************************************************************<br>
<span class="HOEnZb"><font color="#888888">--<br>
1.9.1<br>
<br>
</font></span></blockquote></div><br></div>
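<div>As a side note on the "divide by 16" step in the patch: Pango measures in units of 1/1024 of a device unit (PANGO_SCALE), while FreeType's 26.6 fixed-point format uses 6 fractional bits, i.e. 1/64 of a pixel, so the ratio is 1024 / 64 = 16. Below is a minimal sketch of that conversion. The helper names and the SKETCH_* constants are hypothetical and are restated locally so the snippet builds without the Pango or FreeType headers; they are not part of the patch.</div>

```c
#include <stdint.h>

/* Assumption: local copies of the library constants, so that this sketch
 * compiles on its own. Pango's PANGO_SCALE is 1024; FreeType's 26.6 format
 * has 6 fractional bits, so one pixel is 64 units. */
#define SKETCH_PANGO_SCALE 1024
#define SKETCH_FT_26_6_ONE 64

/* Hypothetical helper: whole pixels -> Pango units, mirroring the
 * "i_font_size * PANGO_SCALE" expressions in the patch. */
int32_t pixels_to_pango_units(int32_t i_pixels)
{
    return i_pixels * SKETCH_PANGO_SCALE;
}

/* Hypothetical helper: Pango units -> FreeType 26.6, i.e. the "/ 16" applied
 * to geometry.x_offset, geometry.y_offset and geometry.width in the patch.
 * C integer division truncates toward zero, so negative offsets lose their
 * remainder toward zero, exactly as in the patch. */
int32_t pango_units_to_ft26_6(int32_t i_pango_units)
{
    return i_pango_units / (SKETCH_PANGO_SCALE / SKETCH_FT_26_6_ONE);
}
```

<div>For instance, a 20 px advance is 20480 Pango units, which this sketch converts to 1280 in 26.6 format (20 * 64). If sub-unit accuracy matters for offsets, rounding instead of truncating may be worth considering.</div>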