[vlc-commits] [Git][videolan/vlc][master] 13 commits: text_renderer: sapi: fix assert in RenderText

Sat Aug 12 17:42:59 UTC 2023


Jean-Baptiste Kempf pushed to branch master at VideoLAN / VLC


Commits:
f0f1a500 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: fix assert in RenderText

vout_subpictures:RenderText() now expects the subtitle to have been
rasterized, but we'll never render any text, so we want to signal that
the region must be discarded.

This is a bit inadequate since it means that text OSD and subtitles are
not displayed at all but we probably want to refactor this part later by
adding a new capability for text parsing before rendering.

- - - - -
a94585f0 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: add extern C and anonymous namespace

extern "C" is needed for the function that will be given to the virtual
table, and namespace {} avoid exposing the constructor of the struct
outside of the translation unit.

- - - - -
c2a4e901 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: create dedicated RAII MTA guard

- - - - -
1fa3327a by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: use lambda for static ops

Using a lambda avoids having a wrapper around the ops structure and
simplify the usage afterwards. It's also much simpler to write.

- - - - -
3fec42b0 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: use early return

Co-locate the error handling with the call generating the error,
allowing to simplify an indentation level from the success path and
improve readibility of the failible function call.

We can also remove the goto, which is not a good idea in C++ because of
RAII mechanisms.

- - - - -
487c6539 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: refactor to SelectVoice function

Separating the content of SelectVoice in a different function will allow
to easily make use of early return and greatly simplify the final code.
It also separates what is really the setup for Create() and potentially
dynamic features like voice selection.

- - - - -
0fbcfcaf by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: early return

- - - - -
93b1f2d6 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: reindent after last changes

No functional changes.

- - - - -
35fc27dd by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: rename filter_sys_t

Using a less generic name allows referencing that name in a debugguer or
code completion tool, and prevent potential ODR when linkage is not done
correctly (eg. before when no namespace{} was present).

- - - - -
3c59de0f by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: avoid NULL long text option

- - - - -
3e30c375 by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: use new/delete for sys

We use a std::unique_ptr<> to handle the lifecycle of the sys pointer
with the error cases at the same time, since it will default to using
`delete` and will correctly release what's being new()-ed.

This pattern effectively replace the vlc_obj_malloc/free() primitive in
C++ code.

- - - - -
8d66cdea by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: move to dedicated COM thread

Move the Speech synthesis API calls into a dedicated MTA COM thread so
that it's always guaranteed that those objects live in such thread.
We use a producer-consumer fifo of one slot to prevent storing the text
regions inside the COM thread, and in particular prevent duplicating
them, through we currently keep duplicating the last text rendered for
simplicity.

The setup feels weird, given that we give a single instruction to a
dedicated thread and then wait for that thread to process it. But
everything boils down to thread COM model being global to a thread and
not local to a scope, which means that we must control in which thread
the allocated COM objects will be used and released (ie. the same as
the one creating them).

- - - - -
d277b0aa by Alexandre Janniaux at 2023-08-12T17:24:11+00:00
text_renderer: sapi: remove forward-declaration

Move the RenderTextMTA function up to remove the forward declaration.
The function is not 100% identical because an early return has been
added on the check for lastString.

- - - - -


1 changed file:

- modules/text_renderer/sapi.cpp


Changes:

=====================================
modules/text_renderer/sapi.cpp
=====================================
@@ -2,9 +2,11 @@
  * sapi.cpp: Simple text to Speech renderer for Windows, based on SAPI
  *****************************************************************************
  * Copyright (c) 2015 Moti Zilberman
+ * Copyright (c) 2023 Videolabs
  *
  * Authors: Moti Zilberman
  *          Jean-Baptiste Kempf
+ *          Alexandre Janniaux <ajanni at videolabs.io>
  *
  * The MIT License (MIT)
  *
@@ -37,21 +39,30 @@
 #include <vlc_filter.h>
 #include <vlc_charset.h>
 #include <vlc_subpicture.h>
+#include <vlc_threads.h>
+#include <vlc_cxx_helpers.hpp>
 
 #include <windows.h>
 #include <sapi.h>
 #include <sphelper.h>
 
 #include <initguid.h>
+
+#include <memory>
+
 // not available in standard libraries and used in inline functions without __uuidof()
 DEFINE_GUID(CLSID_SpObjectTokenCategory, 0xa910187f, 0x0c7a, 0x45ac, 0x92,0xcc, 0x59,0xed,0xaf,0xb7,0x7b,0x53);
 
+extern "C" {
 static int Create (filter_t *);
 static void Destroy(filter_t *);
 static int RenderText(filter_t *,
                       subpicture_region_t *,
                       subpicture_region_t *,
                       const vlc_fourcc_t *);
+}
+
+static int RenderTextMTA(filter_t *, subpicture_region_t *);
 
 vlc_module_begin ()
  set_description(N_("Speech synthesis for Windows"))
@@ -60,175 +71,264 @@ vlc_module_begin ()
 
  set_callback_text_renderer(Create, 0)
  /* Note: Skip label translation - too technical */
- add_integer("sapi-voice", -1, "Voice Index", nullptr)
+ add_integer("sapi-voice", -1, "Voice Index", "Voice Index")
 vlc_module_end ()
 
-struct filter_sys_t
+namespace {
+
+enum class FilterCommand {
+    CMD_RENDER_TEXT,
+    CMD_EXIT,
+};
+
+struct filter_sapi
 {
+    /* We need a MTA thread, so we ensure it's possible by creating a
+     * dedicated thread for that mode. */
+    vlc_thread_t thread;
+
     ISpVoice* cpVoice;
     char* lastString;
+
+    bool initialized;
+    vlc::threads::semaphore sem_ready;
+    vlc::threads::semaphore cmd_available;
+    vlc::threads::semaphore cmd_ready;
+
+    struct {
+        FilterCommand query;
+        union {
+            struct {
+                int result;
+                subpicture_region_t *region;
+            } render_text;
+        };
+    } cmd;
 };
 
-/* MTA functions */
-static int TryEnterMTA(vlc_object_t *obj)
+struct MTAGuard
 {
-    HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED);
-    if (unlikely(FAILED(hr)))
+    HRESULT result_mta;
+
+    MTAGuard()
     {
-        msg_Err (obj, "cannot initialize COM (error 0x%lX)", hr);
-        return -1;
+        this->result_mta = CoInitializeEx(NULL, COINIT_MULTITHREADED);
     }
-    return 0;
-}
-#define TryEnterMTA(o) TryEnterMTA(VLC_OBJECT(o))
 
-static void EnterMTA(void)
-{
-    HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED);
-    if (unlikely(FAILED(hr)))
-        abort();
-}
+    ~MTAGuard()
+    {
+        if (SUCCEEDED(this->result_mta))
+            CoUninitialize();
+    }
 
-static void LeaveMTA(void)
-{
-    CoUninitialize();
+};
 }
 
-static const struct FilterOperationInitializer {
-    struct vlc_filter_operations ops {};
-    FilterOperationInitializer()
-    {
-        ops.render = RenderText;
-        ops.close  = Destroy;
-    };
-} filter_ops;
-
-static int Create (filter_t *p_filter)
+static int RenderTextMTA(filter_t *p_filter,
+        subpicture_region_t * p_region_in)
 {
-    filter_sys_t *p_sys;
-    HRESULT hr;
+    struct filter_sapi *p_sys = static_cast<struct filter_sapi *>( p_filter->p_sys );
+    text_segment_t *p_segment = p_region_in->p_text;
 
-    if (TryEnterMTA(p_filter))
+    if (!p_segment)
         return VLC_EGENERIC;
 
-    p_filter->p_sys = p_sys = (filter_sys_t*) malloc(sizeof(filter_sys_t));
-    if (!p_sys)
-        goto error;
+    for (const text_segment_t *s = p_segment; s != NULL; s = s->p_next ) {
+        if (!s->psz_text)
+            continue;
 
-    p_sys->cpVoice = NULL;
-    p_sys->lastString = NULL;
+        if (strlen(s->psz_text) == 0)
+            continue;
 
-    hr = CoCreateInstance(__uuidof(SpVoice), NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&p_sys->cpVoice));
-    if (SUCCEEDED(hr)) {
-        ISpObjectToken*        cpVoiceToken = NULL;
-        IEnumSpObjectTokens*   cpEnum = NULL;
-        ULONG ulCount = 0;
+        if (p_sys->lastString && !strcmp(p_sys->lastString, s->psz_text))
+            continue;
 
-        hr = SpEnumTokens(SPCAT_VOICES, NULL, NULL, &cpEnum);
-        if (SUCCEEDED(hr))
-        {
-            // Get the number of voices.
-            hr = cpEnum->GetCount(&ulCount);
-            if (SUCCEEDED (hr))
-            {
-                int voiceIndex = var_InheritInteger(p_filter, "sapi-voice");
-                if (voiceIndex > -1)
-                {
-                    if ((unsigned)voiceIndex < ulCount) {
-                        hr = cpEnum->Item(voiceIndex, &cpVoiceToken);
-                        if (SUCCEEDED(hr)) {
-                            hr = p_sys->cpVoice->SetVoice(cpVoiceToken);
-                            if (SUCCEEDED(hr)) {
-                                msg_Dbg(p_filter, "Selected voice %d", voiceIndex);
-                            }
-                            else {
-                                msg_Err(p_filter, "Failed to set voice %d", voiceIndex);
-                            }
-                            cpVoiceToken->Release();
-                            cpVoiceToken = NULL;
-                        }
-                    }
-                    else
-                        msg_Err(p_filter, "Voice index exceeds available count");
-                }
-            }
-            cpEnum->Release();
-
-            /* Set Output */
-            hr = p_sys->cpVoice->SetOutput(NULL, TRUE);
-        }
-    }
-    else
-    {
-        msg_Err(p_filter, "Could not create SpVoice");
-        goto error;
-    }
+        if (!strcmp(s->psz_text, "\n"))
+            continue;
 
-    LeaveMTA();
+        /* */
+        free(p_sys->lastString);
+        p_sys->lastString = strdup(s->psz_text);
 
-    p_filter->ops = &filter_ops.ops;
+        /* */
+        if (p_sys->lastString == nullptr)
+            continue;
 
-    return VLC_SUCCESS;
+        msg_Dbg(p_filter, "Speaking '%s'", s->psz_text);
 
-error:
-    LeaveMTA();
-    free(p_sys);
-    return VLC_EGENERIC;
+        wchar_t* wideText = ToWide(s->psz_text);
+        HRESULT hr = p_sys->cpVoice->Speak(wideText, SPF_ASYNC, NULL);
+        free(wideText);
+        if (!SUCCEEDED(hr))
+            msg_Err(p_filter, "Speak() error");
+    }
+
+    return VLC_SUCCESS;
 }
 
-static void Destroy(filter_t *p_filter)
+static int SelectVoice(filter_t *filter, ISpVoice* cpVoice)
 {
-    filter_sys_t *p_sys = reinterpret_cast<filter_sys_t *>( p_filter->p_sys );
+    HRESULT hr;
+    ISpObjectToken*        cpVoiceToken = NULL;
+    IEnumSpObjectTokens*   cpEnum = NULL;
+    ULONG ulCount = 0;
+
+    int voiceIndex = var_InheritInteger(filter, "sapi-voice");
+    if (voiceIndex < 0)
+        return 0;
 
-    if (p_sys->cpVoice)
-        p_sys->cpVoice->Release();
+    hr = SpEnumTokens(SPCAT_VOICES, NULL, NULL, &cpEnum);
+    if (!SUCCEEDED(hr))
+        return -ENOENT;
 
-    free(p_sys->lastString);
-    free(p_sys);
+    // Get the number of voices.
+    hr = cpEnum->GetCount(&ulCount);
+    if (!SUCCEEDED (hr))
+        goto error;
+
+    if ((unsigned)voiceIndex >= ulCount) {
+        msg_Err(filter, "Voice index exceeds available count");
+        cpEnum->Release();
+        return -EINVAL;
+    }
+    hr = cpEnum->Item(voiceIndex, &cpVoiceToken);
+    if (!SUCCEEDED(hr))
+        goto error;
+
+    hr = cpVoice->SetVoice(cpVoiceToken);
+    if (SUCCEEDED(hr)) {
+        msg_Dbg(filter, "Selected voice %d", voiceIndex);
+    }
+    else {
+        msg_Err(filter, "Failed to set voice %d", voiceIndex);
+    }
+    cpVoiceToken->Release();
+    cpVoiceToken = NULL;
+    cpEnum->Release();
+
+    return voiceIndex;
+
+error:
+    cpEnum->Release();
+    return -ENOENT;
 }
 
 static int RenderText(filter_t *p_filter,
         subpicture_region_t *,
-        subpicture_region_t *p_region_in,
+        subpicture_region_t *region_in,
         const vlc_fourcc_t *)
 {
-    filter_sys_t *p_sys = reinterpret_cast<filter_sys_t *>( p_filter->p_sys );
-    text_segment_t *p_segment = p_region_in->p_text;
+    auto *sys = static_cast<struct filter_sapi *>( p_filter->p_sys );
+    sys->cmd.query = FilterCommand::CMD_RENDER_TEXT;
+    sys->cmd.render_text.region = region_in;
+    sys->cmd_available.post();
+    sys->cmd_ready.wait();
+    return VLC_EGENERIC; /* We don't generate output region. */
+}
 
-    if (!p_segment)
-        return VLC_EGENERIC;
+static const struct vlc_filter_operations filter_ops = []{
+    struct vlc_filter_operations ops {};
+    ops.render = RenderText;
+    ops.close  = Destroy;
+    return ops;
+}();
 
-    for (const text_segment_t *s = p_segment; s != NULL; s = s->p_next ) {
-        if (!s->psz_text)
-            continue;
+static void *Run(void *opaque)
+{
+    filter_t *filter = static_cast<filter_t *>(opaque);
+    filter_sapi *sys = static_cast<filter_sapi *>(filter->p_sys);
 
-        if (strlen(s->psz_text) == 0)
-            continue;
+    MTAGuard guard {};
+    if (FAILED(guard.result_mta))
+    {
+        sys->sem_ready.post();
+        return NULL;
+    }
 
-        if (p_sys->lastString && !strcmp(p_sys->lastString, s->psz_text))
-            continue;
 
-        if (!strcmp(s->psz_text, "\n"))
-            continue;
+    HRESULT hr;
+    hr = CoCreateInstance(__uuidof(SpVoice), NULL, CLSCTX_INPROC_SERVER,
+                          IID_PPV_ARGS(&sys->cpVoice));
+    if (!SUCCEEDED(hr))
+    {
+        msg_Err(filter, "Could not create SpVoice");
+        sys->sem_ready.post();
+        return NULL;
+    }
 
-        /* */
-        free(p_sys->lastString);
-        p_sys->lastString = strdup(s->psz_text);
+    int ret = SelectVoice(filter, sys->cpVoice);
+    (void) ret; /* TODO: we can detect whether we set the voice or not */
 
-        /* */
-        if (p_sys->lastString) {
-            msg_Dbg(p_filter, "Speaking '%s'", s->psz_text);
-
-            EnterMTA();
-            wchar_t* wideText = ToWide(s->psz_text);
-            HRESULT hr = p_sys->cpVoice->Speak(wideText, SPF_ASYNC, NULL);
-            free(wideText);
-            if (!SUCCEEDED(hr)) {
-                msg_Err(p_filter, "Speak() error");
-            }
-            LeaveMTA();
+    sys->initialized = true;
+    sys->sem_ready.post();
+
+    for (;;)
+    {
+        sys->cmd_available.wait();
+        switch (sys->cmd.query)
+        {
+            case FilterCommand::CMD_EXIT:
+                return NULL;
+            case FilterCommand::CMD_RENDER_TEXT:
+                sys->cmd.render_text.result =
+                    RenderTextMTA(filter, sys->cmd.render_text.region);
+                sys->cmd_ready.post();
+                break;
+            default:
+                vlc_assert_unreachable();
         }
     }
 
+    return NULL;
+}
+
+static int Create (filter_t *p_filter)
+{
+    std::unique_ptr<filter_sapi> sys {
+        new (std::nothrow) filter_sapi {}
+    };
+
+    if (sys == nullptr)
+        return VLC_ENOMEM;
+
+    p_filter->ops = &filter_ops;
+    p_filter->p_sys = sys.get();
+    std::unique_ptr<filter_t, void(*)(filter_t*)> guard {
+        p_filter, [](filter_t *filter)
+    {
+        filter->p_sys = nullptr;
+        filter->ops = nullptr;
+    }};
+
+
+    int ret = vlc_clone(&sys->thread, Run, p_filter);
+
+    if (ret != VLC_SUCCESS)
+        return VLC_ENOMEM;
+
+    sys->sem_ready.wait();
+    if (!sys->initialized)
+    {
+        vlc_join(sys->thread, NULL);
+        return VLC_ENOTSUP;
+    }
+
+    guard.release(); /* No need to clean filter. */
+    sys.release(); /* leak to p_filter */
     return VLC_SUCCESS;
 }
+
+static void Destroy(filter_t *p_filter)
+{
+    std::unique_ptr<filter_sapi> sys {
+        static_cast<filter_sapi *>(p_filter->p_sys)
+    };
+
+    if (sys->cpVoice)
+        sys->cpVoice->Release();
+    free(sys->lastString);
+
+    sys->cmd.query = FilterCommand::CMD_EXIT;
+    sys->cmd_available.post();
+    vlc_join(sys->thread, NULL);
+}



View it on GitLab: https://code.videolan.org/videolan/vlc/-/compare/dbefe455bc08cd17f3cb6c7c9992ff4e74fc03d6...d277b0aafc780e56df7ef1aa16efcf3761cb039f

-- 
View it on GitLab: https://code.videolan.org/videolan/vlc/-/compare/dbefe455bc08cd17f3cb6c7c9992ff4e74fc03d6...d277b0aafc780e56df7ef1aa16efcf3761cb039f
You're receiving this email because of your account on code.videolan.org.


VideoLAN code repository instance