[vlc-devel] [RFC, PATCH 1/6] demux/mkv: Added EbmlTypeDispatcher (including changes to Makefile.am)

Filip Roséen filip at videolabs.io
Thu Mar 3 13:30:22 CET 2016


The EbmlTypeDispatcher is a type used to "dispatch" an object of type
EbmlElement* to an appropriate callback depending on the dynamic type of
the referred to object.

It can effectivelly replace the massive if-else branches found
throughout the module by not only making the code easier to understand
and maintain, but also by making it _a lot_ faster (benchmarks show a
speed increase between 450 and 700% in terms of lookup).

--

-----------------------------------------------------------------------
dispatcher.hpp
-----------------------------------------------------------------------

This header declares a bunch of helpers to easily interact and register
callbacks to a certain dispatcher (such as EbmlTypeDispatcher), as well
as thread-safely creating and initializing such dispatcher.

Since this module is stuck in C++03 land, there are some quirks in the
language specification that does not allow us to write as clean code as
we would want.

In C++11 we could use a lookup-table filled with lambdas, but since the
feature (obviously) is not available in C++03 we have two alternatives.

  - Write the function declaration + definition by itself, and then
    manually register the function in the appropriate instance of the
    dispatcher type.

  - Introduce MACROs that does this automatically through some clever
    loop holes in the language (that are required to exists according to
    the ISO/IEC 14882:2003(E) C++ Standard).

`dispatcher.hpp` choose the latter route, because the first alternative
is even more ugly than a massive if-else tree (and very error prone)--
plus:

  - The functions would be distant by definition so by looking at the
    code responsible for registering the callback will not tell you what
    will actually be executed.

  - Implement the function but forget to include it in the dispatcher
    will yield undesirable behavior.

  - Removing the code responsible for registrering the callback will
    create lingering function definitions, which will affect
    maintenance.

  - Introducing two handlers for the same type would not yield a compile
    time diagnostic.

The MACRO approach (even though I try to avoid MACROs at all cost)
solves all of the issues listed above, and more.

See the comments in `module/mkv/demux/dispatcher.hpp` for more
information regarding its contents.

-----------------------------------------------------------------------
Ebml_dispatcher.hpp
-----------------------------------------------------------------------

This header makes use of the functionality presented by
`dispatcher.hpp`, and effectivelly allows a developer to write code as
the following.

    struct EbmlPayload {
      matroska_segment_c * const parent;
      mkv_track_t        * const   p_tk;
      unsigned int                level;
    } captures = { this, tk, 3 };

    MKV_SWITCH_CREATE( EbmlTypeDispatcher, EbmlHandler, EbmlPayload, vars )
    {
      MKV_SWITCH_INIT();

      /* ... */

      E_CASE( KaxSeekPreRoll, spr ) {
        vars.tk->i_seek_preroll = uint64_t( spr ) / 1000;
      }

      E_CASE( KaxTrackAudio, tka) {
        vars.p_tk->fmt.audio.i_channels = 1;
        vars.p_tk->fmt.audio_i_rate = 8000;

        vars.level += 1;

        dispatcher.iterate( tka.begin(), tka.end(), Payload( vars ));

        vars.level -= 1;
      }

      E_CASE( KaxAudioSamplingFreq, afreq ) {
        vars.p_tk->i_original_rate = vars.p_tk->fmt.audio.i_rate = (int)float( afreq );
      }

      E_CASE( KaxAudioOutputSamplingFreq, afreq ) {
        vars.p_tk->fmt.audio.i_rate  = (int)float( afreq );
      }

      E_CASE( KaxAudioChannels, achan ) {
        vars.p_tk->fmt.audio.i_channels = uint8( chan );
      }

      E_CASE( KaxAudioBitDepth, abits ) {
        vars.p_tk->fmt.audio.i_bitspersample = uint8( abits );
      }

      E_CASE( EbmlVoid, ) { // ignore
        VLC_UNUSED( vars );
      }

      E_CASE_DEFAULT(element) {
        MkvTree( *vars.p_demuxer, vars.level, "Unknown (%s)", typeid(element).name() );
      }
    };

    EbmlHandler::Dispatcher().iterate(
      m->begin(), m->end(), EbmlHandler::Payload( captures )
    );

Effictivelly replacing the old massive nested if-else trees, such as the
below taken from `matroska_segment_parse.cpp`.

    for( size_t i = 0; i < m->ListSize(); i++ )
    {
        EbmlElement *l = (*m)[i];

        if /* ... */
        else if( MKV_IS_ID( l, KaxSeekPreRoll ) )
        {
            KaxSeekPreRoll &spr = *(KaxSeekPreRoll*)l;
            tk->i_seek_preroll = uint64_t(spr) / 1000;
        }
        else  if( MKV_IS_ID( l, KaxTrackAudio ) )
        {
            EbmlMaster *tka = static_cast<EbmlMaster*>(l);

            /* Initialize default values */
            tk->fmt.audio.i_channels = 1;
            tk->fmt.audio.i_rate = 8000;

            for( unsigned int j = 0; j < tka->ListSize(); j++ )
            {
                EbmlElement *l = (*tka)[j];

                if( MKV_IS_ID( l, KaxAudioSamplingFreq ) )
                {
                    KaxAudioSamplingFreq &afreq = *(KaxAudioSamplingFreq*)l;

                    tk->i_original_rate = tk->fmt.audio.i_rate = (int)float( afreq );
                }
                else if( MKV_IS_ID( l, KaxAudioOutputSamplingFreq ) )
                {
                    KaxAudioOutputSamplingFreq &afreq = *(KaxAudioOutputSamplingFreq*)l;

                    tk->fmt.audio.i_rate = (int)float( afreq );
                }
                else if( MKV_IS_ID( l, KaxAudioChannels ) )
                {
                    KaxAudioChannels &achan = *(KaxAudioChannels*)l;

                    tk->fmt.audio.i_channels = uint8( achan );
                    msg_Dbg( &sys.demuxer, "|   |   |   |   + achan=%u", uint8( achan ) );
                }
                else if( MKV_IS_ID( l, KaxAudioBitDepth ) )
                {
                    KaxAudioBitDepth &abits = *(KaxAudioBitDepth*)l;

                    tk->fmt.audio.i_bitspersample = uint8( abits );
                }
                else if ( !MKV_IS_ID( l, EbmlVoid ) )
                {
                    MkvTree( sys.demuxer, 4, "Unknown (%s)", typeid(*l).name() );
                }
            }
        }
        else if ( !MKV_IS_ID( l, EbmlVoid ) )
        {
            MkvTree( sys.demuxer, 3, "Unknown (%s)", typeid(*l).name() );
        }
    }

-----------------------------------------------------------------------
How does it work?
-----------------------------------------------------------------------

The MACROs will result in a local struct (`EbmlHandler` in the previous
example) being declared, that has each of the case-handlers inside of
it.

This type will have a static function named `Dispatcher` that will yield
a constant reference to the desired dispatcher.

If this static instance of the dispatcher (one dispatcher per
MKV_SWITCH_CREATE)  has not yet been initialized, all of the
case-handlers will have their constructor executed effectivelly
making themselves available to the dispatcher.

If the static instance has already been constructed, we simply return a
reference to that one.

The comments in `dispatcher.hpp` and `Ebml_dispatcher.hpp` should be
explain things in more detail.

-----------------------------------------------------------------------
Is it faster?
-----------------------------------------------------------------------

The benchmarks shows that using the EbmlTypeDispatcher increases the
performance of the lookup (since it is using a lookup table instead of
if-else) by about 450% - 700% (sometimes more).

-----------------------------------------------------------------------
Will it ease code maintenance?
-----------------------------------------------------------------------

Hopefully.

-----------------------------------------------------------------------
Why not put everything in `Ebml_dispatcher.hpp`?
-----------------------------------------------------------------------

There are other massive if-else trees besides those related to
EbmlElement, with a separate "base" it will be easy to implement other
type of dispatchers.

-----------------------------------------------------------------------

Feedback is more than welcome, thanks!

Yours truly,
Filip Roséen
---
 modules/demux/Makefile.am             |   2 +
 modules/demux/mkv/Ebml_dispatcher.hpp | 161 ++++++++++++++++++++++++++++++++++
 modules/demux/mkv/dispatcher.hpp      | 150 +++++++++++++++++++++++++++++++
 3 files changed, 313 insertions(+)
 create mode 100644 modules/demux/mkv/Ebml_dispatcher.hpp
 create mode 100644 modules/demux/mkv/dispatcher.hpp

diff --git a/modules/demux/Makefile.am b/modules/demux/Makefile.am
index 11518fe..caf298c 100644
--- a/modules/demux/Makefile.am
+++ b/modules/demux/Makefile.am
@@ -173,7 +173,9 @@ libmkv_plugin_la_SOURCES = \
 	demux/mkv/matroska_segment.hpp demux/mkv/matroska_segment.cpp \
 	demux/mkv/matroska_segment_parse.cpp \
 	demux/mkv/demux.hpp demux/mkv/demux.cpp \
+	demux/mkv/dispatcher.hpp \
 	demux/mkv/Ebml_parser.hpp demux/mkv/Ebml_parser.cpp \
+	demux/mkv/Ebml_dispatcher.hpp \
 	demux/mkv/chapters.hpp demux/mkv/chapters.cpp \
 	demux/mkv/chapter_command.hpp demux/mkv/chapter_command.cpp \
 	demux/mkv/stream_io_callback.hpp demux/mkv/stream_io_callback.cpp \
diff --git a/modules/demux/mkv/Ebml_dispatcher.hpp b/modules/demux/mkv/Ebml_dispatcher.hpp
new file mode 100644
index 0000000..bb0c56a
--- /dev/null
+++ b/modules/demux/mkv/Ebml_dispatcher.hpp
@@ -0,0 +1,161 @@
+/*****************************************************************************
+ * Ebml_dispatcher.hpp : matroska demuxer
+ *****************************************************************************
+ * Copyright (C) 2016 Videolabs
+ * $Id$
+ *
+ * Authors: Filip Roseen <filip at videolabs.io>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU Lesser General Public License as published by
+ * the Free Software Foundation; either version 2.1 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.
+ *****************************************************************************/
+#ifndef VLC_MKV_EBML_DISPATCHER_HPP_
+#define VLC_MKV_EBML_DISPATCHER_HPP_
+
+#include "dispatcher.hpp"
+
+#include "ebml/EbmlElement.h"
+#include "ebml/EbmlId.h"
+
+#include <vlc_threads.h>
+
+#include <algorithm>
+#include <typeinfo>
+#include <vector>
+
+namespace {
+  namespace detail {
+    template<class T>
+    static std::type_info const* typeid_ptr () {
+      static std::type_info const& ti = typeid (T);
+      return &ti;
+    }
+  }
+
+  struct EbmlProcessorEntry {
+    typedef void (*EbmlProcessor) (EbmlElement*, void*);
+
+    EbmlId         const* p_ebmlid;
+    std::type_info const* p_typeid;
+    EbmlProcessor         callback;
+
+    EbmlProcessorEntry (EbmlId const& id, std::type_info const* ti, EbmlProcessor cb)
+      : p_ebmlid (&id), p_typeid (ti), callback (cb)
+    { }
+  };
+
+  struct ProcessorEntrySorter {
+    typedef EbmlProcessorEntry value_type;
+
+    bool operator() (value_type const& lhs, value_type const& rhs) const {
+      EbmlId const& lid = *lhs.p_ebmlid;
+      EbmlId const& rid = *rhs.p_ebmlid;
+
+      return lid.GetLength() < rid.GetLength() || (
+        !( rid.GetLength() < lid.GetLength() ) && lid.GetValue() < rid.GetValue()
+      );
+    }
+  };
+
+  class EbmlTypeDispatcher : public Dispatcher<EbmlTypeDispatcher, EbmlProcessorEntry::EbmlProcessor> {
+    protected:
+      typedef std::vector<EbmlProcessorEntry> ProcessorContainer;
+
+    public:
+      void insert (EbmlProcessorEntry const& data) {
+        _processors.push_back (data);
+      }
+
+      void on_create () {
+        std::sort (_processors.begin(), _processors.end(), _ebml_sorter);
+      }
+
+      bool send (EbmlElement * const& element, void* payload) const
+      {
+        EbmlProcessorEntry eb = EbmlProcessorEntry (
+          static_cast<EbmlId const&> (*element), NULL, NULL
+        );
+
+        // --------------------------------------------------------------
+        // Find the appropriate callback for the received EbmlElement
+        // --------------------------------------------------------------
+
+        ProcessorContainer::const_iterator cit_end = _processors.end();
+        ProcessorContainer::const_iterator cit     = std::lower_bound (
+            _processors.begin(), cit_end, eb, _ebml_sorter
+        );
+
+        if (element && cit != cit_end)
+        {
+          // --------------------------------------------------------------
+          // normally we only need to compare the addresses of the EbmlId
+          // since libebml returns a reference to a _static_ instance.
+          // --------------------------------------------------------------
+
+          while (cit != cit_end && (cit->p_ebmlid == eb.p_ebmlid || (*cit->p_ebmlid == *eb.p_ebmlid))) {
+            std::type_info const& ti = typeid (*element);
+
+            // --------------------------------------------------------------
+            // even though the EbmlId are equivalent, we still need to make
+            // sure that the typeid also matches.
+            // --------------------------------------------------------------
+
+            if (*(cit->p_typeid) == ti) {
+              cit->callback (element, payload);
+              return true;
+            }
+
+            ++cit;
+          }
+        }
+
+        if (_default_handler == NULL)
+            return false;
+
+        _default_handler (element, payload);
+        return true;
+      }
+
+    public:
+      ProcessorContainer           _processors;
+      static ProcessorEntrySorter _ebml_sorter;
+  };
+
+} /* end-of-namespace */
+
+#define EBML_ELEMENT_CASE_DEF(EbmlType_, ClassName_, VariableName_, InitializationExpr_) \
+    MKV_SWITCH_CASE_DEFINITION( ClassName_, EbmlType_, EbmlElement*, VariableName_, vars,          \
+      InitializationExpr_, static_cast<EbmlType_&> (*data)                            \
+    )
+
+// -----------------------------------------------------------------------------------
+// The use of `detail::typeid_ptr` below is so that we do not have to invoke "typeid"
+// every time we are requested to do a lookup. `std::type_info` cannot be copied, so
+// we cannot pass it by value.
+//
+// In C++11 you could use the hash value present inside std::type_info to do lookup,
+// but we are stuck in C++03 and have to use the below instead.
+// -----------------------------------------------------------------------------------
+
+#define E_CASE(EbmlType_, VariableName_)            \
+    EBML_ELEMENT_CASE_DEF(EbmlType_, EbmlType_, VariableName_, \
+      (dispatcher.insert( EbmlProcessorEntry( EbmlType_ ::ClassInfos.ClassId(), detail::typeid_ptr<EbmlType_>(), &EbmlType_ ## _callback) ) ) \
+    )
+
+#define E_CASE_DEFAULT(VariableName_)                    \
+    EBML_ELEMENT_CASE_DEF(EbmlElement, ebml_default, VariableName_, \
+      dispatcher.set_default_handler (&ebml_default_callback)    \
+    )
+
+#endif
diff --git a/modules/demux/mkv/dispatcher.hpp b/modules/demux/mkv/dispatcher.hpp
new file mode 100644
index 0000000..2f6169e
--- /dev/null
+++ b/modules/demux/mkv/dispatcher.hpp
@@ -0,0 +1,150 @@
+/*****************************************************************************
+ * dispatcher.hpp : matroska demuxer
+ *****************************************************************************
+ * Copyright (C) 2016 Videolabs
+ * $Id$
+ *
+ * Authors: Filip Roseen <filip at videolabs.io>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU Lesser General Public License as published by
+ * the Free Software Foundation; either version 2.1 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.
+ *****************************************************************************/
+#ifndef VLC_MKV_DISPATCHER_HPP_
+#define VLC_MKV_DISPATCHER_HPP_
+
+// ----------------------------------------------------------------------------
+// This header contains helpers to simulate lambdas in C++03.
+//
+// It will be used to create "dispatchers" which can be thought of like a
+// `switch` but only more dynamic.
+// ----------------------------------------------------------------------------
+
+namespace {
+
+  template<class Impl, class Processor>
+  class Dispatcher {
+    protected:
+      Dispatcher() : _default_handler (NULL) { }
+    public:
+      template<class It>
+      void iterate (It beg, It end, void* const& payload) const {
+        for (; beg != end; ++beg)
+          static_cast<Impl const*> (this)->Impl::send (*beg, payload);
+      }
+
+      void set_default_handler (Processor const& callback) {
+        _default_handler = callback;
+      }
+
+      void on_create () {
+        /* empty default implementation */
+      }
+
+      Processor _default_handler;
+  };
+
+  struct DispatcherTag { };
+
+  template<DispatcherTag*, class DispatcherType>
+  class DispatchContainer {
+    public:    static DispatcherType dispatcher;
+    protected: static vlc_mutex_t   _dispatcher_lock;
+  };
+
+  template<DispatcherTag* P, class DT>
+  DT DispatchContainer<P, DT>::dispatcher;
+
+  template<DispatcherTag* P, class DT>
+  vlc_mutex_t DispatchContainer<P, DT>::_dispatcher_lock = VLC_STATIC_MUTEX;
+}
+
+// ----------------------------------------------------------------------------
+//   * `GroupName_ ## _tag` is used so that we can refer to a static dispatcher
+//      of the correct type without instantiating DispatchContainer with a
+//      locally declared type (since it is illegal in C++03).
+//
+//      We are however allowed to pass the variable of a variable declared
+//      `extern`, and this will effectively be our handle to the "foreign"
+//      static dispatcher.
+//
+//   *  `GroupName_ ## _base` is used to declare a bunch of helpers; names that
+//       must be available in our fake "lambdas" (C++03 is a pain).
+// ----------------------------------------------------------------------------
+
+#define MKV_SWITCH_CREATE(DispatchType_, GroupName_, PayloadType_) \
+  extern DispatcherTag GroupName_ ## _tag;                         \
+  struct GroupName_ ## _base : DispatchContainer<&GroupName_ ## _tag, DispatchType_> { \
+      typedef      PayloadType_ payload_t;                         \
+      typedef     DispatchType_ dispatch_t;                        \
+      typedef struct GroupName_ handler_t;                         \
+      static void* Payload (payload_t& data) {                     \
+          return static_cast<void*> (&data);                       \
+      }                                                            \
+  };                                                               \
+  struct GroupName_ : GroupName_ ## _base
+
+// ----------------------------------------------------------------------------
+//   * `Dispatcher` is a static function used to access the dispatcher in a
+//      thread-safe manner. We only want _one_ thread to actually construct
+//      and initialize it, hence the lock.
+// ----------------------------------------------------------------------------
+
+#define MKV_SWITCH_INIT()                     \
+  static dispatch_t& Dispatcher () {          \
+      static handler_t * p_handler = NULL;    \
+      vlc_mutex_lock( &_dispatcher_lock );    \
+      if (unlikely( p_handler == NULL) ) {    \
+          static handler_t handler;           \
+          p_handler = &handler;               \
+          p_handler->dispatcher.on_create (); \
+      }                                       \
+      vlc_mutex_unlock( &_dispatcher_lock );  \
+      return p_handler->dispatcher;           \
+  } struct PleaseAddSemicolon {}
+
+// ----------------------------------------------------------------------------
+//   * The following is to be used inside `struct GroupName_`, effectivelly
+//     declaring a local struct and a data-member, `ClassName_ ## _processor`,
+//     of that type.
+//
+//     When the data-member is constructed it will run `InitializationExpr_`,
+//     meaning that it can access the static dispatcher and register itself (or
+//     whatever is desired).
+//
+//   * Since we need to do type-erasure, once again, because of the fact that
+//     C++03 does not support locally declared types as template-arguments, we
+//     declare a static function `ClassName_ ## _callback` that will cast our
+//     payload from `void*` to the appropriate type.
+//
+//   * The body of `ClassName_ ## _handler` will be written by the user of the
+//     MACRO, and the implementation will effectively be invoked whenever the
+//     dispatcher decides to (through `ClassName_ ## _callback`).
+//
+//   * Since the type of the `VariableName_` argument to `ClassName_ ## _handler`
+//     might not necessarily be the same as the type passed from the dispatcher
+//     (because of the type-erasure), the macro provides a way to change the
+//     type with a `UnwrapExpr_` (see the function call within `ClassName_ ##
+//     _callback`).
+// ----------------------------------------------------------------------------
+
+#define MKV_SWITCH_CASE_DEFINITION(ClassName_, RealType_, Type_, VariableName_, PayloadName_, InitializationExpr_, UnwrapExpr_) \
+  struct ClassName_ ## _processor {                                             \
+    ClassName_ ## _processor () { InitializationExpr_; }                        \
+  } ClassName_ ## __wrapper;                                                    \
+  static inline void ClassName_ ## _callback (Type_ data, void* PayloadName_) { \
+    ClassName_ ## _handler (UnwrapExpr_, *static_cast<payload_t*> (vars));      \
+  }                                                                             \
+  static inline void ClassName_ ## _handler (RealType_& VariableName_, payload_t& PayloadName_)
+
+#endif
-- 
2.7.2



More information about the vlc-devel mailing list