[vlc-devel] [RFC, PATCH 1/6] demux/mkv: Added EbmlTypeDispatcher (including changes to Makefile.am)
Hugo Beauzée-Luyssen
hugo at beauzee.fr
Thu Mar 3 17:39:01 CET 2016
Hi,
On 03/03/2016 01:30 PM, Filip Roséen wrote:
> The EbmlTypeDispatcher is a type used to "dispatch" an object of type
> EbmlElement* to an appropriate callback depending on the dynamic type of
> the referred to object.
>
> It can effectivelly replace the massive if-else branches found
> throughout the module by not only making the code easier to understand
> and maintain, but also by making it _a lot_ faster (benchmarks show a
> speed increase between 450 and 700% in terms of lookup).
>
> --
>
> -----------------------------------------------------------------------
> dispatcher.hpp
> -----------------------------------------------------------------------
>
> This header declares a bunch of helpers to easily interact and register
> callbacks to a certain dispatcher (such as EbmlTypeDispatcher), as well
> as thread-safely creating and initializing such dispatcher.
>
> Since this module is stuck in C++03 land, there are some quirks in the
> language specification that does not allow us to write as clean code as
> we would want.
>
> In C++11 we could use a lookup-table filled with lambdas, but since the
> feature (obviously) is not available in C++03 we have two alternatives.
>
> - Write the function declaration + definition by itself, and then
> manually register the function in the appropriate instance of the
> dispatcher type.
>
> - Introduce MACROs that does this automatically through some clever
> loop holes in the language (that are required to exists according to
> the ISO/IEC 14882:2003(E) C++ Standard).
>
> `dispatcher.hpp` choose the latter route, because the first alternative
> is even more ugly than a massive if-else tree (and very error prone)--
> plus:
>
> - The functions would be distant by definition so by looking at the
> code responsible for registering the callback will not tell you what
> will actually be executed.
>
> - Implement the function but forget to include it in the dispatcher
> will yield undesirable behavior.
>
> - Removing the code responsible for registrering the callback will
> create lingering function definitions, which will affect
> maintenance.
>
> - Introducing two handlers for the same type would not yield a compile
> time diagnostic.
>
> The MACRO approach (even though I try to avoid MACROs at all cost)
> solves all of the issues listed above, and more.
>
> See the comments in `module/mkv/demux/dispatcher.hpp` for more
> information regarding its contents.
>
> -----------------------------------------------------------------------
> Ebml_dispatcher.hpp
> -----------------------------------------------------------------------
>
> This header makes use of the functionality presented by
> `dispatcher.hpp`, and effectivelly allows a developer to write code as
> the following.
>
> struct EbmlPayload {
> matroska_segment_c * const parent;
> mkv_track_t * const p_tk;
> unsigned int level;
> } captures = { this, tk, 3 };
>
> MKV_SWITCH_CREATE( EbmlTypeDispatcher, EbmlHandler, EbmlPayload, vars )
> {
> MKV_SWITCH_INIT();
>
> /* ... */
>
> E_CASE( KaxSeekPreRoll, spr ) {
> vars.tk->i_seek_preroll = uint64_t( spr ) / 1000;
> }
>
> E_CASE( KaxTrackAudio, tka) {
> vars.p_tk->fmt.audio.i_channels = 1;
> vars.p_tk->fmt.audio_i_rate = 8000;
>
> vars.level += 1;
>
> dispatcher.iterate( tka.begin(), tka.end(), Payload( vars ));
>
> vars.level -= 1;
> }
>
> E_CASE( KaxAudioSamplingFreq, afreq ) {
> vars.p_tk->i_original_rate = vars.p_tk->fmt.audio.i_rate = (int)float( afreq );
> }
>
> E_CASE( KaxAudioOutputSamplingFreq, afreq ) {
> vars.p_tk->fmt.audio.i_rate = (int)float( afreq );
> }
>
> E_CASE( KaxAudioChannels, achan ) {
> vars.p_tk->fmt.audio.i_channels = uint8( chan );
> }
>
> E_CASE( KaxAudioBitDepth, abits ) {
> vars.p_tk->fmt.audio.i_bitspersample = uint8( abits );
> }
>
> E_CASE( EbmlVoid, ) { // ignore
> VLC_UNUSED( vars );
> }
>
> E_CASE_DEFAULT(element) {
> MkvTree( *vars.p_demuxer, vars.level, "Unknown (%s)", typeid(element).name() );
> }
> };
>
> EbmlHandler::Dispatcher().iterate(
> m->begin(), m->end(), EbmlHandler::Payload( captures )
> );
>
> Effictivelly replacing the old massive nested if-else trees, such as the
> below taken from `matroska_segment_parse.cpp`.
>
> for( size_t i = 0; i < m->ListSize(); i++ )
> {
> EbmlElement *l = (*m)[i];
>
> if /* ... */
> else if( MKV_IS_ID( l, KaxSeekPreRoll ) )
> {
> KaxSeekPreRoll &spr = *(KaxSeekPreRoll*)l;
> tk->i_seek_preroll = uint64_t(spr) / 1000;
> }
> else if( MKV_IS_ID( l, KaxTrackAudio ) )
> {
> EbmlMaster *tka = static_cast<EbmlMaster*>(l);
>
> /* Initialize default values */
> tk->fmt.audio.i_channels = 1;
> tk->fmt.audio.i_rate = 8000;
>
> for( unsigned int j = 0; j < tka->ListSize(); j++ )
> {
> EbmlElement *l = (*tka)[j];
>
> if( MKV_IS_ID( l, KaxAudioSamplingFreq ) )
> {
> KaxAudioSamplingFreq &afreq = *(KaxAudioSamplingFreq*)l;
>
> tk->i_original_rate = tk->fmt.audio.i_rate = (int)float( afreq );
> }
> else if( MKV_IS_ID( l, KaxAudioOutputSamplingFreq ) )
> {
> KaxAudioOutputSamplingFreq &afreq = *(KaxAudioOutputSamplingFreq*)l;
>
> tk->fmt.audio.i_rate = (int)float( afreq );
> }
> else if( MKV_IS_ID( l, KaxAudioChannels ) )
> {
> KaxAudioChannels &achan = *(KaxAudioChannels*)l;
>
> tk->fmt.audio.i_channels = uint8( achan );
> msg_Dbg( &sys.demuxer, "| | | | + achan=%u", uint8( achan ) );
> }
> else if( MKV_IS_ID( l, KaxAudioBitDepth ) )
> {
> KaxAudioBitDepth &abits = *(KaxAudioBitDepth*)l;
>
> tk->fmt.audio.i_bitspersample = uint8( abits );
> }
> else if ( !MKV_IS_ID( l, EbmlVoid ) )
> {
> MkvTree( sys.demuxer, 4, "Unknown (%s)", typeid(*l).name() );
> }
> }
> }
> else if ( !MKV_IS_ID( l, EbmlVoid ) )
> {
> MkvTree( sys.demuxer, 3, "Unknown (%s)", typeid(*l).name() );
> }
> }
>
> -----------------------------------------------------------------------
> How does it work?
> -----------------------------------------------------------------------
>
> The MACROs will result in a local struct (`EbmlHandler` in the previous
> example) being declared, that has each of the case-handlers inside of
> it.
>
> This type will have a static function named `Dispatcher` that will yield
> a constant reference to the desired dispatcher.
>
> If this static instance of the dispatcher (one dispatcher per
> MKV_SWITCH_CREATE) has not yet been initialized, all of the
> case-handlers will have their constructor executed effectivelly
> making themselves available to the dispatcher.
>
> If the static instance has already been constructed, we simply return a
> reference to that one.
>
> The comments in `dispatcher.hpp` and `Ebml_dispatcher.hpp` should be
> explain things in more detail.
>
> -----------------------------------------------------------------------
> Is it faster?
> -----------------------------------------------------------------------
>
> The benchmarks shows that using the EbmlTypeDispatcher increases the
> performance of the lookup (since it is using a lookup table instead of
> if-else) by about 450% - 700% (sometimes more).
>
> -----------------------------------------------------------------------
> Will it ease code maintenance?
> -----------------------------------------------------------------------
>
> Hopefully.
>
> -----------------------------------------------------------------------
> Why not put everything in `Ebml_dispatcher.hpp`?
> -----------------------------------------------------------------------
>
> There are other massive if-else trees besides those related to
> EbmlElement, with a separate "base" it will be easy to implement other
> type of dispatchers.
>
> -----------------------------------------------------------------------
>
> Feedback is more than welcome, thanks!
>
> Yours truly,
> Filip Roséen
> ---
> modules/demux/Makefile.am | 2 +
> modules/demux/mkv/Ebml_dispatcher.hpp | 161 ++++++++++++++++++++++++++++++++++
> modules/demux/mkv/dispatcher.hpp | 150 +++++++++++++++++++++++++++++++
> 3 files changed, 313 insertions(+)
> create mode 100644 modules/demux/mkv/Ebml_dispatcher.hpp
> create mode 100644 modules/demux/mkv/dispatcher.hpp
>
> diff --git a/modules/demux/Makefile.am b/modules/demux/Makefile.am
> index 11518fe..caf298c 100644
> --- a/modules/demux/Makefile.am
> +++ b/modules/demux/Makefile.am
> @@ -173,7 +173,9 @@ libmkv_plugin_la_SOURCES = \
> demux/mkv/matroska_segment.hpp demux/mkv/matroska_segment.cpp \
> demux/mkv/matroska_segment_parse.cpp \
> demux/mkv/demux.hpp demux/mkv/demux.cpp \
> + demux/mkv/dispatcher.hpp \
> demux/mkv/Ebml_parser.hpp demux/mkv/Ebml_parser.cpp \
> + demux/mkv/Ebml_dispatcher.hpp \
> demux/mkv/chapters.hpp demux/mkv/chapters.cpp \
> demux/mkv/chapter_command.hpp demux/mkv/chapter_command.cpp \
> demux/mkv/stream_io_callback.hpp demux/mkv/stream_io_callback.cpp \
> diff --git a/modules/demux/mkv/Ebml_dispatcher.hpp b/modules/demux/mkv/Ebml_dispatcher.hpp
> new file mode 100644
> index 0000000..bb0c56a
> --- /dev/null
> +++ b/modules/demux/mkv/Ebml_dispatcher.hpp
> @@ -0,0 +1,161 @@
> +/*****************************************************************************
> + * Ebml_dispatcher.hpp : matroska demuxer
> + *****************************************************************************
> + * Copyright (C) 2016 Videolabs
> + * $Id$
> + *
> + * Authors: Filip Roseen <filip at videolabs.io>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License as published by
> + * the Free Software Foundation; either version 2.1 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public License
> + * along with this program; if not, write to the Free Software Foundation,
> + * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.
> + *****************************************************************************/
> +#ifndef VLC_MKV_EBML_DISPATCHER_HPP_
> +#define VLC_MKV_EBML_DISPATCHER_HPP_
> +
> +#include "dispatcher.hpp"
> +
> +#include "ebml/EbmlElement.h"
> +#include "ebml/EbmlId.h"
> +
> +#include <vlc_threads.h>
> +
> +#include <algorithm>
> +#include <typeinfo>
> +#include <vector>
> +
> +namespace {
> + namespace detail {
> + template<class T>
> + static std::type_info const* typeid_ptr () {
> + static std::type_info const& ti = typeid (T);
> + return &ti;
> + }
> + }
> +
Is the anonymous namespace needed/useful in a header file?
And doesn't it imply "static" linkage for the functions it contains?
> + struct EbmlProcessorEntry {
> + typedef void (*EbmlProcessor) (EbmlElement*, void*);
> +
> + EbmlId const* p_ebmlid;
> + std::type_info const* p_typeid;
> + EbmlProcessor callback;
> +
> + EbmlProcessorEntry (EbmlId const& id, std::type_info const* ti, EbmlProcessor cb)
> + : p_ebmlid (&id), p_typeid (ti), callback (cb)
> + { }
> + };
> +
> + struct ProcessorEntrySorter {
> + typedef EbmlProcessorEntry value_type;
> +
> + bool operator() (value_type const& lhs, value_type const& rhs) const {
> + EbmlId const& lid = *lhs.p_ebmlid;
> + EbmlId const& rid = *rhs.p_ebmlid;
> +
> + return lid.GetLength() < rid.GetLength() || (
> + !( rid.GetLength() < lid.GetLength() ) && lid.GetValue() < rid.GetValue()
> + );
> + }
> + };
> +
> + class EbmlTypeDispatcher : public Dispatcher<EbmlTypeDispatcher, EbmlProcessorEntry::EbmlProcessor> {
> + protected:
> + typedef std::vector<EbmlProcessorEntry> ProcessorContainer;
> +
> + public:
> + void insert (EbmlProcessorEntry const& data) {
> + _processors.push_back (data);
> + }
> +
> + void on_create () {
> + std::sort (_processors.begin(), _processors.end(), _ebml_sorter);
> + }
> +
> + bool send (EbmlElement * const& element, void* payload) const
> + {
> + EbmlProcessorEntry eb = EbmlProcessorEntry (
> + static_cast<EbmlId const&> (*element), NULL, NULL
> + );
> +
> + // --------------------------------------------------------------
> + // Find the appropriate callback for the received EbmlElement
> + // --------------------------------------------------------------
> +
> + ProcessorContainer::const_iterator cit_end = _processors.end();
> + ProcessorContainer::const_iterator cit = std::lower_bound (
> + _processors.begin(), cit_end, eb, _ebml_sorter
> + );
> +
> + if (element && cit != cit_end)
> + {
> + // --------------------------------------------------------------
> + // normally we only need to compare the addresses of the EbmlId
> + // since libebml returns a reference to a _static_ instance.
> + // --------------------------------------------------------------
> +
> + while (cit != cit_end && (cit->p_ebmlid == eb.p_ebmlid || (*cit->p_ebmlid == *eb.p_ebmlid))) {
> + std::type_info const& ti = typeid (*element);
> +
> + // --------------------------------------------------------------
> + // even though the EbmlId are equivalent, we still need to make
> + // sure that the typeid also matches.
> + // --------------------------------------------------------------
> +
> + if (*(cit->p_typeid) == ti) {
> + cit->callback (element, payload);
> + return true;
> + }
> +
> + ++cit;
> + }
> + }
> +
> + if (_default_handler == NULL)
> + return false;
> +
> + _default_handler (element, payload);
> + return true;
> + }
> +
> + public:
> + ProcessorContainer _processors;
> + static ProcessorEntrySorter _ebml_sorter;
> + };
> +
> +} /* end-of-namespace */
> +
> +#define EBML_ELEMENT_CASE_DEF(EbmlType_, ClassName_, VariableName_, InitializationExpr_) \
> + MKV_SWITCH_CASE_DEFINITION( ClassName_, EbmlType_, EbmlElement*, VariableName_, vars, \
> + InitializationExpr_, static_cast<EbmlType_&> (*data) \
> + )
> +
> +// -----------------------------------------------------------------------------------
> +// The use of `detail::typeid_ptr` below is so that we do not have to invoke "typeid"
> +// every time we are requested to do a lookup. `std::type_info` cannot be copied, so
> +// we cannot pass it by value.
> +//
> +// In C++11 you could use the hash value present inside std::type_info to do lookup,
> +// but we are stuck in C++03 and have to use the below instead.
> +// -----------------------------------------------------------------------------------
> +
> +#define E_CASE(EbmlType_, VariableName_) \
> + EBML_ELEMENT_CASE_DEF(EbmlType_, EbmlType_, VariableName_, \
> + (dispatcher.insert( EbmlProcessorEntry( EbmlType_ ::ClassInfos.ClassId(), detail::typeid_ptr<EbmlType_>(), &EbmlType_ ## _callback) ) ) \
> + )
> +
> +#define E_CASE_DEFAULT(VariableName_) \
> + EBML_ELEMENT_CASE_DEF(EbmlElement, ebml_default, VariableName_, \
> + dispatcher.set_default_handler (&ebml_default_callback) \
> + )
> +
> +#endif
> diff --git a/modules/demux/mkv/dispatcher.hpp b/modules/demux/mkv/dispatcher.hpp
> new file mode 100644
> index 0000000..2f6169e
> --- /dev/null
> +++ b/modules/demux/mkv/dispatcher.hpp
> @@ -0,0 +1,150 @@
> +/*****************************************************************************
> + * dispatcher.hpp : matroska demuxer
> + *****************************************************************************
> + * Copyright (C) 2016 Videolabs
> + * $Id$
> + *
> + * Authors: Filip Roseen <filip at videolabs.io>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License as published by
> + * the Free Software Foundation; either version 2.1 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public License
> + * along with this program; if not, write to the Free Software Foundation,
> + * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.
> + *****************************************************************************/
> +#ifndef VLC_MKV_DISPATCHER_HPP_
> +#define VLC_MKV_DISPATCHER_HPP_
> +
> +// ----------------------------------------------------------------------------
> +// This header contains helpers to simulate lambdas in C++03.
> +//
> +// It will be used to create "dispatchers" which can be thought of like a
> +// `switch` but only more dynamic.
> +// ----------------------------------------------------------------------------
> +
> +namespace {
> +
> + template<class Impl, class Processor>
> + class Dispatcher {
> + protected:
> + Dispatcher() : _default_handler (NULL) { }
For the sake of nitpicking, wouldn't it be "more generic" to default
construct _default_handler, so that is doesn't have to be a pointer type?
> + public:
> + template<class It>
> + void iterate (It beg, It end, void* const& payload) const {
> + for (; beg != end; ++beg)
> + static_cast<Impl const*> (this)->Impl::send (*beg, payload);
> + }
> +
> + void set_default_handler (Processor const& callback) {
> + _default_handler = callback;
> + }
> +
> + void on_create () {
> + /* empty default implementation */
> + }
> +
> + Processor _default_handler;
> + };
> +
> + struct DispatcherTag { };
> +
> + template<DispatcherTag*, class DispatcherType>
> + class DispatchContainer {
> + public: static DispatcherType dispatcher;
> + protected: static vlc_mutex_t _dispatcher_lock;
> + };
> +
> + template<DispatcherTag* P, class DT>
> + DT DispatchContainer<P, DT>::dispatcher;
> +
> + template<DispatcherTag* P, class DT>
> + vlc_mutex_t DispatchContainer<P, DT>::_dispatcher_lock = VLC_STATIC_MUTEX;
> +}
> +
> +// ----------------------------------------------------------------------------
> +// * `GroupName_ ## _tag` is used so that we can refer to a static dispatcher
> +// of the correct type without instantiating DispatchContainer with a
> +// locally declared type (since it is illegal in C++03).
> +//
> +// We are however allowed to pass the variable of a variable declared
> +// `extern`, and this will effectively be our handle to the "foreign"
> +// static dispatcher.
> +//
> +// * `GroupName_ ## _base` is used to declare a bunch of helpers; names that
> +// must be available in our fake "lambdas" (C++03 is a pain).
> +// ----------------------------------------------------------------------------
> +
> +#define MKV_SWITCH_CREATE(DispatchType_, GroupName_, PayloadType_) \
> + extern DispatcherTag GroupName_ ## _tag; \
> + struct GroupName_ ## _base : DispatchContainer<&GroupName_ ## _tag, DispatchType_> { \
> + typedef PayloadType_ payload_t; \
> + typedef DispatchType_ dispatch_t; \
> + typedef struct GroupName_ handler_t; \
> + static void* Payload (payload_t& data) { \
> + return static_cast<void*> (&data); \
Honest question: isn't reinterpret_cast more suited for cast to/from
void* ? It feels to me like there are basically 2 religions, but I'm
interested in your point of view on the matter :)
> + } \
> + }; \
> + struct GroupName_ : GroupName_ ## _base
> +
> +// ----------------------------------------------------------------------------
> +// * `Dispatcher` is a static function used to access the dispatcher in a
> +// thread-safe manner. We only want _one_ thread to actually construct
> +// and initialize it, hence the lock.
> +// ----------------------------------------------------------------------------
> +
> +#define MKV_SWITCH_INIT() \
> + static dispatch_t& Dispatcher () { \
> + static handler_t * p_handler = NULL; \
> + vlc_mutex_lock( &_dispatcher_lock ); \
There is a vlc_mutex_locker helper to simulate std::lock_guard you could
use here.
> + if (unlikely( p_handler == NULL) ) { \
> + static handler_t handler; \
> + p_handler = &handler; \
> + p_handler->dispatcher.on_create (); \
> + } \
> + vlc_mutex_unlock( &_dispatcher_lock ); \
> + return p_handler->dispatcher; \
> + } struct PleaseAddSemicolon {}
Also, would it make sense to have another implementation when C++11 is
enabled that takes advantage of the new memory model by only
initializing a static handler_t instead of a pointer, and getting rid of
the lock?
> +
> +// ----------------------------------------------------------------------------
> +// * The following is to be used inside `struct GroupName_`, effectivelly
> +// declaring a local struct and a data-member, `ClassName_ ## _processor`,
> +// of that type.
> +//
> +// When the data-member is constructed it will run `InitializationExpr_`,
> +// meaning that it can access the static dispatcher and register itself (or
> +// whatever is desired).
> +//
> +// * Since we need to do type-erasure, once again, because of the fact that
> +// C++03 does not support locally declared types as template-arguments, we
> +// declare a static function `ClassName_ ## _callback` that will cast our
> +// payload from `void*` to the appropriate type.
> +//
> +// * The body of `ClassName_ ## _handler` will be written by the user of the
> +// MACRO, and the implementation will effectively be invoked whenever the
> +// dispatcher decides to (through `ClassName_ ## _callback`).
> +//
> +// * Since the type of the `VariableName_` argument to `ClassName_ ## _handler`
> +// might not necessarily be the same as the type passed from the dispatcher
> +// (because of the type-erasure), the macro provides a way to change the
> +// type with a `UnwrapExpr_` (see the function call within `ClassName_ ##
> +// _callback`).
> +// ----------------------------------------------------------------------------
> +
> +#define MKV_SWITCH_CASE_DEFINITION(ClassName_, RealType_, Type_, VariableName_, PayloadName_, InitializationExpr_, UnwrapExpr_) \
> + struct ClassName_ ## _processor { \
> + ClassName_ ## _processor () { InitializationExpr_; } \
> + } ClassName_ ## __wrapper; \
> + static inline void ClassName_ ## _callback (Type_ data, void* PayloadName_) { \
> + ClassName_ ## _handler (UnwrapExpr_, *static_cast<payload_t*> (vars)); \
> + } \
> + static inline void ClassName_ ## _handler (RealType_& VariableName_, payload_t& PayloadName_)
> +
> +#endif
>
I don't think this addresses the logic of the patch, which I still have
a hard time to wrap my head around :)
I'll probably give it another go soonish
Regards,
More information about the vlc-devel
mailing list