[vlc-devel] [RFC 0/2] New executor API

Fri Aug 28 18:21:22 CEST 2020

Hi,

On Fri, Aug 28, 2020 at 05:31:02PM +0200, Romain Vimont wrote:
> On Fri, Aug 28, 2020 at 04:51:35PM +0300, Rémi Denis-Courmont wrote:
> > Le perjantaina 28. elokuuta 2020, 11.25.16 EEST Alexandre Janniaux a écrit :
> > > > > > Besides, timing out is an error that deserves logging. Interrupt is
> > > > > > not.
> > > > > > So using the same callback for both purpose looks dubious to me.
> > > > >
> > > > > The goal is simplicity.
> > > >
> > > > By making things more complicated. Yeah right...
> > >
> > > I agree with Remi, if you want simplicity, the best thing
> > > would be to limit the asynchronous framework to the
> > > asynchronous execution and signalling like almost every other
> > > asynchronous framework.
> > >
> > > Especially since youǘe talked about some `future` object, and
> > > thus a pro-actor model which doesn't have ways to do
> > > signalling from the outside of the asynchronous framework to
> > > the task by itself in almost all implementation, and instead
> > > have each layer of promise being able to return either a result
> > > or an error that the future will catch, and an internal
> > > composition rule for each layer's result.
> > >
> > > In the case the vlc_executor is killed, the preparser client
> > > needs to be killed first since it is using the executor, and
> > > thus can cancel its tasks internally directly, which might be
> > > at the middle of the preprocessing, at the beginning or right
> > > before the end. It's much less convoluted than having the
> > > preparser client use the vlc_executor to cancel the taks, which
> > > will just really call the preparser specific cancellation code
> > > anyway.
> >
> > Yes exactly. All the complexity comes from the very disagreeable decision to
> > have the runner rather than the owner keep track of the tasks. If the owner
> > keeps track of its tasks, it can interrupt them (due to stop or time-out)
> > directly, and the tasks can notify their complexition directly as well.
>
> That's more or less what I quickly implemented on:
> https://code.videolan.org/rom1v/vlc/commits/executor.18
> https://mailman.videolan.org/pipermail/vlc-devel/2020-August/136784.html
>
> (the on_finished() callback could also be removed)
>
> The problem is that clients have to do a lot of (duplicate) work on
> their own. The initial purpose was to replace background worker. A
> minimal executor like this would only provide a small part of what
> preparser/thumbnailer need.
>
> So if we do this, either we also implement a higher-level tool which
> simplifies timeout and cancelation (vlc_higherlevel_executor_t ;)) or
> every client just reimplement the whole timeout and cancelation on their
> own. :/ (it's less helpful than background worker)

I don't know Remi's stance, but if the goal is to simplify
the background worker code, it's probably not to bring each
and every features it had ad-hoc in the new code.

I'm not against a higher-level tool if it proves to be
needed but again I don't understand the «must reimplement
the whole timeout and cancellation on their own» point
since you need to implement it anyway in both case for
the task.

> > It's not the first time that somebody implements a background runner in open-
> > source C code, e.g. <linux/workqueue.h>. All you need is:
> > - allocate runner,
> > - queue runnable to the runner,
> > - dequeue (cancel) runnable,
> > - wait for quiescent state of runnable,
> > - wait for quiescent state of whole runner,
> > - destroy runner.
> >
> > Only the first function can fail and needs error handling. A runnable is just a
> > single callback and either a pointer to itself or an opaque data pointer.
>
> The executor must be able to queue runnables anyway (we want to be able
> to queue 100 preparsing requests even if there are 4 threads). So if the
> caller just provides a runnable (a run callback) and a userdata pointer,
> then the executor must be able to allocate something on _Submit() (so it
> may fail) or limit the queue size (so it may also fail). Unless the
> client also provides a structure with some data private to the executor
> (e.g. a vlc_list node), but that looks ugly.

The allocation could be done on the client side, potentially with
a private data part, before being queued to the executor. Then the
allocation can be expected to fail and signals the developer it
must do the usual required checks.

Something like,

    struct vlc_task * task =
        vlc_task_Create(sizeof private_data, runner_function);

Or maybe private data can just be attached as a void* anyway.

> Moreover, if we don't want to spawn all the max_threads threads on
> creation, we would like to spawn a thread as necessary on _Submit() (so
> that may fail). IMO, it's not a problem that _Submit() returns an error
> code.

I'm not sure there is a reason for which you would like to
avoid spawning the thread at creation, since you're probably
not supposed to spawn an executor and do nothing with it.

By having only the smallest number of thread created for the
current tasks, you increase the latency of startup, potentially
for each task if the asynchronous code doesn't pipeline correctly.

Regards,
--
Alexandre Janniaux
Videolabs