[vlc-devel] [RFC 0/3] Improving vlc_array_t
Rémi Denis-Courmont
remi at remlab.net
Sat Jul 21 07:18:25 CEST 2018
Le vendredi 20 juillet 2018, 14:29:07 EEST Romain Vimont a écrit :
> On Fri, Jul 20, 2018 at 01:44:25PM +0300, Rémi Denis-Courmont wrote:
> > Le jeudi 19 juillet 2018, 18:57:15 EEST Romain Vimont a écrit :
> > > There are 3 arrays API in vlc_arrays.h:
> > > 1. TAB_* macros
> > > 2. ARRAY_* macros
> > > 3. vlc_array_t + functions
> > >
> > > When we want a dynamic array of pointers, the vlc_array_t seems more
> > > appropriate.
> >
> > The benefit of the TAB_* macros is that they are type-agnostic.
> > vlc_array_t
> > does not accomodate all the TAB_* use cases, and even when it does, it
> > sacrifices type-safety.
> >
> > That is not to say that TAB_* is good. Not only it lacks error handling as
> > you noted, but in many cases, another structure, e.g. singly linked list,
> > doubly linked list, or binary tree would work better than an array.
> >
> > > Moreover, the macro versions abort() on allocation failure.
> >
> > Yes but vlc_array_t comes with its own issues, such as type safety. And
> > handling arrays manually is often easy enough that we have a lot of
> > hand-coded type-safe _and_ error-safe implementation (not only from me).
>
> In short, they all have benefits and drawbacks. We agree.
We agree that the helpers all have drawbacks.
In higher level languages with templates/generics, one can write nice vector
data structures (though that would probably come with the standard library).
But in C, the normal way to handle vectors is a typed pointer and an unsigned
counter (or an error-value terminator). And it turns out that VLC actually
does that a lot - just not with the hopeless helpers.
IMO, what we really should have are helpers for allocating and reallocating an
array (using mul_overflow) while exposing errors where unavoidable, and not
losing type safety.
> > > To make it more user-friendly, I suggest to rename:
> > > - vlc_array_item_at_index() by vlc_array_get() (patch 1)
> > > - vlc_array_index_of_item() by vlc_array_find() (patch 2)
> >
> > TBH, if you need to look-up an item by index, you are probably using the
> > wrong data structure. Binary tree is probably better suited (logarithmic
> > time). Or if the lookup is only for random deletion, then doubly linked
> > list is probably even better suited (constant time).
>
> It depends. For example, if you often use random access in O(1), and
> rarely need to find an item, it may be appropriate. There are often
> trade-offs depending on the specific case.
If you often look up by random index-based access, and rarely insert/delete,
then by all means use a table. My point is that many (probably most) of the
TAB_* and vlc_array_t use cases fall outside of that scenario.
But TBH, it is quite seldom that we do index-based random access. Usually,
it's either sequential access or look-up by some key.
> But at least the basic structures and common functions must be exposed,
> so that we can choose what is more appropriate.
And that's basically never vlc_array_t because it's not type safe.
> > > To make it more general (suitable for more cases), I suggest to change
> > > its growth strategy so that we can append items in amortized constant
> > > time (similar to what ARRAY_* macros use, or, FWIW, C++ std::vector,
> > > Java ArrayList, Rust std::vec, etc.). (patch 3)
> >
> > For potential large datasets, we already have trees supporting logarithmic
> > time for insertion, look-up, replacement and deletion, and linear time for
> > enumeration.
> >
> > All that with only bound object sizes.
> >
> > So I'm really unconvinced that we should push vlc_array_t.
>
> IIUC, you are saying that since there are trees, we don't need vectors?
No. I am saying that if you are concerned about element insertion and/or
deletion time, you really should not use an array to begin with, because
performance with an array will definitely suck.
Insertion and deletion in an array will always be linear (assuming that heap
allocation is linear or less). The only thing you can hope to optimize is
appending and tail deletion - and even then a LIFO is usually a better choice.
--
Rémi Denis-Courmont
More information about the vlc-devel
mailing list