[vlc-devel] [RFC 0/3] Improving vlc_array_t

Sun Jul 22 14:29:32 CEST 2018

Le samedi 21 juillet 2018, 23:01:25 EEST Romain Vimont a écrit :
> > > > But TBH, it is quite seldom that we do index-based random access.
> > > > Usually,
> > > > it's either sequential access or look-up by some key.
> 
> Random access is sometimes needed though. For example, for random
> playing, selecting the next item involves swapping two items by index
> (e.g. Fisher–Yates algorithm).

I would argue that fast insertion and fast deletion of item is more important 
than fast shuffled iteration. So that makes an array a terrible choice. And we 
need to support insertion/deletion by index, as well as by sorting criterion.

There is probably no existing data structure in VLC that can support a (large) 
playlist with good performance. And I expect that any adequate structure would 
be hard to generalize beyond the playlist.

In other words, improving the array helpers should have nothing to do with the 
playlist. You can do either or both, but not design/benchmark the array 
helpers by playlist shuffle.

> > > I start working on some playlist/shuffle stuff, and my first choice
> > > is to use a vector-like tool. Maybe in the end, for some reason, I will
> > > switch to another structure, but a vector-like API would still be useful
> > > IMO.
> > 
> > An array is _not_ a good choice for the playlist given the large number of
> > potential items, and the fact that it sees potentially a lot of random
> > insertions and deletions.
> 
> What would you suggest for the playlist?

I don't think we have a suitable data structure for the playlist.

It should be possible to optimize everything with an ad-hoc tree structure, 
but not with the existing search tree, and obviously not with an array neither 
a linked list.

> > > At the time when the playlist code was written, there were no other data
> > structure helpers, but that is not an excuse for reproducing the same
> > mistake again. Besides, the array is primarily for looking up items by
> > ID, something that should go away in favor of proper reference counting.
> 
> Could you explain what you mean (replacing lookup by ID by reference
> counting)?

I mean that using playlist item ID is a poor excuse for a (weak) reference to 
a playlist tree node. But given that the playlist tree layout has been put on 
death row, playlist_item_t will go away anyhow.

(...)
> No, they are not for _sorted_ arrays only (the items do not even have to
> be comparable, unless for cases where BSEARCH is used).

Yes, they are sorted arrays. The current insertion and removal functions/
macros for all three helper sets maintain the order using memmove(). So 
insertion and deletion is linear regardless of memory (pre)allocation.

There are certainly use cases not relying on that feature, but some of them 
do, so the semantics cannot be changed. Thus, you can only optimize append - 
if the C run-time does not already make the optimization within realloc() - 
tail-delete and an hypothetical new swap-delete.

My problem is that, so far, I am not aware of a use case for that optimization 
that would not be better optimized with a linked list or a tree instead of an 
array.

-- 
Rémi Denis-Courmont