[vlc-devel] [PATCH 4/5] stream_ReadLine: support arbitrary length limit

Pierre Ynard linkfanel at yahoo.fr
Mon Sep 7 13:45:56 CEST 2020


> No way. The caller is hardly in better position to know a sane limit
> than the callee. This just ends up replacing an arbitrary limit with
> dozens of different inconsistent arbitrary limits.

I'm going to assume that you're referring again to what you mentioned
last time:

> Unfortunately, the practical limit depends on a lot of factors, most
> notably the final "consumer" rather than the caller, e.g. the GUI
> showing a property of an input item extracted from a playlist file.

But that doesn't work. This is backwards and broken to put that
responsibility on an upstream middle piece of the data supply that
doesn't know the constraints of the consumer, and doesn't even
understand what's flowing through it. This sounds a lot like an
anti-pattern. The consumer should take responsibility for validating or
formatting its input before using or presenting it.

Then you don't actually name what could go wrong, so this sounds a bit
like FUD. In fact this idea of practical limit is already broken, or
is a non-issue (probably both). The current arbitrary limit isn't any
consistent one on the consumed data or metadata, it's only a vague
blanket upper bound that completely fails at consistently limiting
fields as diverse as a genre or year tag (a few characters), a URL (some
say up to 2 kB), or a description (potentially much longer).

Also with the current blanket 200 kB limit, it can already pass
200-kB-long title strings or URLs to the GUI, video output or playlist,
so it is already either broken or a non-issue; the security implications
make me assume that this part is already properly handled by consumers,
and thus a non-issue.

Then stream_ReadLine() can't possibly be a proper checkpoint to control
this as there are plenty other channels from where consumers get this
data: binary file formats, with specifications controlling these field
lengths; users of the XML parser module, itself calling upon the
external, general-purpose libxml2 to parse fields; lua playlist scripts,
some of which (out of tree) even start by slurping the whole file before
beginning to parse it, and have no limit on what metadata they get and
return from that; and also out-of-tree modules calling core APIs in an
uncontrolled way.

Your arguments are even self-defeating here: you say the hardcoded
limit in stream_ReadLine() is crucial for consistently keeping things
practical, yet you're pushing for lua scripts NOT using it and doing
whatever the hell they want instead!

So this really makes no sense to me at all. This point of view you take
makes no sense to me, but even considering things from there, your
reasoning is invalid, and the caller is in a much better position to
wield limits, as contrary to the callee, it actually understands what
data it's dealing with, and performs the field extraction, so if it was
its responsibility it would be able to enforce limits on the actual
fields, and in an actually adapted, proportionate and less arbitrary
manner.

It makes even less sense to me when instead of looking towards the
consumer, you look towards the source of data. There it is very obvious
that the caller, knowing what kind of input data and format to expect,
is in a much better position to know not just sane, but appropriate and
practical limits.

You argued yourself that Kate limits lines at 4096 characters; I don't
know if you mean that it's an implementation specific or part of the
specs, but indeed as I agreed above, there can be such specification
or formats constraints, in which case the caller implementing that
particular format or use case knows *exactly* what the limit has to be
- and then this known specified limit isn't arbitrary at all for the
caller.

The core stream API is general-purpose and many data formats flow
through it: the various kinds of data of course aren't consistent,
it isn't supposed to be consistent, and so the current "consistent"
one-size-fits-all is inopportune. On the contrary, the point of the
patch is to make it configurable, which is indeed the very opposite of
consistent, in order to adapt it to the knowledge of the format, and
thus make it less arbitrary.

-- 
Pierre Ynard
"Une âme dans un corps, c'est comme un dessin sur une feuille de papier."


More information about the vlc-devel mailing list