[vlc-devel] [PATCH 4/5] stream_ReadLine: support arbitrary length limit

Pierre Ynard linkfanel at yahoo.fr
Tue Sep 8 15:56:43 CEST 2020


> Thus, in most cases, you don't actually need the lines, and in some of
> the cases (like anevia_streams), reading line by line could break the
> parsing because an item could be written like
>
>    <input
>       name="..">
>
> or
>
>    <INPUT NAME="...">
>
> or
>
>    < input name="" >
>
> etc, while a more specialized parser would not fail on such issue.

Yes indeed, that is true. However, as I said previously, in my
experience it has never been a practical issue. Even when it does
happen, it's always possible to manually call readline() again after
anchoring on the beginning of the tag; some scripts do that.
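
To illustrate that workaround, here is a minimal sketch, written in
Python rather than Lua so that it can run standalone. The read_tag()
helper and the sample page are hypothetical, and the stream's own
readline() stands in for the one available to scripts. It covers the
multi-line and upper-case variants quoted above, but not the
"< input" spacing variant.

```python
import io

def read_tag(stream, anchor="<input"):
    """Return the full text of the first tag starting with `anchor`, or None."""
    while True:
        line = stream.readline()
        if not line:  # end of stream, anchor never seen
            return None
        start = line.lower().find(anchor)  # case-insensitive anchoring
        if start == -1:
            continue
        tag = line[start:]
        # The tag may span several lines: keep reading until it is closed.
        while ">" not in tag:
            more = stream.readline()
            if not more:  # end of stream before the tag was closed
                return None
            tag += more
        return tag[: tag.index(">") + 1]

# Hypothetical page where the tag is split across two lines.
page = io.StringIO('<p>hi</p>\n   <input\n      name="..">\n<p>bye</p>\n')
tag = read_tag(page)
flat = " ".join(tag.split())  # collapse the line break for later matching
```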

> Thus, we should probably design better parsing primitives for the
> lua modules like Thomas suggested. To do that, we need to define the
> different needs of the parsers, including for example:
>
>  - XML element-based parsing
>  - JSON properties in a document
>  - regex in which you can stream content and trigger when it matches

The regex idea seems like it could have potential. That's roughly what's
used to parse the JavaScript for YouTube URL signature descrambling.
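
As a rough idea of what such a primitive could look like, here is a
sketch in Python (not Lua, and not an existing VLC API). StreamMatcher,
its `keep` parameter and the `sig=` pattern are all made up for the
example. It assumes the pattern ends on an explicit delimiter;
otherwise a match could fire early on a chunk boundary.

```python
import re

class StreamMatcher:
    """Feed text chunks; call `on_match` for each regex match found so far."""

    def __init__(self, pattern, on_match, keep=256):
        self.regex = re.compile(pattern)
        self.on_match = on_match
        self.keep = keep  # characters retained between feeds, must be > 0
        self.buffer = ""

    def feed(self, chunk):
        self.buffer += chunk
        pos = 0
        for m in self.regex.finditer(self.buffer):
            self.on_match(m)
            pos = m.end()
        # Drop what was consumed and bound the unmatched tail, so that a
        # match split across two chunks can still be completed later.
        self.buffer = self.buffer[pos:][-self.keep:]

found = []
matcher = StreamMatcher(r"sig=(\w+);", lambda m: found.append(m.group(1)))
for chunk in ["var a=1; si", "g=abc", "123; more"]:
    matcher.feed(chunk)
```

The `keep` bound is the design trade-off: it caps memory on long pages,
but a match longer than `keep` characters would be missed.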

> Maybe we can refocus a discussion thread on that if you agree with me
> (us?) to get back to a productive and constructive result?
>
> I can provide some work for the implementation and some time for the
> discussion if you'd like.

We'd need to be able to plug the lua script's main stream object into
the XML parser. Maybe that stream object itself can be exposed to lua
scripts, or it can be offered internally by the lua bindings. Also,
this could be an opportunity to factor out some of the stream read code
shared between modules/lua/stream_filter.c and modules/lua/libs/stream.c.

We might also want a JSON parser into which you can plug a VLC stream
object. A query string / application/x-www-form-urlencoded data parser
could be offered too, but I'm not sure that format is common enough,
apart from the YouTube /get_video_info API.
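
For reference, this is the kind of result a query string parser would
give, shown with Python's standard urllib.parse.parse_qs() as a
stand-in. The sample body is invented and is not an actual
/get_video_info response.

```python
from urllib.parse import parse_qs

# Hypothetical application/x-www-form-urlencoded body.
body = "status=ok&title=Some+video&url=https%3A%2F%2Fexample.com%2Fv%3Fid%3D1"

# parse_qs() decodes '+' and %XX escapes; each key maps to a list of
# values, since a key may legitimately appear more than once.
fields = parse_qs(body)
url = fields["url"][0]
```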

JSON is a proper data exchange format, and technically XML is too,
but the HTML pages that lua scripts deal with aren't: they are filled
with heaps of variable, useless stuff. If we're serious about scripts
navigating documents as XML, we'd need to offer something like XPath.
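
To give an idea of what XPath-style navigation buys, here is a small
sketch using Python's xml.etree.ElementTree, which supports a limited
XPath subset. The document is a made-up, well-formed example; real HTML
pages would first need a tolerant parser, and XML matching is
case-sensitive, so <INPUT> would not be found by this query.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed document with <input> elements at various
# depths, one of them split across lines and one without a name.
doc = ET.fromstring(
    '<streams>'
    '<input\n   name="a"/>'
    '<group><input name="b" /></group>'
    '<input/>'
    '</streams>'
)

# Select every <input> that carries a name attribute, at any depth.
names = [el.get("name") for el in doc.findall(".//input[@name]")]
```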

-- 
Pierre Ynard
"Une âme dans un corps, c'est comme un dessin sur une feuille de papier."

