[vlc-devel] [PATCH] Lua-SD: Adding a customized way to create Jamendo playlists for everyone to try [RESEND][USETHIS]

Tue Aug 9 17:36:04 CEST 2016

Hi `Wilawar`,

In this email I will raise my thoughts regarding your implementation.

I have already explained several of them during our discussion(s) on
`#videolan` at `freenode`, but for future reference I'd like to
include them on the mailing-list.

---------------------------------------------------------------------

Excessive comments
---------------------------------------------------------------------

> +--[[ === Comments on this Script ===
> +I realize the people at Jamendo probably want to see the playlists of featured tracks here the current
> +(2016-08-09, version 2.2.4) default script provides. I’m not averse to integrating them, they just didn’t
> +matter much to me. I will probably add them in a future version.
> +The core parsing is done by code I made a parser generator for. I did already publish its source on pastebin.com,
> +but people didn’t really like it (I’m not too much surprised by that, I wouldn’t call it beautiful myself). Part of that
> +is because of Lua’s inherent limitations, part of it because I just didn’t clean it up or comment it for others to see.
> +[Just note, there is a minor problem with code gen at the moment, the declaration of 'J' has to be moved out of
> +the do-block. Else, it should work (but it should probably be changed to use string.find instead of loops, as this
> +seems to be faster without LuaJIT despite the obvious overhead that is involved with it).]
> +The parser gen. should really only have to be run once (and I already did that), I don’t expect Jamendo to change
> +their response format anytime soon. It also relies on the specific ordering of fields and such (by design!) and would
> +break if there were any strings with quotation marks in them (it’s just stripping the escape markers).
> +
> +For the future, I’d propose to use the XML backend I also wrote, which would have to be adapted a bit to the
> +new code. It previously couldn’t work the way it should have because it was impossible to query for self-closing tags
> +(the function had just been omitted). It’s been added by refp and should work in nightlies already, just the code
> +doesn’t make use of it at the moment. However, it won’t ever work in any versions of VLC up to and including 2.2.4
> +for this reason, therefore, I propose to keep this JSON edition as long as it’ll work for old versions.
> +(I do like JSON way better than XML, but when I looked at the tools that were already there, VLC had libxml2 for XML
> +parsing, which is a C library and should be fast, but only DKJSON for JSON parsing (which is written in Lua and ought
> +to be slow for this reason). This and that the existing parser was already set for XML made me choose this format.*)
> +
> +Further improvements which could be made include those:
> +1. On-demand loading of items (should be easy to do, according to refp), which would improve usability remarkably
> +    and still probably saving Jamendo some bandwidth and server load
> +2. Displaying tracks as soon as they are avialable (Done on purpose at the end for now; it can easily be changed by
> +    shuffling around the lines in 'main'.)
> +3. Updating the playlist entries automatically or at least on users’ request (This could get hard, but right now VLC has
> +    to be closed and started again to get new lists, which is not exactly the user experience one would wish for.)
> +4. Adding ways to customize the kinds of entries to be fetched, maybe even the queries and reimplementing the
> +    merge function for multiple lists
> +5. After the first one has been implemented: Prefetching those entries users frequently use themselves
> +    (different people would probably use different entries)
> +6. Adding more tracklist choices (also only really viable after on-demand loading is there) (more orderings, more sections)
> +
> +*It’s still verbose and ugly.
> +=== End of the Comments on this Script ===]]

Comments are always nice, though if they are too verbose they
certainly do more harm than good. I understand that this is a
work-in-progress, but if this is to be merged I'd vote that the
comments should be kept to a necessary minimum.

If you'd like to explain the rationale(s) behind your implementation,
do so on the mailing list and not directly within the implementation
itself.

The above applies to several sections within your implementation.

---------------------------------------------------------------------

VLC is not a documentation aggregator
---------------------------------------------------------------------

> +--[[ === Example Query ===
> +-- Comments --
> +The line breaks are not present in the orignal answer (use 'format=jsonpretty' for that), they were added to enhance readability here.
> +The API documentation can be had at [https://developer.jamendo.com/v3.0], the documentation for the 'tracks' 'method' specifically at
> +[https://developer.jamendo.com/v3.0/tracks]. For trying stuff, use a limit like 10 or 5 or so, not 100. The client_id here is the one that
> +was given out for testing and still worked 2016-08-09. You might have to get a new one (look at the documentation for the client_id
> +request parameter, that’s where I usually found it). When I ran this query, for 100 tracks (still 2016-08-09), Firefox gave the page size
> +in the JSON view as 12,258 bytes; I’ll have to assume that this is the compressed size, as some character counting tools gave me
> +78,318 bytes for the size (and number of characters). Note that slashes are escaped for whichever reason and this has to be reversed.
> +-- Query --
> +https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&vocalinstrumental=instrumental&order=buzzrate_desc&limit=100&offset=300&type=single+albumtrack
> +-- Response --
> +{"headers":{"status":"success","code":0,"error_message":"","warnings":"","results_count":100},"results":[{"id":"716982","name":"Sunday Is Over",
> +"duration":133,"artist_id":"366099", "artist_name":"Victoria Caff\u00e8","artist_idstr":"Victoria_Caffe","album_name":"Busy Days","album_id":"83642",
> +"license_ccurl":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/2.0\/fr\/", "position":5,"releasedate":"2011-01-18","album_image":
> +"https:\/\/imgjam1.jamendo.com\/albums\/s83\/83642\/covers\/1.200.jpg", "audio":"https:\/\/mp3l.jamendo.com\/?trackid=716982&format=mp31&from=app-56d30c95",
> +"audiodownload":"https:\/\/mp3d.jamendo.com\/download\/track\/716982\/mp32\/", "prourl":"https:\/\/licensing.jamendo.com\/track\/716982","shorturl":
> +"http:\/\/jamen.do\/t\/716982","shareurl":"http:\/\/www.jamendo.com\/track\/716982", "image":"https:\/\/imgjam1.jamendo.com\/albums\/s83\/83642\/covers\/1.200.jpg"},
> +{"id":"1098631","name":"Shine (Orchestral Pop Rock Version)","duration":129,"artist_id":"361044", "artist_name":"Oursvince","artist_idstr":"Oursvince","album_name":"Shine",
> +[…]}, {[…]}, {[…]}, {[…]}, […],
> +{"id":"1075892","name":"Departure (Electro)","duration":295,"artist_id":"340768","artist_name":"Capashen","artist_idstr":"Capashen","album_name":"Departure",
> +"album_id":"128151", "license_ccurl":"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/3.0\/","position":1,"releasedate":"2013-10-30","album_image":
> +"https:\/\/imgjam1.jamendo.com\/albums\/s128\/128151\/covers\/1.200.jpg","audio":"https:\/\/mp3l.jamendo.com\/?trackid=1075892&format=mp31&from=app-56d30c95",
> +"audiodownload":"https:\/\/mp3d.jamendo.com\/download\/track\/1075892\/mp32\/","prourl":"https:\/\/licensing.jamendo.com\/track\/1075892","shorturl":
> +"http:\/\/jamen.do\/t\/1075892","shareurl":"http:\/\/www.jamendo.com\/track\/1075892","image":"https:\/\/imgjam1.jamendo.com\/albums\/s128\/128151\/covers\/1.200.jpg"}]}
> +--=== Example End === ]]
> +

If you want to make sure that people understand the syntax of the
incoming data, including a link to the API reference is far more
suitable than including a full example-response.

It is not the responsibility of code within VLC to document an
external API, especially when the documentation associated with the
Jamendo API is good enough to answer any questions regarding it.

---------------------------------------------------------------------

Auto-generated implementations are not too friendly
---------------------------------------------------------------------

> +-- The following line of code has been auto-generated
> +    local StatusString,StatusCode,ErrorMsg,Warnings,ResCnt;local J=0;local TrackIds,TrackNames,Durations,ArtistIds,ArtistNames,ArtistIdstrs,AlbumNames,AlbumIds,LicenseCcUrls,NrsOnAlbums,Releasedates_IsoStr,AlbumImageUrls,StreamingUrls,DownloadUrls,ProUrls,ShortUrls,ShareUrls,ImageUrls={},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{};do local P=1;local A;local B=string.byte;local F=string.sub;P=22+P;A=P;while B(S,P)~=34 do P=1+P end;StatusString=F(S,A,P-1);P=9+P;A=P;while B(S,P)~=44 do P=1+P end;StatusCode=0;for C=A,P-1 do StatusCode=StatusCode*10+B(S,C)-48;end;P=18+P;A=P;while B(S,P)~=34 do P=1+P end;ErrorMsg=F(S,A,P-1);P=14+P;A=P;while B(S,P)~=34 do P=1+P end;Warnings=F(S,A,P-1);P=18+P;A=P;while B(S,P)~=125 do P=1+P end;ResCnt=0;for C=A,P-1 do ResCnt=ResCnt*10+B(S,C)-48;end;P=13+P;if B(S,P)==123 then P=1+P;repeat J=1+J;P=6+P;A=P;while B(S,P)~=34 do P=1+P end;TrackIds[J]=0;for C=A,P-1 do TrackIds[J]=TrackIds[J]*10+B(S,C)-48;end;P=10+P;A=P;while B(S,P)~=34 do P=1+P end;TrackNa
 mes[J]=F(S,A,P-1);P=13+P;A=P;while B(S,P)~=44 do P=1+P end;Durations[J]=0;for C=A,P-1 do Durations[J]=Durations[J]*10+B(S,C)-48;end;P=14+P;A=P;while B(S,P)~=34 do P=1+P end;ArtistIds[J]=0;for C=A,P-1 do ArtistIds[J]=ArtistIds[J]*10+B(S,C)-48;end;P=17+P;A=P;while B(S,P)~=34 do P=1+P end;ArtistNames[J]=F(S,A,P-1);P=18+P;A=P;while B(S,P)~=34 do P=1+P end;ArtistIdstrs[J]=F(S,A,P-1);P=16+P;A=P;while B(S,P)~=34 do P=1+P end;AlbumNames[J]=F(S,A,P-1);P=14+P;A=P;while B(S,P)~=34 do P=1+P end;AlbumIds[J]=0;for C=A,P-1 do AlbumIds[J]=AlbumIds[J]*10+B(S,C)-48;end;P=19+P;A=P;while B(S,P)~=34 do P=1+P end;LicenseCcUrls[J]=F(S,A,P-1);P=13+P;A=P;while B(S,P)~=44 do P=1+P end;NrsOnAlbums[J]=0;for C=A,P-1 do NrsOnAlbums[J]=NrsOnAlbums[J]*10+B(S,C)-48;end;P=16+P;A=P;while B(S,P)~=34 do P=1+P end;Releasedates_IsoStr[J]=F(S,A,P-1);P=17+P;A=P;while B(S,P)~=34 do P=1+P end;AlbumImageUrls[J]=F(S,A,P-1);P=11+P;A=P;while B(S,P)~=34 do P=1+P end;StreamingUrls[J]=F(S,A,P-1);P=19+P;A=P;while B(S,P)~=34 do P=1+P
  end;DownloadUrls[J]=F(S,A,P-1);P=12+P;A=P;while B(S,P)~=34 do P=1+P end;ProUrls[J]=F(S,A,P-1);P=14+P;A=P;while B(S,P)~=34 do P=1+P end;ShortUrls[J]=F(S,A,P-1);P=14+P;A=P;while B(S,P)~=34 do P=1+P end;ShareUrls[J]=F(S,A,P-1);P=11+P;A=P;while B(S,P)~=34 do P=1+P end;ImageUrls[J]=F(S,A,P-1);P=2+P;if B(S,P)==44 and B(S,P+1)==123 then P=2+P;else break;end;until false;end;P=1+P;end;

The above line is **2373** characters long, even for auto-generated
content it is not too pleasant to look at, nor interact with.

As stated in `#videolan` at `freenode`, I strongly feel like your
approach to parsing the data is too complex to be a viable solution to
the problem you have described.

If you are trying to write an implementation that is as fast as
possible, and this being your rationale; include a benchmark showing
why your implementation is more suitable than a far less complex
alternative using existing tools that can be used to parse *json*,
*xml*, etc.

    - What has been used to generate the code responsible for parsing?
    - If a bug is discovered, how do you recommend fixing it without
      access to the code-generation?

---------------------------------------------------------------------

Do not be afraid to use whitespace
---------------------------------------------------------------------

> +-- The previous line of code has been auto-generated
> +    --@ AfterParsing = os.clock()
> +    --@ log("Reading took " .. tostring(AfterReading_BeforeParsing - BeforeReading) .. "ms, while parsing took " ..
> +    --@	tostring(AfterParsing - AfterReading_BeforeParsing) .. "ms to complete.")
> +    log("Status code: " .. StatusCode) --These four log items are optional debugging aids
> +    log("Warnings: " .. Warnings)
> +    log("Error messages: " .. ErrorMsg)
> +    log("Results count: " .. ResCnt)
> +    local TrackRes = {}
> +    for TrackResNbr = 1, J do --J will hold the number of tracks that were encountered and hence the number of entries every table will have
> +        TrackRes[TrackResNbr] = {["title"] = TrackNames[TrackResNbr], ["artist"] = ArtistNames[TrackResNbr], ["copyright"] = LicenseCcUrls[TrackResNbr],
> +            ["album"] = AlbumNames[TrackResNbr], ["path"] = StreamingUrls[TrackResNbr], ["date"] = Releasedates_IsoStr[TrackResNbr],
> +            ["arturl"] = ImageUrls[TrackResNbr], ["url"] = DownloadUrls[TrackResNbr], ["nowplaying"] = ShareUrls[TrackResNbr], ["duration"] = Durations[TrackResNbr],
> +            ["tracknum"] = NrsOnAlbums[TrackResNbr], ["trackid"] = TrackIds[TrackResNbr]
> +            --[=[, ["description"] = "Placement test: description", ["setting"] = "Placement test: setting",
> +            ["rating"] = "Placement test: rating", ["name"] = "Placement test: name"--]=]}
> +            --The parameter values for 'vocalinstrumental' and 'order' could be put into the field Genre

The above is not auto-generated, but it is nonetheless not very
readable; introducing some white-space certainly would not hurt the
implementation.

----------------------------------------------------------------------

Refactoring to make things more readable
----------------------------------------------------------------------

> +function main()
> +    local OuterBundle = vlc.sd
> +    local TopTracks_Buzzrate, TopTracks_PlyWeek, TopTracks_PlyMonth, TopTracks_TkId, TopInstTks_Buzzrate, TopInstTks_PlyWeek, TopInstTks_PlyMonth, TopInstTks_TkId
> +    local LowTracks_Buzzrate, LowTracks_PlyWeek, LowTracks_PlyMonth, LowTracks_TkId, LowInstTks_Buzzrate, LowInstTks_PlyWeek, LowInstTks_PlyMonth, LowInstTks_TkId
> +    local TopTracks_Buzzrate_Ct, TopTracks_PlyWeek_Ct, TopTracks_PlyMonth_Ct, TopTracks_TkId_Ct, TopInstTks_Buzzrate_Ct, TopInstTks_PlyWeek_Ct, TopInstTks_PlyMonth_Ct, TopInstTks_TkId_Ct
> +    local LowTracks_Buzzrate_Ct, LowTracks_PlyWeek_Ct, LowTracks_PlyMonth_Ct, LowTracks_TkId_Ct, LowInstTks_Buzzrate_Ct, LowInstTks_PlyWeek_Ct, LowInstTks_PlyMonth_Ct, LowInstTks_TkId_Ct
> +    TopTracks_PlyWeek, TopTracks_PlyWeek_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=popularity_week&limit=100"))
> +    TopTracks_PlyMonth, TopTracks_PlyMonth_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=popularity_month&limit=100"))
> +    TopTracks_TkId, TopTracks_TkId_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=id&limit=100"))
> +    TopTracks_Buzzrate, TopTracks_Buzzrate_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=buzzrate_desc&limit=100"))
> +    TopInstTks_PlyWeek, TopInstTks_PlyWeek_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=popularity_week&limit=100"))
> +    TopInstTks_PlyMonth, TopInstTks_PlyMonth_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=popularity_month&limit=100"))
> +    TopInstTks_TkId, TopInstTks_TkId_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=id&limit=100"))
> +    TopInstTks_Buzzrate, TopInstTks_Buzzrate_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=buzzrate_desc&limit=100"))
> +    LowTracks_PlyWeek, LowTracks_PlyWeek_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=popularity_week&limit=100&offset=300"))
> +    LowTracks_PlyMonth, LowTracks_PlyMonth_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=popularity_month&limit=100&offset=300"))
> +    LowTracks_TkId, LowTracks_TkId_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=id&limit=100&offset=300"))
> +    LowTracks_Buzzrate, LowTracks_Buzzrate_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&order=buzzrate_desc&limit=100&offset=300"))
> +    LowInstTks_PlyWeek, LowInstTks_PlyWeek_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=popularity_week&limit=100&offset=300"))
> +    LowInstTks_PlyMonth, LowInstTks_PlyMonth_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=popularity_month&limit=100&offset=300"))
> +    LowInstTks_TkId, LowInstTks_TkId_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=id&limit=100&offset=300"))
> +    LowInstTks_Buzzrate, LowInstTks_Buzzrate_Ct = ExtractingTheTrackInfo(vlc.stream(
> +        "https://api.jamendo.com/v3.0/tracks/?client_id=56d30c95&format=json&type=single+albumtrack&vocalinstrumental=instrumental&order=buzzrate_desc&limit=100&offset=300"))

16 invocations of `ExtractingTheTrackInfo` that have *a lot* in
common; why not introduce a helper-function so that each call would
only include what is specific to the relevant request?

----------------------------------------------------------------------

Use the tools available in the language
----------------------------------------------------------------------

> +    --Adding the content to the playlist only at the end should make it painfully obvious how long it’s taking to fetch stuff right now.
> +    --This could be much improved by loading only nodes that were requested by the user, time will tell when this will be implemented.
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Highest popularity for one week"}), "JATW", TopTracks_PlyWeek, TopTracks_PlyWeek_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Less popular for one week by 300 positions"}), "JAWL", LowTracks_PlyWeek, LowTracks_PlyWeek_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Highest popularity for one month"}), "JAMT", TopTracks_PlyMonth, TopTracks_PlyMonth_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "300 positions less popular for one month"}), "JAML", LowTracks_PlyMonth, LowTracks_PlyMonth_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Highest Track-IDs"}), "JANT", TopTracks_TkId, TopTracks_TkId_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Track-IDs 300 down from the top"}), "JANL", LowTracks_TkId, LowTracks_TkId_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Tracks with the highest buzzrate"}), "JABT", TopTracks_Buzzrate, TopTracks_Buzzrate_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Tracks which are 300 positions down on the buzzrate scale"}), "JABL", LowTracks_Buzzrate, LowTracks_Buzzrate_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental and highest popularity this week"}), "JIWT", TopInstTks_PlyWeek, TopInstTks_PlyWeek_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental and less popular this week by 300 positions"}), "JIWL", LowInstTks_PlyWeek, LowInstTks_PlyWeek_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental and Highest popularity this month"}), "JIMT", TopInstTks_PlyMonth, TopInstTks_PlyMonth_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental and 300 positions less popular this month"}), "JIML", LowInstTks_PlyMonth, LowInstTks_PlyMonth_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Highest Track-IDs | Instrumental"}), "JINT", TopInstTks_TkId, TopInstTks_TkId_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Track-IDs 300 down from the top | Instrumental"}), "JINL", LowInstTks_TkId, LowInstTks_TkId_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental | InstTks with the highest buzzrate"}), "JIBT", TopInstTks_Buzzrate, TopInstTks_Buzzrate_Ct)
> +    AddingATracklistInThreeSlices(OuterBundle.add_node({["title"] = "Instrumental | InstTks which are 300 positions down on the buzzrate scale"}), "JIBL", LowInstTks_Buzzrate, LowInstTks_Buzzrate_Ct)
> +    --Another format proposal "I: Tracks with the highest buzzrate (Instrumental)"

An array with suitable content and a loop would make the above so much
less verbose.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.videolan.org/pipermail/vlc-devel/attachments/20160809/fb0a8a97/attachment.html>