[PATCH] D34584: Introduce new JSON import format
Kristof Beyls via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 27 00:02:58 PDT 2017
kristof.beyls added a subscriber: leandron.
kristof.beyls added a comment.
Thank you so much for this, Matthias!
Some questions/remarks inline:
> This redesigns to get a new simpler import format for LNT. With plans to
> use the same format for exporting.
>
> Go for lowercase keys without spaces.
Makes sense to me.
> Drop Run/Info/tag: json itself is a schemaless format so it feels more natural without one. The LNT testsuite type is now determined by the submit URL used, or the '-s' commandline flag on the commandline. Both default to 'nts' if omitted.
I don't fully understand how LNT reacts to this, but on the surface makes sense to me. If someone ever had time to do a major redesign of the database schema, there is a chance we might also be able to drop the need for differences in the database schema per test suite, like "nts"? Anyway, that is definitely far out of scope for this patch.
> Drop the renaming where we would use different names for import ("info_key") than inside the lnt database. Just expect the lnt database names.
Again, I don't understand fully why info_key used to be needed, but this seem a good change to me.
> Simplify the format by flattening Machine/Info into Machine and Run/Info into Run.
Looks sensible.
> Previously:
>
> "Machine": {
>
> "name": "mymachine",
> "Info": {
> "hardware": "HAL 9000"
> }
>
> },
> "Run": {
>
> "Start Time": "2017-05-16 21:31:42",
> "End Time": "2017-05-16 21:31:42",
> "Info": {
> "__report_version__": "1",
I don't know what __report_version__ is used for today, but would keep on having some kind of "report_version" key still help if in the future we'd want to change this format again? Having a version string in there could make it easier to support multiple versions?
> "tag": "nts",
> "run_order": "123456"
> }
>
> }
> Now:
>
> "machine": {
>
> "name": "mymachine",
> "hardware": "HAL 9000"
>
> },
> "run": {
>
> "start_time": "2017-05-16 21:31:42",
> "end_time": "2017-05-16 21:31:42",
> "llvm_project_revision": "123456"
Since I see LNT being used also for tracking performance for non-LLVM projects, I think it'd be best to drop "llvm_" from "llvm_project_revision".
> }
> Reorganize test data, so we have 1 record for each tests containing all the metrics.
Thanks, that will make the json easier to read by a human, when that is needed.
> Drop the "Info" record for tests (the v4db would refuse to take non-empty Info records anyway).
> Again no renaming of test metrics from "info_key" anymore.
> Previously:
>
> "Tests": [
>
> { "Data": [ 7, 42 ],
> "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.compile",
> "Info": {}
> },
> { "Data": [ 13 ],
> "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.exec",
> "Info": {}
> },
> { "Data": [ 11 ],
> "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.exec",
> "Info": {}
> },
> { "Data": [ "49333a87d501b0aea2191830b66b5eec" ],
> "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.hash",
> "Info": {}
> }
>
> ]
> Now:
>
> "test": [
>
> {
> "name": "SingleSource/Benchmarks/Adobe-C++/functionobjects",
> "compile_time": [ 7, 42 ],
> "execution_time": [ 13, 11 ],
> "hash": "49333a87d501b0aea2191830b66b5eec"
> }
>
> ]
> Implements upgrade logic for submissions in the old format
I guess we can encounter both clients submitting the new format to older servers and older clients submitting the old format to newer servers.
Out of these combinations, I guess we can only support old format submitted to newer servers, not the other way around, right?
If so, it'll be important to make sure the lnt.llvm.org is operating well and all public bots currently submitting data to llvm.org/perf have switched to submitting lnt.llvm.org before this lands as otherwise the bots will start submitting this new format and the very out-of-date LNT server at llvm.org/perf will not be able to cope?
Although that being said, it seems the llvm.org/perf service is down since last weekend's maintenance works on llvm.org, and maybe we just have to switch submitting to lnt.llvm.org anyway.
@cmatthews: Do you think the version of the server at lnt.llvm.org could be upgraded pretty much immediately after this lands to make sure we don't miss performance reports from bots that automatically upgrade to ToT LNT?
> I will submit a separate patch to remove the "info_key" columns from the LNT database.
> I will submit a separate patch to have the runtest tests submit in the new format.
Ah right, here we can leave some time between the server accepting the new format and the clients starting to submit the new format, to allow servers to be upgraded.
> This has been discussed with Chris Matthews to make sure the new REST export APIs will spit out the same format, so users only need to learn one format and it becomes easier to move data around servers.
================
Comment at: docs/importing_data.rst:46
+ // mandatory. For the 'nts' and 'Compile' schemas this is the
+ // 'llvm_project_revision' field.
+ },
----------------
As said in the long general comment, I'd suggest changing "llvm_project_revision" to "project_revision". Or maybe just "revision"?
================
Comment at: docs/importing_data.rst:109-116
- * ".exec": represents execution time - a lower number is better.
- * ".exec.status": a non zero value represents a program failure.
- * ".score": represent a benchmark score - a higher number is better.
- * ".code_size": represents the code size of a program.
- * ".hash": represents a hash of the binary program being executed. This is
- used to detect if between compiler versions, the generated code has
- changed.
----------------
I think the documentation of the different kinds of data that is interpreted by LNT should remain in the documentation in one form or another. It is essential information to transform test data into this format.
Repository:
rL LLVM
https://reviews.llvm.org/D34584
More information about the llvm-commits
mailing list