[PATCH] D34584: Introduce new JSON import format

Tue Jun 27 14:38:51 PDT 2017

BTW, we I can update lnt.llvm.org <http://lnt.llvm.org/> really fast, when needed.

> On Jun 27, 2017, at 2:36 PM, Matthias Braun via Phabricator <reviews at reviews.llvm.org> wrote:
> 
> MatzeB added a comment.
> 
> In https://reviews.llvm.org/D34584#791693, @kristof.beyls wrote:
> 
>> Thank you so much for this, Matthias!
>> 
>> Some questions/remarks inline:
>> 
>>> This redesigns to get a new simpler import format for LNT. With plans to
>>> use the same format for exporting.
>>> 
>>> Go for lowercase keys without spaces.
>> 
>> Makes sense to me.
>> 
>>> Drop Run/Info/tag: json itself is a schemaless format so it feels more natural without one. The LNT testsuite type is now determined by the submit URL used, or the '-s' commandline flag on the commandline. Both default to 'nts' if omitted.
>> 
>> I don't fully understand how LNT reacts to this, but on the surface makes sense to me. If someone ever had time to do a major redesign of the database schema, there is a chance we might also be able to drop the need for differences in the database schema per test suite, like "nts"? Anyway, that is definitely far out of scope for this patch.
> 
> 
> The patch works fine. People using non-nts testsuites (which I have a strong suspicion only exist within apple), may need to supply a `-s MySchema` flag for the commandline tools or use `db_default/v4/MySchema/submitRun` instead of `db_default/submitRun` to express they don't use the 'nts' default.
> 
> This is probably not the right patch to discuss LNT database design. But some points anyway:
> 
> - IMO making SQL tables too generic (like having one row with MetricType, Test, Run and Value; so we could create new MetricTypes on demand) is a sure way to kill performance. As long as we use SQL we should have 1 table column for each metric.
> - A schema is nice when you need to decide bits like the unit of a metric, whether bigger is better or getting a human readable metric name for the website. However we could make good arguments that we only need that on the presentation/analysis side and not as part of the data storage model.
> 
>> 
>> 
>>> Drop the renaming where we would use different names for import ("info_key") than inside the lnt database. Just expect the lnt database names.
>> 
>> Again, I don't understand fully why info_key used to be needed, but this seem a good change to me.
> 
> Same here.
> 
>> 
>> 
>>> Simplify the format by flattening Machine/Info into Machine and Run/Info into Run.
>> 
>> Looks sensible.
>> 
>>> Previously:
>>> 
>>> "Machine": {
>>> 
>>>  "name": "mymachine",
>>>  "Info": {
>>>    "hardware": "HAL 9000"
>>>  }
>>> 
>>> },
>>> "Run": {
>>> 
>>>  "Start Time": "2017-05-16 21:31:42",
>>>  "End Time": "2017-05-16 21:31:42",
>>>  "Info": {
>>>    "__report_version__": "1",
>> 
>> I don't know what __report_version__ is used for today, but would keep on having some kind of "report_version" key still help if in the future we'd want to change this format again? Having a version string in there could make it easier to support multiple versions?
> 
> I talked a bit with Chris about this. And I think we want to persue a somwhat weaker version scheme were tools add something like "generated": "LNT version 0.4.1" or "generated": "llvm test-suite runner 0.1", similar in spirit to the http user agent header.
> 
>> 
>> 
>>>    "tag": "nts",
>>>    "run_order": "123456"
>>>  }
>>> 
>>> }
>>> Now:
>>> 
>>> "machine": {
>>> 
>>>  "name": "mymachine",
>>>  "hardware": "HAL 9000"
>>> 
>>> },
>>> "run": {
>>> 
>>>  "start_time": "2017-05-16 21:31:42",
>>>  "end_time": "2017-05-16 21:31:42",
>>>  "llvm_project_revision": "123456"
>> 
>> Since I see LNT being used also for tracking performance for non-LLVM projects, I think it'd be best to drop "llvm_" from "llvm_project_revision".
> 
> - The name for the field is determined by the schema in LNT, so not necessarily part of this patch.
> - "project_revision" seems to generic to me (I wouldn't know whether it's the compiler or the benchmark suites revision). I could probably live with "compiler_revision", but we should have a separate patch for that.
> 
>> 
>> 
>>> }
>>> Reorganize test data, so we have 1 record for each tests containing all the metrics.
>> 
>> Thanks, that will make the json easier to read by a human, when that is needed.
>> 
>>> Drop the "Info" record for tests (the v4db would refuse to take non-empty Info records anyway).
>>> Again no renaming of test metrics from "info_key" anymore.
>>> Previously:
>>> 
>>> "Tests": [
>>> 
>>>  { "Data": [ 7, 42 ],
>>>    "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.compile",
>>>    "Info": {}
>>>  },
>>>  { "Data": [ 13 ],
>>>    "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.exec",
>>>    "Info": {}
>>>  },
>>>  { "Data": [ 11 ],
>>>    "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.exec",
>>>    "Info": {}
>>>  },
>>>  { "Data": [ "49333a87d501b0aea2191830b66b5eec" ],
>>>    "Name": "nts.SingleSource/Benchmarks/Adobe-C++/functionobjects.hash",
>>>    "Info": {}
>>>  }
>>> 
>>> ]
>>> Now:
>>> 
>>> "test": [
>>> 
>>>  {
>>>      "name": "SingleSource/Benchmarks/Adobe-C++/functionobjects",
>>>      "compile_time": [ 7, 42 ],
>>>      "execution_time": [ 13, 11 ],
>>>      "hash": "49333a87d501b0aea2191830b66b5eec"
>>>  }
>>> 
>>> ]
>>> Implements upgrade logic for submissions in the old format
>> 
>> I guess we can encounter both clients submitting the new format to older servers and older clients submitting the old format to newer servers.
>> Out of these combinations, I guess we can only support old format submitted to newer servers, not the other way around, right?
> 
> Yes.
> 
>> If so, it'll be important to make sure the lnt.llvm.org is operating well and all public bots currently submitting data to llvm.org/perf have switched to submitting lnt.llvm.org before this lands as otherwise the bots will start submitting this new format and the very out-of-date LNT server at llvm.org/perf will not be able to cope?
> 
> - We are in luck here: The reporting part haven't changed with this patch so `lnt runtest` still produces the old format. (I planed to change that separately)
> - I hope we have ways to upgrade the llvm.org LNT instance...
> 
>> Although that being said, it seems the llvm.org/perf service is down since last weekend's maintenance works on llvm.org, and maybe we just have to switch submitting to lnt.llvm.org anyway.
>> @cmatthews: Do you think the version of the server at lnt.llvm.org could be upgraded pretty much immediately after this lands to make sure we don't miss performance reports from bots that automatically upgrade to ToT LNT?
> 
> I wasn't even aware there are two public instances. Is this like the greendragon/buildbot differences?
> We are in no hurry to land this; we can probably sync this with upcoming Rest API changes.
> 
>> 
>> 
>>> I will submit a separate patch to remove the "info_key" columns from the LNT database.
>>> I will submit a separate patch to have the runtest tests submit in the new format.
>> 
>> Ah right, here we can leave some time between the server accepting the new format and the clients starting to submit the new format, to allow servers to be upgraded.
> 
> yep.
> 
>> 
>> 
>>> This has been discussed with Chris Matthews to make sure the new REST export APIs will spit out the same format, so users only need to learn one format and it becomes easier to move data around servers.
> 
> 
> Repository:
>  rL LLVM
> 
> https://reviews.llvm.org/D34584
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170627/a98874c5/attachment.html>