[PATCH] D32272: [XRay] Add Google Benchmark library + initial XRay benchmarks

Matthias Braun via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 23 13:10:51 PDT 2017


> On Jun 23, 2017, at 12:38 PM, Hal Finkel via Phabricator via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> hfinkel added a comment.
> 
> In https://reviews.llvm.org/D32272#789237, @MatzeB wrote:
> 
>> In https://reviews.llvm.org/D32272#789224, @hfinkel wrote:
>> 
>>> In https://reviews.llvm.org/D32272#789201, @MatzeB wrote:
>>> 
>>>> In https://reviews.llvm.org/D32272#789021, @hfinkel wrote:
>>>> 
>>>>> In https://reviews.llvm.org/D32272#789012, @kristof.beyls wrote:
>>>>> 
>>>>>> In https://reviews.llvm.org/D32272#789006, @hfinkel wrote:
>>>>>> 
>>>>>>> In https://reviews.llvm.org/D32272#788932, @dberris wrote:
>>>>>>> 
>>>>>>>> LGTM for the most part, thanks for making the changes, it's much simpler now.
>>>>>>>> 
>>>>>>>> Thoughts @MatzeB? I think we can work on running/reporting the microbenchmarks data later, in changes to LNT.
>>>>>>> 
>>>>>>> 
>>>>>>> Can you please summarize the state of this and the eventual goal? It looks like:
>>>>>>> 
>>>>>>> 1. Eventually, we need a new LNT plugin (i.e. something like what we have in LNTBased/*) that understands how to interpret the output of the Google benchmark library.
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> I hope this could all be done within the test-suite, probably extending test-suite/litsupport. I hope/think that no changes to LNT will be needed, but I haven't investigated myself.
>>>>> 
>>>>> Just to be clear, LNTBased/* is in the test suite; LNT supports plugins and those can be in the test suite. That having been said, if we can do this just within litsupport, that potentially seems better. The key feature seems to be being able to report multiple benchmark timings per actual executable.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> - The lit data format currently does not allow multiple values for a metric.
>>>> - I'm not sure I see why that would be needed here.
>>> 
>>> 
>>> I don't mean multiple times for each benchmark, I mean that there are multiple benchmarks for which we should individually collect timings. In the benchmark added here, for example, we'll get individual timings out for:
>>> 
>>>  BENCHMARK(BM_ReturnInstrumentedu);
>>>  BENCHMARK(BM_ReturnInstrumentedPatchedThenUnpatched);
>>>  BENCHMARK(BM_ReturnInstrumentedPatched);
>>>  ...
>>> 
>>> and we should separately record them all.
>> 
>> 
>> Ah right.
>> 
>> - Today's lit works best if you have one file per test; also, test discovery is done before any test is run.
>> - If there is a flag to run only a specific benchmark in a Google Benchmark executable, then we could work around this limitation on the cmake side:
>> 
>>  Just generate multiple .test files with slightly different RUN: lines (assume that option is called --only-test):
>>
>>  ```
>>  retref-bench-unpatched.test  contains  RUN: retref-bench --only-test unpatched
>>  retref-bench-patched.test    contains  RUN: retref-bench --only-test patched
>>  ...
>>  ```
>>
>>  It would require us to repeat the sub-benchmark names in the cmake file (or write cmake logic to scan the source code?).
> 
> 
> I suppose we could do this, but realistically, we need to be able to collect multiple named timings per test. We need some way of telling lit, "don't use the timing from this executable, it provides timing numbers some other way", and then some mechanism for parsing the output to collect those numbers. It seems fairly straightforward to do this as an LNT plugin, because the benchmark library has its own well-defined output format, which is why I mentioned that; but we could build something into lit as well.

- Outputting custom metrics is easy in lit as well; it already spits out a lot of them. Example:

$ lit SingleSource/Benchmarks/Misc/pi.test
...
compile_time: 0.0221
exec_time: 0.4003
hash: "549fc0d04911386004eaafbf6a36ba0d"
link_time: 0.0329
size: 8512
size.__const: 40
size.__cstring: 97
size.__la_symbol_ptr: 16
size.__nl_symbol_ptr: 16
size.__stub_helper: 36
size.__stubs: 12
size.__text: 385
size.__unwind_info: 80
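
Collecting per-sub-benchmark numbers from a Google Benchmark executable could look roughly like the sketch below. It assumes the benchmark is run with Google Benchmark's --benchmark_format=json flag (a real flag; the JSON field names shown are what the library emits). The function name and the "<benchmark>.exec_time" metric naming are hypothetical, not the actual test-suite/litsupport API:

```python
import json

def collect_benchmark_metrics(json_text):
    """Parse Google Benchmark JSON output (--benchmark_format=json) and
    return a dict mapping '<benchmark name>.exec_time' to its cpu_time.
    A litsupport-style plugin could report these as lit metrics."""
    data = json.loads(json_text)
    metrics = {}
    for bench in data.get("benchmarks", []):
        # Each entry carries name, iterations, real_time, cpu_time, time_unit.
        metrics["%s.exec_time" % bench["name"]] = bench["cpu_time"]
    return metrics

# Abbreviated sample of the library's JSON output shape.
sample = """{
  "context": {"num_cpus": 4},
  "benchmarks": [
    {"name": "BM_ReturnInstrumentedPatched", "iterations": 1000000,
     "real_time": 2.1, "cpu_time": 2.0, "time_unit": "ns"},
    {"name": "BM_ReturnInstrumentedUnpatched", "iterations": 1000000,
     "real_time": 1.3, "cpu_time": 1.2, "time_unit": "ns"}
  ]
}"""

print(collect_benchmark_metrics(sample))
```

Whether these should surface as dotted metrics of one test or as independent tests is exactly the question discussed below.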

- While spitting out "xxx.time", "yyy.time", "zzz.time" for the sub-benchmarks would be easy to implement, I don't think it is a good idea. I'd rather see each sub-benchmark show up as an independent benchmark in the resulting data, to better match how we analyze the data later.
- LNT has somewhat fixed database schemas that need to be extended when introducing new metrics. Yes, LNT has support for that, but it is somewhat cryptically hidden in the 'lnt/server/db/migrations' directory and forces you to provide code to upgrade databases created before your schema change. In practice that is a higher bar than changing lit/LNTBased.
- It may be a good idea to extend lit with a notion of sub-benchmarks, so that the lit JSON output would indeed list retref-bench:patched, retref-bench:unpatched, ... as independent tests. That would be my preferred solution; doing it on the cmake side as outlined above would just be easier to pull off today.
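
The generate-a-.test-file-per-sub-benchmark approach could avoid hard-coding names by using Google Benchmark's --benchmark_list_tests flag (a real flag that prints one benchmark name per line) to discover them at build time, and its --benchmark_filter flag to run a single one. A rough sketch, where the generator function and the .test file layout are assumptions about how the test-suite would consume this:

```python
import os

def generate_test_files(bench_exe, listing, out_dir):
    """Generate one lit .test file per sub-benchmark.

    listing: the output of `bench_exe --benchmark_list_tests`
             (one benchmark name per line)."""
    paths = []
    for name in listing.splitlines():
        name = name.strip()
        if not name:
            continue
        path = os.path.join(out_dir, "%s-%s.test" % (
            os.path.basename(bench_exe), name))
        with open(path, "w") as f:
            # Anchor the regex so e.g. 'BM_Patched' doesn't also
            # match 'BM_UnPatched'.
            f.write("RUN: %s --benchmark_filter=^%s$\n" % (bench_exe, name))
        paths.append(path)
    return paths
```

This would make each sub-benchmark visible to lit as its own test, at the cost of running the benchmark binary once per sub-benchmark.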

- Matthias
