[PATCH] D37421: [XRay] [test-suite] Add LNT support to retref-bench benchmarks.

Hal Finkel via Phabricator via llvm-commits <llvm-commits at lists.llvm.org>
Tue Sep 19 19:53:06 PDT 2017


hfinkel added a comment.

In https://reviews.llvm.org/D37421#875900, @MatzeB wrote:

> In https://reviews.llvm.org/D37421#875879, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D37421#874416, @MatzeB wrote:
> >
> > > First: This looks ok and can be committed as is if you want.
> > >
> > > However, IMO using the LNTBased system here is not the best choice; I'd rather try writing a `litsupport` plugin. LNTBased is little used and IMO less flexible than the cmake+lit plugin approach.
> >
> >
> > I agree, having this in litsupport seems better (and also solves the duplication/symlink problem). I imagine that the plugin could append a function to metric_collectors that reads the CSV and reports the associated timing data. @MatzeB, is there documentation on the litsupport plugin interface anywhere? (if all we have is what's in the README.md, we should write some more).
>
>
> That's all there is right now. I'll write some paragraphs on how to get started in the README file and make sure the interfaces have better documentation comments.
>
> I played around with this yesterday, and this should work to collect the gtest output:
>
> litsupport/microbenchmark.py
>
>   '''Test module to collect google benchmark results.'''
>   from litsupport import shellcommand
>   from litsupport import testplan
>   import csv
>   import lit.Test
>  
>  
>   def _mutateCommandLine(context, commandline):
>       cmd = shellcommand.parse(commandline)
>       cmd.arguments.append("--benchmark_format=csv")
>       # We need stdout ourselves to get the benchmark CSV data.
>       if cmd.stdout is not None:
>           raise Exception("Rerouting stdout not allowed for microbenchmarks")
>       benchfile = context.tmpBase + '.bench.csv'
>       cmd.stdout = benchfile
>       context.microbenchfiles.append(benchfile)
>  
>       return cmd.toCommandline()
>  
>  
>   def _mutateScript(context, script):
>       return testplan.mutateScript(context, script, _mutateCommandLine)
>  
>  
>   def _collectMicrobenchmarkTime(context, microbenchfiles):
>       result = 0.0
>       for f in microbenchfiles:
>           with open(f) as inp:
>               lines = csv.reader(inp)
>               # First line: "name,iterations,real_time,cpu_time,time_unit..."
>               for line in lines:
>                   if line[0] == 'name':
>                       continue
>                   # Note that we cannot create new tests here, so for now we
>                   # just add up all the numbers.
>                   result += float(line[3])
>       return {'exec_time': lit.Test.toMetricValue(result/1000000.)}
>  
>  
>   def mutatePlan(context, plan):
>       context.microbenchfiles = []
>       plan.runscript = _mutateScript(context, plan.runscript)
>       plan.metric_collectors.append(
>           lambda context: _collectMicrobenchmarkTime(context,
>                                                      context.microbenchfiles)
>       )
>
>
> I found two problems with this at the moment:
>
> - The list of litsupport plugins is currently global. Thus enabling this microbenchmarking plugin will make the non-microbenchmarking tests fail. It should be easy enough to change this to a local setting so that `lit.local.cfg` files can override the plugin list (just have to convince myself that this is a good feature to have :).


What's the design space here? Would it be better to be able to name plugins in the .test file?
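
To make the trade-off concrete, the lit.local.cfg variant might look something like the sketch below. This is only a sketch: it assumes the plugin list is exposed on the lit config object as `test_modules`, and the `timeit`/`microbenchmark` module names are illustrative.

  # lit.local.cfg (hypothetical per-directory override of the litsupport
  # plugin list; assumes the list is available as config.test_modules)
  config.test_modules = list(config.test_modules)
  if 'timeit' in config.test_modules:
      # Drop the generic timing module for this directory...
      config.test_modules.remove('timeit')
  # ...and collect google benchmark CSV output instead.
  config.test_modules.append('microbenchmark')

Naming plugins in the .test file itself would make the choice visible per test rather than per directory, but either way the plugin selection would have to stop being global, which is the problem noted above.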

> - As we discussed before: lit currently needs to have 1 file to represent each test. This means we cannot produce results for 10 "benchmarks" inside a gtest executable unless we also produce 10 .test files for it.
> 
>   Fixing the 2nd point is probably harder; on the other hand the workaround of simply creating a .test file for each benchmark is not too bad (the current XRay cmake already does exactly this I think).

Do we need to fix this, or can we have the different benchmarks within the executable simply produce lots of different metrics for the same test? It seems like we need the ability to track multiple metrics per test anyway. Maybe this implies updates to LNT so that the interface does something reasonable?
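
For instance, a variant of the collector above could report one metric per google benchmark entry instead of a single summed exec_time. This is only a sketch; the `microbenchmark_time_<name>` metric names are made up, and whether LNT would display such per-benchmark metrics sensibly is exactly the open question:

  # Sketch: one metric per benchmark line (hypothetical metric names).
  import csv
  import lit.Test


  def _collectMicrobenchmarkMetrics(context, microbenchfiles):
      metrics = {}
      for f in microbenchfiles:
          with open(f) as inp:
              for line in csv.reader(inp):
                  if line[0] == 'name':
                      continue  # skip the CSV header row
                  # line[3] is cpu_time; report each benchmark separately
                  # instead of summing them into one exec_time value.
                  metrics['microbenchmark_time_' + line[0]] = \
                      lit.Test.toMetricValue(float(line[3]) / 1000000.)
      return metrics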


https://reviews.llvm.org/D37421




