[llvm-dev] Questions About LLVM Test Suite: Time Units, Re-running benchmarks

Mircea Trofin via llvm-dev llvm-dev at lists.llvm.org
Mon Jul 19 07:36:21 PDT 2021


On Sun, Jul 18, 2021 at 8:58 PM Michael Kruse via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Sun, Jul 18, 2021 at 11:14 Stefanos Baziotis via
> llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Now, to the questions. First, there doesn't seem to be a common time
> > unit for "exec_time" among the different tests. For instance, tests under
> > SingleSource/ seem to use seconds, while MicroBenchmarks seem to use μs,
> > so we can't reliably judge changes across them. I do get that
> > micro-benchmarks are different in nature from Single/MultiSource
> > benchmarks, so maybe one should focus on one or the other depending on
> > what they're interested in.
>
> Usually one does not compare aggregate executions of the entire
> test-suite, but looks for which programs have regressed. In that scenario
> only the relative change of each program between runs matters, so μs are
> only compared to μs and seconds only to seconds.
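
A minimal sketch of that per-program comparison, assuming lit's -o JSON
output keeps per-test results in a top-level "tests" list with a "metrics"
dictionary holding "exec_time" (the field names are an assumption and may
differ between versions); since each program is only compared against
itself, the time unit cancels out:

    import json
    import sys

    def load_times(path):
        # Map test name -> exec_time, in whatever unit that test reports.
        with open(path) as f:
            data = json.load(f)
        return {t["name"]: t["metrics"]["exec_time"]
                for t in data.get("tests", [])
                if "exec_time" in t.get("metrics", {})}

    base = load_times(sys.argv[1])   # e.g. out_baseline.json
    cand = load_times(sys.argv[2])   # e.g. out_candidate.json
    for name in sorted(base.keys() & cand.keys()):
        if base[name] > 0:
            ratio = cand[name] / base[name]
            if ratio > 1.05:  # arbitrary threshold: flag >5% slowdowns
                print(f"{ratio:5.2f}x  {name}")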
>
>
> > In any case, it would at least be great if the JSON data contained the
> > time unit per test, but that is not the case either.
>
> What do you mean? Don't you get the exec_time per program?
>
>
> > Do you think that the lack of time-unit info is a problem? If yes, would
> > you go with adding the time unit to the JSON, or do you want to propose
> > an alternative?
>
> You could also normalize the time unit emitted to the JSON to seconds or
> milliseconds.
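
A rough post-processing sketch along those lines, assuming (as observed
above) that only tests under MicroBenchmarks/ report μs and everything else
already reports seconds; the path-prefix check and the JSON field names are
assumptions, not something the test-suite guarantees:

    import json

    def normalize_to_seconds(path_in, path_out):
        # Rewrite exec_time so every test reports seconds.
        with open(path_in) as f:
            data = json.load(f)
        for test in data.get("tests", []):
            metrics = test.get("metrics", {})
            if "exec_time" in metrics and "MicroBenchmarks/" in test.get("name", ""):
                metrics["exec_time"] /= 1e6  # microseconds -> seconds
        with open(path_out, "w") as f:
            json.dump(data, f, indent=2)

    normalize_to_seconds("out.json", "out_normalized.json")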
>
> >
> > The second question has to do with re-running the benchmarks: I do
> > cmake + make + llvm-lit -v -j 1 -o out.json .
> > but if I run the llvm-lit step another time, it just does/shows nothing.
> > Is there any reason the benchmarks can't be run a second time? Could I
> > somehow run them a second time?
>
> Running the programs a second time did work for me in the past.
> Remember to change the output to another file or the previous .json
> will be overwritten.
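
In other words, after the initial cmake + make, re-running just the lit step
with a fresh output name, e.g. llvm-lit -v -j 1 -o out2.json . (where
out2.json is simply any not-yet-used file name), should execute the
benchmarks again without clobbering the first set of results.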
>
>
> > Lastly, slightly off-topic, but while we're on the subject of
> > benchmarking, do you think it's reliable to run with -j <number of cores>?
> > I'm a little bit wary of the shared caches (because misses should be
> > counted in the CPU time, which is what "exec_time" measures, AFAIU) and of
> > any potential multi-threading that the tests may use.
>
> It depends. You can run in parallel, but then you should increase the
> number of samples (executions) appropriately to counter the increased
> noise. Depending on how many cores your system has, it might not be
> worth it; instead, try to make the system as deterministic as possible
> (single thread, thread affinity, no background processes, perf instead
> of timeit, avoiding context switches, etc.). To avoid systematic bias
> from the same cache-sensitive programs always running in parallel, use
> the --shuffle option.
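
As a rough illustration of the "more samples, pinned to one core" idea
outside the lit harness, something like the following could be used to
sample a single benchmark binary repeatedly; the binary path is
hypothetical, os.sched_setaffinity is Linux-only, and this measures
wall-clock time:

    import os
    import statistics
    import subprocess
    import time

    BINARY = "./SingleSource/Benchmarks/Misc/some-benchmark"  # hypothetical path
    SAMPLES = 10

    # Pin this process (and its children) to a single core to reduce noise.
    os.sched_setaffinity(0, {2})

    samples = []
    for _ in range(SAMPLES):
        start = time.perf_counter()
        subprocess.run([BINARY], check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)

    print(f"median {statistics.median(samples):.4f}s, "
          f"stdev {statistics.stdev(samples):.4f}s over {SAMPLES} runs")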
>
Also, depending on what you are trying to achieve (and what your target
platform is), you could enable perf counter collection
<https://github.com/google/benchmark/blob/main/docs/perf_counters.md>;
if instruction counts are sufficient (for example), the values will probably
not vary much with multi-threading.
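
For microbenchmarks built against Google Benchmark, the linked document
describes selecting counters on the command line (e.g. something like
--benchmark_perf_counters=CYCLES,INSTRUCTIONS, which needs libpfm support);
treat the exact flag and counter names as version-dependent.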

...but it's probably best to avoid system noise altogether. On Intel, AFAIK,
that includes disabling turbo boost and hyper-threading, along with
Michael's recommendations.

> Michael
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>