[LLVMdev] New -O3 Performance tester - Use hardware to get reliable numbers
Tobias Grosser
tobias at grosser.es
Tue Jan 7 10:06:51 PST 2014
Hi,
I would like to announce a new set of LNT -O3 performance testers.
In a discussion titled "Question about results reliability in LNT
infrustructure" Anton suggested that one way to get statistically
reliable test results from the LNT infrastructure is to use a larger
sample size (5-10) as well as a more robust statistical test
(Wilcoxon/Mann-Whitney). Another requirement to make the performance
results we get from our testers useful is to have a per-commit
performance run.
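As a rough illustration of the suggested test, the sketch below compares two
sets of execution-time samples with a two-sided Mann-Whitney U test. It is not
LNT code: the function name and the sample values are made up for this example,
and the p-value uses the normal approximation with tie and continuity
corrections, which is only approximate at sample sizes of 5-10.

```python
import math


def mann_whitney_u(before, after):
    """Two-sided Mann-Whitney U test (normal approximation).

    Hypothetical helper for illustration, not part of LNT.
    Returns (U, p_value).
    """
    n1, n2 = len(before), len(after)
    # Tag each sample with its group (0 = before, 1 = after) and sort by value.
    combined = sorted([(v, 0) for v in before] + [(v, 1) for v in after])

    # Assign mid-ranks so that tied values share the average of their ranks.
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = mid_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_before = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u1 = rank_sum_before - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a change

    # Continuity-corrected z score, clamped at zero.
    z = max((abs(u - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u, p_value


# Made-up timings (seconds) for one test case: 10 runs before and after.
before = [10.0, 10.1, 9.9, 10.2, 10.0, 10.1, 9.8, 10.0, 10.1, 9.9]
after = [v + 1.0 for v in before]

u_changed, p_changed = mann_whitney_u(before, after)
u_same, p_same = mann_whitney_u(before, before)
print("shifted samples:   U = %.1f, p = %.5f" % (u_changed, p_changed))
print("identical samples: U = %.1f, p = %.5f" % (u_same, p_same))
```

If SciPy is available, scipy.stats.mannwhitneyu is probably the better
choice in practice; the hand-written version above only avoids the dependency.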
To that end, I have set up four identical machines* that publicly
report LNT results for 'clang -O3' at:
http://llvm.org/perf/db_default/v4/nts/machine/34
On average, each run currently covers a group of 3-5 commits. As most
commits obviously do not impact performance, this seems to be enough to
track down performance regressions/changes easily.
The results that have been reported so far seem to provide sufficient
information to catch performance changes. Specifically, when setting the
aggregation function to median, most runs are shown to not impact
performance:
e.g.:
http://llvm.org/perf/db_default/v4/nts/19939?num_comparison_runs=10&test_filter=&test_min_value_filter=&aggregation_fn=median&compare_to=19934&submit=Update
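To illustrate why the median helps here, the sketch below aggregates the
samples of two runs with both the median and the mean. This is not how LNT
computes its reports; the function name and the timings are invented for the
example, and a single noisy sample is planted in the previous run.

```python
from statistics import mean, median


def relative_change(previous, current, aggregate):
    """Relative change between two sample sets under an aggregation function.

    Hypothetical helper for illustration, not part of LNT.
    """
    prev_value = aggregate(previous)
    return (aggregate(current) - prev_value) / prev_value


# Made-up timings (seconds); 1.45 is one noisy outlier in the previous run.
previous = [1.00, 1.01, 0.99, 1.00, 1.45]
current = [1.00, 1.02, 0.99, 1.01, 1.00]

median_change = relative_change(previous, current, median)
mean_change = relative_change(previous, current, mean)
print("median-based change: %+.1f%%" % (100 * median_change))
print("mean-based change:   %+.1f%%" % (100 * mean_change))
```

With the median, the outlier is ignored and the two runs compare as equal,
whereas the mean reports a speedup of several percent that never happened.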
A couple of runs still report performance differences, but the
performance graphs of the affected test cases make it very clear that
these are false positives due to test case noise.
Here comes the point of this mail. I am currently not sure when I will
find time to improve the LNT infrastructure to take advantage of the data
provided. So if someone else would like to have a look and, e.g., add
the Wilcoxon/Mann-Whitney test, this would be highly appreciated.
I also have a couple more machines. Hence, once the LNT infrastructure
is in place, we can use them to increase the reliability of the results
even more.
Cheers,
Tobias
* They also have sufficiently close performance characteristics when
running LNT tests for the same version.