> On Feb 28, 2017, at 12:53 PM, Mehdi Amini <mehdi.amini@apple.com> wrote:
>
>> On Feb 27, 2017, at 11:42 AM, Matthias Braun via llvm-dev <llvm-dev@lists.llvm.org> wrote:
>>
>> In addition to all the good points given in this thread:
>>
>> - Nowadays I'd recommend using 'lnt runtest test-suite' instead of 'nt' to use the cmake/lit-based variant.
>> - Alternatively, if you just need an A/B comparison, run the benchmarks directly as described in http://www.llvm.org/docs/TestSuiteMakefileGuide.html#running-the-test-suite-via-cmake and use test-suite/utils/compare.py.
>
> I'm interested: can you get multi-sample runs with this?

You can just run multiple times and use the `compare.py A_run0 A_run1 A_run2 vs B_run0 B_run1 B_run2` syntax (see also my little introduction in http://lists.llvm.org/pipermail/llvm-dev/2016-October/105739.html).
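Concretely, a minimal sketch of that workflow (compiler names, build directories, and sample counts are placeholders; each lit invocation produces one result file, and -j 1 keeps the benchmarks from disturbing each other):

  # Build the test suite once per compiler, then run it several times
  # to collect samples. <clang-A> and the file names are placeholders.
  cmake -DCMAKE_C_COMPILER=<clang-A> /path/to/test-suite   # in a fresh build-A dir
  make
  llvm-lit -j 1 -o A_run0.json .
  llvm-lit -j 1 -o A_run1.json .
  llvm-lit -j 1 -o A_run2.json .
  # ... repeat in a build-B dir with <clang-B> to get B_run0..2.json ...
  # Then compare the two sample sets:
  /path/to/test-suite/utils/compare.py A_run0.json A_run1.json A_run2.json vs B_run0.json B_run1.json B_run2.json

With several samples per side you also get a feel for the run-to-run noise within each set, which helps judge whether a delta between A and B is real.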
- Matthias

> -- 
> Mehdi

>> - Use --benchmarking-only (lnt) / -DTEST_SUITE_BENCHMARKING_ONLY (cmake) to remove a number of tests that are useless for performance testing (like all the unit tests in there).
>> - I created a blacklist of benchmarks that are noisy for my target by rerunning the test-suite a few times with the same compiler. I can feed this blacklist to `utils/compare.py --filter-blacklist` (a sketch follows after this list).
>> - While we are on the topic: I recommend this talk from last year's dev meeting to dampen the expectation that every good compiler transformation must lead to better (or at least neutral) performance: https://www.youtube.com/watch?v=IX16gcX4vDQ&t=24s I think one lesson we should draw from it is that we can use benchmarking as an indicator of problems, but there is no way around manually checking the assembly differences for the cases where we measured a performance change.
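>>
>> For the blacklist step, a minimal sketch (the file name and its contents are invented for illustration; I believe the blacklist is a plain-text file with one benchmark path per line, but check `utils/compare.py --help` for the exact format):
>>
>>   # noisy-benchmarks.txt (hypothetical contents):
>>   #   MultiSource/Benchmarks/mafft/pairlocalalign
>>   #   SingleSource/Benchmarks/Misc/mandel-2
>>   utils/compare.py --filter-blacklist noisy-benchmarks.txt A.json vs B.json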
>>
>> - Matthias
>>
>>> On Feb 27, 2017, at 12:46 AM, Mikael Holmén via llvm-dev <llvm-dev@lists.llvm.org> wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to run the benchmark suite:
>>> http://llvm.org/docs/TestingGuide.html#test-suite-quickstart
>>>
>>> I'm doing it the lnt way, as described at:
>>> http://llvm.org/docs/lnt/quickstart.html
>>>
>>> I don't know what to expect, but the results seem quite noisy and unstable. E.g., I've done two runs on two commits that differ only by a space in CODE_OWNERS.txt, on my 12-core Ubuntu 14.04 machine, with:
>>>
>>> lnt runtest nt --sandbox SANDBOX --cc <path-to-my-clang> --test-suite /data/repo/test-suite -j 8
>>>
>>> I then get the following top execution-time regressions:
>>> http://i.imgur.com/sv1xzlK.png
>>>
>>> The numbers bounce around a lot if I do more runs.
>>>
>>> Given the amount of noise I see here, I don't know how to sort out significant regressions when I actually make a real change in the compiler.
>>>
>>> Are the above results expected?
>>>
>>> How should I use this?
>>>
>>> As a bonus question: if I instead run the benchmarks with an added -m32:
>>>
>>> lnt runtest nt --sandbox SANDBOX --cflag=-m32 --cc <path-to-my-clang> --test-suite /data/repo/test-suite -j 8
>>>
>>> I get three failures:
>>>
>>> --- Tested: 2465 tests --
>>> FAIL: MultiSource/Applications/ClamAV/clamscan.compile_time (1 of 2465)
>>> FAIL: MultiSource/Applications/ClamAV/clamscan.execution_time (494 of 2465)
>>> FAIL: MultiSource/Benchmarks/DOE-ProxyApps-C/XSBench/XSBench.execution_time (495 of 2465)
>>>
>>> Is this known/expected, or am I doing something stupid?
>>>
>>> Thanks,
>>> Mikael
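P.S. On the noisy numbers in the quoted mail: part of the variance is likely the parallel run itself, since with -j 8 the benchmarks compete for cores, caches, and memory bandwidth. A lower-noise variant of the same invocation, as a sketch (I believe the nt runner takes --multisample to repeat each test, but verify against `lnt runtest nt --help`):

  lnt runtest nt --sandbox SANDBOX --cc <path-to-my-clang> \
      --test-suite /data/repo/test-suite -j 1 --multisample 3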