<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 3, 2017, at 5:08 AM, Diana Picus <<a href="mailto:diana.picus@linaro.org" class="">diana.picus@linaro.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">On 12 June 2017 at 18:54, Diana Picus <<a href="mailto:diana.picus@linaro.org" class="">diana.picus@linaro.org</a>> wrote:<br class=""><blockquote type="cite" class=""><br class="">Hi all,<br class=""><br class="">I added a buildbot [1] running the test-suite with -O0 -global-isel. It runs into the same 2 timeouts that I reported previously on this thread (paq8p and scimark2). It would be nice to make it green before flipping the switch.<br class=""><br class="">At the moment, it lives in an internal buildmaster that I've set up for this purpose. 
If we fix it and it proves to be stable for a week or two, I'll move it to the public master.<br class=""><br class=""></blockquote><br class="">FYI, this is now live on the public master:<br class=""><a href="http://lab.llvm.org:8011/builders/clang-cmake-aarch64-global-isel" class="">http://lab.llvm.org:8011/builders/clang-cmake-aarch64-global-isel</a><br class=""></div></div></blockquote><div><br class=""></div>Sweet!</div><div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class="">I hope people will find it useful.<br class=""></div></div></blockquote><div><br class=""></div><div><br class=""></div><div>Thanks for doing this.</div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><blockquote type="cite" class=""><br class="">Cheers,<br class="">Diana<br class=""><br class="">[1] <a href="http://master2.llvm.validation.linaro.org/builders/clang-cmake-aarch64-global-isel" class="">http://master2.llvm.validation.linaro.org/builders/clang-cmake-aarch64-global-isel</a><br class=""><br class=""><br class="">On 6 June 2017 at 19:11, Quentin Colombet <<a href="mailto:qcolombet@apple.com" class="">qcolombet@apple.com</a>> wrote:<br class=""><blockquote type="cite" class=""><br class="">Thanks Kristof.<br class=""><br class="">Sounds like we'll need to investigate, though I'd say it is not blocking the switch.<br class=""><br class="">At this point I think everybody is on board to flip the switch.<br class="">@Eric, how does that sound to you?<br class=""><br class="">Thanks,<br class="">Q<br class=""><br class="">On 1 June 2017 at 07:46, Kristof Beyls <<a href="mailto:Kristof.Beyls@arm.com" class="">Kristof.Beyls@arm.com</a>> wrote:<br class=""><br class=""><br class="">On 31 May 2017, at 17:07, Quentin Colombet <<a href="mailto:qcolombet@apple.com" class="">qcolombet@apple.com</a>> wrote:<br class=""><br class=""><br class="">Latest comparisons on my side, after picking up r304244, i.e. 
the correct Localizer pass.<br class="">* CTMark compile time, comparing "-O0 -g" vs '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0': about 6% increase with globalisel. This was about 3.5% before the Localizer pass landed.<br class=""><br class=""><br class="">That one is surprising too. I wouldn’t have expected this pass to show up in the compile time profile. At least not to this extent.<br class="">What is the biggest offender?<br class=""><br class=""><br class="">Hmmm. So I took the 3.5% compile time overhead from my last measurement before the localizer landed, from around 24th of May.<br class="">When using -ftime-report, I see the Localizer pass typically taking roughly 1% of compile time.<br class="">Maybe another part of GlobalISel became a bit slower since I did that 3.5% measurement?<br class="">Or maybe the Localizer pass changes the structure of the program so that another later pass gets a different compile time profile?<br class="">Basically, I'd have to do more experiments to figure that one out.<br class=""><br class="">As far as where time is spent in the gisel passes themselves, on average, I saw the following on the latest CTMark experiment I ran:<br class="">Avg compile time spent in IRTranslator: 4.61%<br class="">Avg compile time spent in InstructionSelect: 7.51%<br class="">Avg compile time spent in Legalizer: 1.06%<br class="">Avg compile time spent in Localizer: 0.76%<br class="">Avg compile time spent in RegBankSelect: 2.12%<br class=""><br class=""><br class="">* My usual performance benchmarking run: 8.5% slow-down. This was about 9.5% before the Localizer pass landed, so a slight improvement.<br class="">* Code size: 3.14% larger. This was about 2.8% before the Localizer pass landed, so a slight regression.<br class=""><br class=""><br class="">That one is surprising. 
Do you have an idea of what is happening?<br class="">Alternatively, if you can point me to the biggest offender, I can have a look.<br class=""><br class=""><br class="">So the biggest offenders on the mem_bytes metric in LNT are:<br class=""><table class=""><tr class=""><th class=""></th><th class="">O0 -g</th><th class="">O0 -g gisel-with-localizer</th><th class="">O0 -g gisel-without-localizer</th><th class=""></th></tr><tr class=""><td class="">SingleSource/Benchmarks/Misc/perlin</td><td class="">14272</td><td class="">14640</td><td class="">18344</td><td class="">25.95%</td></tr><tr class=""><td class="">SingleSource/Benchmarks/Dhrystone/dry</td><td class="">16560</td><td class="">17144</td><td class="">20160</td><td class="">18.21%</td></tr><tr class=""><td class="">SingleSource/Benchmarks/Stanford/QueensProfile</td><td class="">13912</td><td class="">14192</td><td class="">15136</td><td class="">6.79%</td></tr><tr class=""><td class="">MultiSource/Benchmarks/Trimaran/netbench-url/netbench-url</td><td class="">71400</td><td class="">72272</td><td class="">75504</td><td class="">4.53%</td></tr></table><br class="">I haven't had time to investigate what exact changes make the code size go up that much with the localizer pass in those cases...<br class=""><br class=""><br class="">The only thing I can think of is that we duplicate constants that are expensive to materialize. If that’s the case, we were discussing with Ahmed an alternative to the localizer pass that would operate during InstructionSelect, so that may be worth pursuing.<br class=""><br class=""><br class=""></blockquote><br class=""></blockquote></div></div></blockquote></div><br class=""></body></html>