<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Kristof,<div class=""><br class=""></div><div class="">Thanks for the updated numbers.</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On May 31, 2017, at 7:45 AM, Kristof Beyls <<a href="mailto:kristof.beyls@arm.com" class="">kristof.beyls@arm.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;"><div class=""><br class="Apple-interchange-newline">On 31 May 2017, at 15:33, Diana Picus via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Cool test :)<div class="">It seems to work fine now, I don't see any new failures. 
IIUC, Kristof is also giving it another run.</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">Diana</div></div></div></blockquote><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Latest comparisons on my side, after picking up r304244, i.e. the correct Localizer pass.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">* CTMark compile time, comparing "-O0 -g" vs '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0': about 6% increase with globalisel. This was about 3.5% before the Localizer pass landed.</div></div></blockquote><div><br class=""></div><div>That one is surprising too. I wouldn’t have expected this pass to show up in the compile time profile, at least not to this extent.</div><div>What is the biggest offender?</div><br class=""><blockquote type="cite" class=""><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">* My usual performance benchmarking run: 8.5% slow-down. 
This was about 9.5% before the Localizer pass landed, so a slight improvement.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">* Code size: 3.14% larger. This was about 2.8% before the Localizer pass landed, so a slight regression.</div></div></blockquote><div><br class=""></div><div>That one is surprising. Do you have an idea of what is happening?</div><div>Alternatively, if you can point me to the biggest offender, I can have a look.</div><div><br class=""></div><div>The only thing I can think of is that we duplicate constants that are expensive to materialize. If that’s the case, Ahmed and I were discussing an alternative to the localizer pass that would operate during InstructionSelect, so that may be worth pursuing.</div><br class=""><blockquote type="cite" class=""><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">* Debug info quality: I didn't do another recheck, trusting that the Localizer pass wouldn't change debug info quality.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">* Stack size usage: I don't know of a good way to measure this, but Diana's experiments show that at least for bootstrapping it went from "problematically bad" to "OK".</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: 
normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Thanks,</div></div></blockquote><div><br class=""></div>Thanks,</div><div>-Quentin<br class=""><blockquote type="cite" class=""><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Kristof</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: 
normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;"><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On 30 May 2017 at 22:57, Quentin Colombet<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="" style="word-wrap: break-word;">Hi Diana,<div class=""><br class=""></div><div class="">I’ve actually gone ahead and pushed the fix as I was able to produce a small reproducer.</div><div class=""><br class=""></div><div class="">This is <span class="" style="font-family: Menlo; font-size: 11px; background-color: rgb(255, 255, 255);">r304244</span><div class=""><br class=""></div><div class="">Let me know if you encounter any other problem.</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">-Quentin<br class=""><blockquote type="cite" class=""><div class=""><div class="h5"><div class="">On May 30, 2017, at 7:42 AM, Quentin Colombet via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="m_1695695160749863078Apple-interchange-newline"></div></div><div class=""><div class=""><div class=""><div class="h5">Thanks Diana.<br class=""><br class="">That is indeed the assumption in the code and this is obviously wrong.<br class=""><br class="">Could you try the attached patch?<br class=""><br class="">(I haven’t even tried to compile it though)<br class=""><br class="">Cheers,<br class="">-Quentin<br class=""></div></div><span 
id="m_1695695160749863078cid:1F6F0F97-86C9-415E-9F99-FC6E97F24E79@apple.com" class=""><localizer_tentative_fix.diff></span><br class=""><blockquote type="cite" class=""><div class=""><div class="h5">On May 30, 2017, at 6:56 AM, Diana Picus <<a href="mailto:diana.picus@linaro.org" target="_blank" class="">diana.picus@linaro.org</a>> wrote:<br class=""><br class="">Hi Quentin,<br class=""><br class="">I've attached a reproducer for the problem.<br class=""><br class="">I've described what I think the problem is in the file, but the short<br class="">version is that the localizer shouldn't assume that the iteration<br class="">order for the uses corresponds to the logical order of instructions in<br class="">a basic block (we're now localizing before the first use that we find,<br class="">but that may be later in the basic block, so we'd end up with uses<br class="">before the def).<br class=""><br class="">I'm not sure it's possible to test this without running a couple of<br class="">passes. You might be able to trigger it only with reg bank select +<br class="">localize, but I haven't tried. 
Using only the localizer would mean<br class="">that the iteration order for the uses would be the order in which<br class="">they're read in, so you wouldn't have this problem.<br class=""><br class="">Hope that helps,<br class="">Diana<br class=""><br class=""><br class="">On 29 May 2017 at 10:06, Diana Picus <<a href="mailto:diana.picus@linaro.org" target="_blank" class="">diana.picus@linaro.org</a>> wrote:<br class=""><blockquote type="cite" class="">Thanks Quentin, it's in progress now, I'll let you know how it goes.<br class=""><br class="">Cheers,<br class="">Diana<br class=""><br class="">On 27 May 2017 at 03:36, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>> wrote:<br class=""><blockquote type="cite" class="">Hi Kristof,<br class=""><br class="">I’ve pushed the localizer in r304051 and added it in the AArch64 O0 pipeline<br class="">in r304052.<br class=""><br class="">I let Diana investigate the seg fault she was seeing.<br class=""><br class="">@Diana, let me know if you need help.<br class=""><br class="">Cheers,<br class="">-Quentin<br class=""><br class="">On May 25, 2017, at 1:53 PM, Quentin Colombet via llvm-dev<br class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""><br class="">Hi Kristof,<br class=""><br class="">On May 25, 2017, at 2:09 AM, Kristof Beyls <<a href="mailto:kristof.beyls@arm.com" target="_blank" class="">kristof.beyls@arm.com</a>> wrote:<br class=""><br class=""><br class="">On 24 May 2017, at 22:01, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>> wrote:<br class=""><br class="">Hi Kristof,<br class=""><br class="">Thanks for going back so fast!<br class=""><br class="">On May 24, 2017, at 12:57 PM, Kristof Beyls <<a href="mailto:kristof.beyls@arm.com" target="_blank" class="">kristof.beyls@arm.com</a>> wrote:<br class=""><br class=""><br 
class="">On 24 May 2017, at 19:31, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>> wrote:<br class=""><br class="">Hi Kristof,<br class=""><br class="">Thanks for the measurements.<br class=""><br class="">On May 24, 2017, at 6:00 AM, Kristof Beyls <<a href="mailto:kristof.beyls@arm.com" target="_blank" class="">kristof.beyls@arm.com</a>> wrote:<br class=""><br class=""><br class="">- Comparing against -O0 without globalisel but with the above regalloc<br class="">options: 5.6% performance drop, 1% code size drop.<br class=""><br class="">In summary, the measurements indicate some good improvements.<br class="">I also haven't measured the impact on compile time.<br class=""><br class=""><br class="">Do you have a means to make this measurement?<br class="">Ahmed did a bunch of compile time measurements on our side and I wanted to<br class="">see if I need to put him on the hook again :).<br class=""><br class=""><br class="">I did a quick setup with CTMark (part of the test-suite). 
I ran each of<br class="">* '-O0 -g',<br class="">* '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0', and<br class="">* '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mllvm<br class="">-optimize-regalloc -mllvm -regalloc=greedy'<br class="">5 times, cross-compiling from X86 to AArch64, and took the median measured<br class="">compile times.<br class="">In summary, I see GlobalISel having a compile time that's 3.5% higher than<br class="">the current -O0 default.<br class="">With enabling the greedy register allocator, this increases to 28%.<br class="">28% is probably too high?<br class=""><br class=""><br class="">I think it is yes.<br class="">I have attached a quick hack to the greedy allocator to feature a fast mode.<br class="">Could you give it a try?<br class=""><br class="">To enable the fast mode, please use (-mllvm) -regalloc-greedy-fast=true<br class="">(default is false).<br class=""><br class=""><br class="">I'm afraid it doesn't seem to save much compile time. On geomean, I see<br class="">about 26% compile time increase against the current -O0 default (compared to<br class="">28% increase for regalloc greedy without your patch).<br class=""><br class=""><br class="">Interesting, I guess a lot of time is spent in the coalescer. 
Could you give<br class="">a try with -join-liveintervals=false?<br class=""><br class=""><br class="">With adding -join-liveintervals=false, I see the compile time increase going<br class="">up to 28% again.<br class=""><br class=""><br class="">Heh, I am mildly surprised we hand much more live-ranges to the allocator<br class="">when we do that.<br class=""><br class=""><br class=""><br class="">Do you know where the time is spent (-time-passes)?<br class=""><br class=""><br class="">I'm afraid I won't have time to have a closer look in the next couple of<br class="">days - I don't know where the time is spent at the moment.<br class=""><br class=""><br class="">Fair enough, will investigate later.<br class=""><br class=""><br class=""><br class="">Anyhow, fixing all of those, although this is I think the right approach,<br class="">will take time, so we can go with the localizer.<br class=""><br class=""><br class="">Right, I don't understand the register allocator well enough to know if that<br class="">compile time overhead can be fixed, while still getting the necessary<br class="">codegen benefits the greedy allocator gives.<br class="">Is there any specific help you're looking for with getting the localizer<br class="">work well enough for production use?<br class=""><br class=""><br class="">I’ll clean-up the WIP patch for the localizer, then you guys can fix the bug<br class="">that you found.<br class=""><br class="">I’ll do that tomorrow.<br class=""><br class="">Cheers,<br class="">-Quentin<br class=""><br class=""><br class="">Thanks,<br class=""><br class="">Kristof<br class=""><br class=""><br class="">______________________________<wbr class="">_________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr 
class="">mailman/listinfo/llvm-dev</a><br class=""><br class=""><br class=""></blockquote></blockquote></div></div><localizer-mo-order.mir><br class=""></blockquote><span class=""><br class="">______________________________<wbr class="">_________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr class="">mailman/listinfo/llvm-dev</a><br class=""></span></div></div></blockquote></div><br class=""></div></div></blockquote></div><br class=""></div>_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></div></blockquote></div></blockquote></div><br class=""></div></body></html>