<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 24 May 2017, at 17:31, David Blaikie <<a href="mailto:dblaikie@gmail.com" class="">dblaikie@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, May 23, 2017 at 10:51 AM Daniel Sanders via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word; line-break: after-white-space;" class=""><div class="">Could you give<span class="Apple-converted-space"> </span><a href="https://reviews.llvm.org/differential/diff/99949/" target="_blank" class="">https://reviews.llvm.org/differential/diff/99949/</a> a try? It brings back the reverted commit and fixes two significant compile-time issues. Assuming it works for you too, I'll finish off the patches and post them individually.</div><div class=""><br class=""></div><div class="">The first one removes the single-use lambdas in the generated code. These turn out to be _really_ expensive. Replacing them with equivalent gotos saves 11 million allocations (~57%) during the course of compiling AArch64InstructionSelector.cpp.o. 
The cumulative number of bytes allocated also drops by ~4GB (~36%).</div></div></blockquote><div class=""><br class="">(this is outside my wheelhouse, so just as an aside): Could you explain further what aspect of the change saved allocations? Lambdas themselves don't allocate memory (std::function of a stateful lambda may allocate memory - but I didn't see any std::function in your change, though I might've missed it), so I'm guessing it's something else/some other aspect of the code in/outside the lambdas and where it moved that changed the allocation pattern?<br class=""></div></div></div></div></blockquote><div><br class=""></div><div>My tools don't tell me which allocations no longer occur, but compiling the lambdas itself requires memory allocations in the compiler. I believe the allocations come from the various optimization and analysis passes, SelectionDAG, MC, etc. that run for each of the MachineFunctions corresponding to the lambdas.</div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word; line-break: after-white-space;" class=""><div class="">The second one is to split up the functions by the number of operands in the top-level instruction. 
This constrains the scale of the task the register allocator needs to deal with in X86InstructionSelection.cpp.o.</div></div><div style="word-wrap: break-word; line-break: after-white-space;" class=""><div class=""><div class=""><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On 22 May 2017, at 10:42, Diana Picus <<a href="mailto:diana.picus@linaro.org" target="_blank" class="">diana.picus@linaro.org</a>> wrote:</div><br class="m_-1398107004193548205Apple-interchange-newline"><div class=""><div class="">Nope, no sanitizers.<br class=""><br class="">On 22 May 2017 at 11:38, Daniel Sanders <<a href="mailto:daniel_l_sanders@apple.com" target="_blank" class="">daniel_l_sanders@apple.com</a>> wrote:<br class=""><blockquote type="cite" class="">Is that with -fsanitize=memory too?<br class=""><br class="">I'm currently building ToT with r303258 reverted. Once that's done I'll commit the revert and start investigating fixes.<br class=""><br class=""><blockquote type="cite" class="">On 22 May 2017, at 10:22, Diana Picus <<a href="mailto:diana.picus@linaro.org" target="_blank" class="">diana.picus@linaro.org</a>> wrote:<br class=""><br class="">Hi Daniel,<br class=""><br class="">I did your experiment on a TK1 machine (same as the bots) and for r303258 I get:<br class="">real 18m28.882s<br class="">user 35m37.091s<br class="">sys 0m44.726s<br class=""><br class="">and for r303259:<br class="">real 50m52.048s<br class="">user 88m25.473s<br class="">sys 0m46.548s<br class=""><br class="">If I can help investigate, please let me know, otherwise we can just<br class="">try your fixes and see how they affect compilation time.<br class=""><br class="">Thanks,<br class="">Diana<br class=""><br class="">On 22 May 2017 at 10:49, Daniel Sanders <<a href="mailto:daniel_l_sanders@apple.com" target="_blank" class="">daniel_l_sanders@apple.com</a>> wrote:<br class=""><blockquote type="cite" class="">r303341 is the re-commit of the r303259 which 
tripled the number of rules<br class="">that can be imported into GlobalISel from SelectionDAG. A compile time<br class="">regression is to be expected but when I looked into it I found it was ~25s<br class="">on my machine for the whole incremental build rather than the ~12mins you<br class="">are seeing. I'll take another look.<br class=""><br class="">I'm aware of a couple easy improvements we could make to the way the<br class="">importer works. I was leaving them until we change it over to a state<br class="">machine but the most obvious is to group rules by their top-level gMIR<br class="">instruction. This would reduce the cost of the std::sort that handles the<br class="">rule priorities in generating the source file and will also make it simpler<br class="">for the compiler to compile it.<br class=""><br class=""><br class="">On 21 May 2017, at 11:16, Vitaly Buka <<a href="mailto:vitalybuka@google.com" target="_blank" class="">vitalybuka@google.com</a>> wrote:<br class=""><br class="">It must be r303341, I commented on corresponding llvm-commits thread.<br class=""><br class="">On Fri, May 19, 2017 at 7:34 AM, Diana Picus via llvm-dev<br class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""><blockquote type="cite" class=""><br class="">Ok, thanks. 
I'll try to do a bisect next week to see if I can find it.<br class=""><br class="">Cheers,<br class="">Diana<br class=""><br class="">On 19 May 2017 at 16:29, Daniel Sanders <<a href="mailto:daniel_l_sanders@apple.com" target="_blank" class="">daniel_l_sanders@apple.com</a>><br class="">wrote:<br class=""><blockquote type="cite" class=""><br class=""><blockquote type="cite" class="">On 19 May 2017, at 14:54, Daniel Sanders via llvm-dev<br class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""><br class="">r303259 will have increased compile-time since it tripled the number of<br class="">importable<br class="">SelectionDAG rules but a quick measurement building the affected file:<br class=""> ninja<br class="">lib/Target/&lt;Target&gt;/CMakeFiles/LLVM&lt;Target&gt;CodeGen.dir/&lt;Target&gt;InstructionSelector.cpp.o<br class="">for both ARM and AArch64 didn't show a significant increase. I'll check<br class="">whether<br class="">it made a difference to linking.<br class=""></blockquote><br class="">I don't think it's r303259. 
Starting with a fully built r303259, then<br class="">updating to r303258 and running 'ninja' gives me:<br class=""> real 2m28.273s<br class=""> user 13m23.171s<br class=""> sys 0m47.725s<br class="">then updating to r303259 and running 'ninja' again gives me:<br class=""> real 2m19.052s<br class=""> user 13m38.802s<br class=""> sys 0m44.551s<br class=""><br class=""><blockquote type="cite" class="">sanitizer-x86_64-linux-fast also timed out after one of my commits this<br class="">morning.<br class=""><br class=""><blockquote type="cite" class="">On 19 May 2017, at 14:14, Diana Picus <<a href="mailto:diana.picus@linaro.org" target="_blank" class="">diana.picus@linaro.org</a>> wrote:<br class=""><br class="">Hi,<br class=""><br class="">We've noticed that recently some of our bots (mostly<br class="">clang-cmake-armv7-a15 and clang-cmake-thumbv7-a15) started timing out<br class="">whenever someone commits a change to TableGen:<br class=""><br class="">r303418:<br class=""><a href="http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7268" target="_blank" class="">http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7268</a><br class="">r303346:<br class=""><a href="http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7242" target="_blank" class="">http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7242</a><br class="">r303341:<br class=""><a href="http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7239" target="_blank" class="">http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7239</a><br class="">r303259:<br class=""><a href="http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7198" target="_blank" class="">http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7198</a><br class=""><br class="">TableGen changes before that (I checked about 3-4 of them) don't have<br class="">this problem:<br class="">r303253:<br class=""><a 
href="http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7197" target="_blank" class="">http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7197</a><br class=""><br class="">That one in particular actually finishes the whole build in 635s,<br class="">which is only a bit over 50% of the timeout limit (1200s). So, between<br class="">r303253 and now, something happened that made full builds<br class="">significantly slower. Does anyone have any idea what that might have<br class="">been? Also, has anyone noticed this on other bots?<br class=""><br class="">Thanks,<br class="">Diana<br class=""></blockquote><br class="">_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class=""></blockquote><br class=""></blockquote>_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class=""></blockquote><br class=""><br class=""><br class=""></blockquote></blockquote><br class=""></blockquote></div></div></blockquote></div><br class=""></div></div></div></div>_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank" 
class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></blockquote></div></div></div></blockquote></div><br class=""></body></html>