<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 16, 2017, at 4:41 PM, Xinliang David Li via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><br class="Apple-interchange-newline"><br class=""><div class="gmail_quote">On Thu, Feb 16, 2017 at 3:45 PM, Chandler Carruth via llvm-dev<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span><span class="Apple-converted-space"> </span>wrote:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr" class="">First off, I just want to say wow and thank you. This kind of data is amazing. 
=D<br class=""><br class=""><div class="gmail_quote"><span class=""><div dir="ltr" class="">On Thu, Feb 16, 2017 at 2:46 AM Kristof Beyls <<a href="mailto:Kristof.Beyls@arm.com" target="_blank" class="">Kristof.Beyls@arm.com</a>> wrote:</div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">The biggest relative code size increases indeed didn't happen for the biggest programs, but instead for a few programs weighing in at about 100KB.</div><div class="m_-5579311497984940082gmail_msg">I'm assuming the Google benchmark set covers much bigger programs than the ones displayed here.</div><div class="m_-5579311497984940082gmail_msg">FWIW, the cluster of programs where code size increases between 60% to 80% with a size of about 100KB, all come from MultiSource/Benchmarks/<wbr class="">TSVC. Interestingly, these programs seem to have float and double variants,  e.g. (MultiSource/Benchmarks/TSVC/<wbr class="">Searching-flt/Searching-flt and MultiSource/Benchmarks/TSVC/<wbr class="">Searching-dbl/Searching-dbl), and the code size bloat only happens for the double variants.</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">I think we should definitely look at this (as it seems likely to be a bug somewhere), but I'm also not overly concerned with size regressions in the TSVC benchmarks which are unusually loop heavy and small. 
We've had several other changes that caused big fluctuations here.</div><span class=""><div class=""> </div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">I think it may still be worthwhile to check if this also happens on other architectures, and why it happens only for the double-variants, not the float-variants.</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">+1</div><span class=""><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">The second chart shows relative code size increase (vertical axis) vs relative performance improvement (horizontal axis):<br class=""></div><div class="m_-5579311497984940082gmail_msg">I manually checked the cause of the 3 biggest performance regressions (proprietary benchmark1: -13.70%; MultiSource/Applications/<wbr class="">hexxagon/hexxagon: -10.10%; MultiSource/Benchmarks/<wbr class="">FreeBench/fourinarow/<wbr class="">fourinarow: -5.23%).</div><div class="m_-5579311497984940082gmail_msg">For the proprietary benchmark and hexxagon, the code generation didn't change for the hottest parts, so the regression is probably caused by micro-architectural effects of code layout changes.</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">This is always good to know, even though it is frustrating. =]</div><span class=""><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">For fourinarow, there seemed to be a lot more spill/fill code, so the regression is probably due to suboptimal register allocation.</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">This is something we should probably look at. 
If you have the output lying around, maybe file a PR about it?</div><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><span class=""><div class="m_-5579311497984940082gmail_msg">The third chart below just zooms in on the above chart to the -5% to 5% performance improvement range:<br class=""></div></span><div class="m_-5579311497984940082gmail_msg"><span id="cid:15a4855865b9015c7b73"><unroll_codesize_vs_performance_zoom.png></span></div><span class=""><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg">Whether to enable the increase in unroll threshold only at O3 or also at O2: I don't have a strong opinion based on the above data.</div></span></div></div></blockquote><div class=""><br class=""></div><div class="">FWIW, this data seems to clearly indicate that we don't get performance wins with any consistency when the code size goes up (and thus the change has impact). As a consequence, I pretty strongly suspect that this should be *just* used at O3 at least for now.</div></div></div></blockquote><div class=""><br class=""></div><div class="">The correlation is there -- when there is performance improvement, there is size increase.</div></div></div></div></div></blockquote><div><br class=""></div><div>I didn’t quite get this impression from the graph, the highest improvement didn’t come with code size increase:</div><div><br class=""></div><div><img apple-inline="yes" id="17FCBC2A-04A2-4737-B05E-5D7B8522FA0E" src="cid:5A91AD70-5A91-4149-B158-089D2AA9210C@hsd1.ca.comcast.net." 
class=""></div><div><br class=""></div><div><br class=""></div><div><br class=""></div><div>And on the other hand, there were many code-size increases without any runtime improvement. </div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""> The opposite is not true -- but that is expected. If the speedup is in the cold path, there won't be a visible performance improvement, only a size increase.</div><div class=""><br class=""></div><div class="">Put another way: if we reduce the threshold, there will be a sizable size improvement for many benchmarks without regressing performance. Shall we use the reduced threshold for O2 instead?</div></div></div></div></div></blockquote><div><br class=""></div><div>Yes, all the ones here IIUC:</div><div><br class=""></div><div><img apple-inline="yes" id="0A8FD80C-5340-40F2-9364-F0AA4F1F912F" src="cid:07371745-F6BD-4E58-A520-BEB9D7A85100@hsd1.ca.comcast.net." class=""></div><div><br class=""></div><div><br class=""></div><div><br class=""></div><div>However, we could probably argue that these “small” benchmarks should use -Os if they're sensitive to size, so O2 would be fine with the more aggressive threshold (as larger programs aren’t affected).</div><div><br class=""></div><div>With a good heuristic we’d have every dot forming a straight line:   code_size_increase = m * runtime_perf (with m as small as possible). The current lack of shape (or the exact opposite distribution to the ideal I imagine above) seems to show that our “profitability” heuristics are pretty bad and that the current threshold knob is a bad predictor of runtime performance. 
</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class="">It is usually tiny programs whose size is sensitive to this change. The size vs. size-increase chart confirms that point. There is basically no large size increase for programs > 1MB (clang release build size is 78M). In other words, I believe the actual size impact on real-world applications should be negligible.  This behavior is very different from, for instance, increasing the inline threshold, which will have a size impact across the board. The latter is certainly more limited to higher optimization levels.</div><div class=""><br class=""></div><div class="">thanks,</div><div class=""><br class=""></div><div class="">David</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr" class=""><div class="gmail_quote"><div class=""><br class=""></div><div class="">I see two further directions for Dehao that make sense here (at least to me):</div><div class="">1) I suspect we should investigate *why* the size increases are happening without helping speed. 
I can imagine some reasons that this would of course happen (cold loops getting unrolled), but especially in light of the oddities you point out above, I suspect there may be issues where more unrolling is uncovering other problems and if we fix those other problems the shape of things will be different. We should at least address the issues you uncovered above.</div><div class=""><br class=""></div><div class="">2) If this turns out to be architecture specific (it seems that way at least initially, but hard to tell for sure with different benchmark sets) we might make AArch64 and x86 use different thresholds here. I'm skeptical about this though. I suspect we should do #1, and we'll either get a different shape, or just decide that O3 is more appropriate.</div><span class=""><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">Maybe the compile time impact is what should be driving that discussion the most? I'm afraid I don't have compile time numbers.</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">FWIW, I strongly suspect that for *this* change, compile time and code size will be pretty precisely correlated. 
Dehao's data shows that to be true in several cases certainly.</div><span class=""><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg">Ultimately, I guess this boils down to what exactly the difference is in intent between O2 and O3, which seems like a never-ending discussion...</div></div></div></blockquote><div class=""><br class=""></div></span><div class="">The definitions I am working from are here:</div><div class=""><a href="https://github.com/llvm-project/llvm-project/blob/master/llvm/include/llvm/Passes/PassBuilder.h#L81-L90" target="_blank" class="">https://github.com/llvm-<wbr class="">project/llvm-project/blob/<wbr class="">master/llvm/include/llvm/<wbr class="">Passes/PassBuilder.h#L81-L90</a><br class=""></div><div class=""><br class=""></div><div class="">I've highlighted the part that makes me think O3 is better here: the code size increases (and thus compile time increases) don't seem to correspond to runtime improvements.</div><span class=""><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div>Hoping you find this useful,</div></div></blockquote><div class=""><br class=""></div></span><div class="">Very. Once again, this kind of data and analysis is awesome. 
=D </div><span class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg">Kristof</div></div><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;"><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"><blockquote type="cite" class="m_-5579311497984940082gmail_msg"><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"><div class="gmail_quote m_-5579311497984940082gmail_msg"><div dir="ltr" class="m_-5579311497984940082gmail_msg">On Tue, Feb 14, 2017 at 1:06 PM Kristof Beyls via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="m_-5579311497984940082gmail_msg" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br class="m_-5579311497984940082gmail_msg"></div><blockquote class="gmail_quote m_-5579311497984940082gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div class="m_-5579311497984940082gmail_msg" style="word-wrap: break-word;">I've run the patch on <a href="https://reviews.llvm.org/D28368" class="m_-5579311497984940082gmail_msg" target="_blank">https://reviews.llvm.org/<wbr class="">D28368</a> on the test-suite and other benchmarks, for AArch64 -O3 -fomit-frame-pointer, both for Cortex-A53 and Cortex-A57.<div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg">The geomean over the few hundred programs in there is roughly the same for Cortex-A53 and Cortex-A57: a bit over 1% improvement in execution speed for a bit over 5% increase in code 
size.</div><div class="m_-5579311497984940082gmail_msg">Obviously I wouldn't want this for optimization levels where code size is of any concern, like -Os or -Oz, but don't have a problem with this going in for other optimization levels where this isn't a concern.</div><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg">Thanks,</div><div class="m_-5579311497984940082gmail_msg"><br class="m_-5579311497984940082gmail_msg"></div><div class="m_-5579311497984940082gmail_msg">Kristof</div></div></blockquote></div></div></blockquote></div><br class="m_-5579311497984940082gmail_msg"></div></blockquote></span></div></div><br class="">______________________________<wbr class="">_________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr class="">mailman/listinfo/llvm-dev</a><br class=""><br class=""></blockquote></div><br class=""></div></div></div></blockquote></div><br class=""></body></html>