<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Mar 17, 2017, at 2:02 PM, Vedant Kumar &lt;<a href="mailto:vsk@apple.com" class="">vsk@apple.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class=""><br class="Apple-interchange-newline">On Mar 17, 2017, at 11:50 AM, Mikhail Zolotukhin via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>&gt; wrote:<br class=""><br class="">Hi,<br class=""><br class="">One of the most time-consuming passes in the LLVM middle-end is InstCombine (see e.g. [1]). It is a very powerful pass capable of doing all the crazy stuff, and new patterns are constantly being introduced there. The problem is that we often use it just as a clean-up pass: it's scheduled 6 times in the current pass pipeline, and each time it's invoked it checks all known patterns. That sounds OK for O3, where we try to squeeze out as much performance as possible, but it is excessive for other opt-levels. 
InstCombine has an ExpensiveCombines parameter to address that, but I think it's underused at the moment.<br class=""><br class="">To find out which patterns are important and which are rare, I profiled clang using CTMark and got the following coverage report:<br class="">&lt;InstCombine_covreport.html&gt;<br class="">(beware, the file is ~6MB).<br class=""><br class="">Guided by this profile, I moved some patterns under the "if (ExpensiveCombines)" check, which, as expected, was neutral for runtime performance but improved compile time. The testing results are below (measured for Os).<br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">It'd be nice to double-check that any runtime performance loss at -O2 is negligible. But this sounds like a great idea!</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote>I forgot to mention that I ran SPEC2006/INT with "-Os" on ARM64 and didn't see any changes in runtime performance. 
I can run O2 testing as well over the weekend.</div><div><br class=""></div><div>Michael<br class=""><blockquote type="cite" class=""><div class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">vedant</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">Performance Improvements - Compile Time<span class="Apple-tab-span" style="white-space: pre;"> </span>Δ<span class="Apple-converted-space"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span>Previous<span class="Apple-tab-span" style="white-space: pre;"> 
</span>Current<span class="Apple-tab-span" style="white-space: pre;"> </span>σ<span class="Apple-converted-space"> </span><br class="">CTMark/sqlite3/sqlite3<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.55%<span class="Apple-tab-span" style="white-space: pre;"> </span>6.8155<span class="Apple-tab-span" style="white-space: pre;"> </span>6.7102<span class="Apple-tab-span" style="white-space: pre;"> </span>0.0081<br class="">CTMark/mafft/pairlocalalign<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.05%<span class="Apple-tab-span" style="white-space: pre;"> </span>8.0407<span class="Apple-tab-span" style="white-space: pre;"> </span>7.9559<span class="Apple-tab-span" style="white-space: pre;"> </span>0.0193<br class="">CTMark/ClamAV/clamscan<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.02%<span class="Apple-tab-span" style="white-space: pre;"> </span>11.3893<span class="Apple-tab-span" style="white-space: pre;"> </span>11.2734<span class="Apple-tab-span" style="white-space: pre;"> </span>0.0081<br class="">CTMark/lencod/lencod<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.01%<span class="Apple-tab-span" style="white-space: pre;"> </span>12.8763<span class="Apple-tab-span" style="white-space: pre;"> </span>12.7461<span class="Apple-tab-span" style="white-space: pre;"> </span>0.0244<br class="">CTMark/SPASS/SPASS<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.01%<span class="Apple-tab-span" style="white-space: pre;"> </span>12.5048<span class="Apple-tab-span" style="white-space: pre;"> </span>12.3791<span class="Apple-tab-span" style="white-space: pre;"> </span>0.0340<br class=""><br class="">Performance Improvements - Compile Time<span class="Apple-tab-span" style="white-space: pre;"> </span>Δ<span class="Apple-converted-space"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span>Previous<span class="Apple-tab-span" style="white-space: pre;"> 
</span>Current<span class="Apple-tab-span" style="white-space: pre;"> </span>σ<span class="Apple-converted-space"> </span><br class="">External/SPEC/CINT2006/403.gcc/403.gcc<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.64%<span class="Apple-tab-span" style="white-space: pre;"> </span>54.0801<span class="Apple-tab-span" style="white-space: pre;"> </span>53.1930<span class="Apple-tab-span" style="white-space: pre;"> </span>-<br class="">External/SPEC/CINT2006/400.perlbench/400.perlbench<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.25%<span class="Apple-tab-span" style="white-space: pre;"> </span>19.1481<span class="Apple-tab-span" style="white-space: pre;"> </span>18.9091<span class="Apple-tab-span" style="white-space: pre;"> </span>-<br class="">External/SPEC/CINT2006/445.gobmk/445.gobmk<span class="Apple-tab-span" style="white-space: pre;"> </span>-1.01%<span class="Apple-tab-span" style="white-space: pre;"> </span>15.2819<span class="Apple-tab-span" style="white-space: pre;"> </span>15.1274<span class="Apple-tab-span" style="white-space: pre;"> </span>-<br class=""><br class=""><br class="">Do such changes make sense? 
The patch doesn't change O3, but it does change Os and could potentially change performance there (though I didn't see any changes in my tests).<br class=""><br class="">The patch is attached for reference; if we decide to go for it, I'll upload it to phab:<br class=""><br class="">&lt;0001-InstCombine-Move-some-infrequent-patterns-under-if-E.patch&gt;<br class=""><br class=""><br class="">Thanks,<br class="">Michael<br class=""><br class="">[1]: <a href="http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html" class="">http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html</a><br class=""><br class="">_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></blockquote></div></blockquote></div><br class=""></body></html>