<div dir="ltr">Honestly, I'm not a huge fan of this change as-is. The set of transforms that were added behind ExpensiveCombines seems awfully strange, and many of them would not lead the reader to believe that they are expensive at all (the SimplifyDemandedInstructionBits and foldICmpUsingKnownBits calls being the obviously expensive routines).<div><br></div><div>The purpose of many of InstCombine's xforms is to canonicalize the IR to make life easier for downstream passes and analyses.</div><div><br></div><div>InstCombine internally relies on knowing what transforms it may or may not perform. This is important: canonicalizations may duel endlessly if we get this wrong; the order of the combines is also important for exactly the same reason (SelectionDAG deals with this problem in a different way with its pattern complexity field).</div><div><br></div><div>Another concern with moving seemingly arbitrary combines under ExpensiveCombines is that it will make it that much harder to understand what is and is not canonical at a given point during the execution of the optimizer.</div><div><br></div><div>I'd be much more interested in a patch which caches the results of frequently called ValueTracking functionality like ComputeKnownBits, ComputeSignBit, etc., whose results often don't change but are not intelligently reused. I imagine that the performance win might be quite comparable. Such a patch would have the benefit of keeping the set of available transforms constant throughout the pipeline while bringing execution time down; I wouldn't be at all surprised if caching the ValueTracking functions resulted in bigger time savings.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 17, 2017 at 5:49 PM, Hal Finkel via llvm-dev <span dir="ltr">&lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class="">
<p><br>
</p>
<div class="m_-5665328138797978423moz-cite-prefix">On 03/17/2017 04:30 PM, Mehdi Amini via
llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<br>
<div>
<blockquote type="cite">
<div>On Mar 17, 2017, at 11:50 AM, Mikhail Zolotukhin
via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;
wrote:</div>
<br class="m_-5665328138797978423Apple-interchange-newline">
<div>
<div style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div>Hi,</div>
<div><br>
</div>
<div>One of the most time-consuming passes
in the LLVM middle-end is InstCombine (see e.g. [1]).
It is a very powerful pass capable of doing all
sorts of crazy stuff, and new patterns are constantly
being introduced there. The problem is that
we often use it just as a clean-up pass: it's
scheduled 6 times in the current pass pipeline,
and each time it's invoked it checks all known
patterns. That sounds OK for O3, where we try to
squeeze out as much performance as possible, but it is
excessive for other opt-levels. InstCombine
has an ExpensiveCombines parameter to address that,
but I think it's underused at the moment.</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Yes, “ExpensiveCombines” was added recently (4.0?
3.9?), but I believe it was always intended to be extended
the way you’re doing it. So I support this effort :)</div>
</div>
</blockquote>
<br></span>
+1<br>
<br>
Also, did your profiling reveal why the other combines are
expensive? Among other things, I'm curious if the expensive ones
tend to spend a lot of time in ValueTracking (getting known bits and
similar)?<br>
<br>
-Hal<div><div class="h5"><br>
<br>
<blockquote type="cite">
<div>
<div><br>
</div>
<div>CC: David for the general direction on InstCombine though.</div>
<div><br>
</div>
<div><br>
</div>
<div>— </div>
<div>Mehdi</div>
<div><br>
</div>
<div><br>
</div>
<br>
<blockquote type="cite">
<div>
<div style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div><br>
</div>
<div>To find out which patterns are
important and which are rare, I profiled clang
using CTMark and got the following coverage
report:</div>
</div>
</div>
</div>
</div>
<span id="m_-5665328138797978423cid:CEA3012D-A9E2-4318-AA90-C372667C70A9@apple.com">&lt;InstCombine_covreport.html&gt;</span>
<div style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div>
<div>(beware, the file is ~6MB).</div>
</div>
<div><br>
</div>
<div>Guided by this profile, I moved some
patterns under the "if (ExpensiveCombines)" check,
which, as expected, turned out to be neutral for
runtime performance but improved compile time.
The test results are below (measured at Os).</div>
<div><br>
</div>
<div>
<table style="font-family:Helvetica,sans-serif;font-size:9pt;border-spacing:0px;border:1px solid black">
<thead><tr>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px;width:500px">Performance Improvements
- Compile Time</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Δ </th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Previous</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Current</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">σ </th>
</tr>
</thead><tbody class="m_-5665328138797978423searchable">
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2" target="_blank">CTMark/sqlite3/sqlite3</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(200,255,200)">-1.55%</td>
<td style="padding:5px 5px 5px 8px">6.8155</td>
<td style="padding:5px 5px 5px 8px">6.7102</td>
<td style="padding:5px 5px 5px 8px">0.0081</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.1=2" target="_blank">CTMark/mafft/pairlocalalign</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(209,255,209)">-1.05%</td>
<td style="padding:5px 5px 5px 8px">8.0407</td>
<td style="padding:5px 5px 5px 8px">7.9559</td>
<td style="padding:5px 5px 5px 8px">0.0193</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2" target="_blank">CTMark/ClamAV/clamscan</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(210,255,210)">-1.02%</td>
<td style="padding:5px 5px 5px 8px">11.3893</td>
<td style="padding:5px 5px 5px 8px">11.2734</td>
<td style="padding:5px 5px 5px 8px">0.0081</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.10=2" target="_blank">CTMark/lencod/lencod</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(210,255,210)">-1.01%</td>
<td style="padding:5px 5px 5px 8px">12.8763</td>
<td style="padding:5px 5px 5px 8px">12.7461</td>
<td style="padding:5px 5px 5px 8px">0.0244</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.5=2" target="_blank">CTMark/SPASS/SPASS</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(210,255,210)">-1.01%</td>
<td style="padding:5px 5px 5px 8px">12.5048</td>
<td style="padding:5px 5px 5px 8px">12.3791</td>
<td style="padding:5px 5px 5px 8px">0.0340</td>
</tr>
</tbody>
</table>
<div><br>
</div>
<table style="font-family:Helvetica,sans-serif;font-size:9pt;border-spacing:0px;border:1px solid black">
<thead><tr>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px;width:500px">Performance Improvements
- Compile Time</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Δ </th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Previous</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">Current</th>
<th style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px 5px 5px 8px">σ </th>
</tr>
</thead><tbody class="m_-5665328138797978423searchable">
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2" target="_blank">External/SPEC/CINT2006/403.<wbr>gcc/403.gcc</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(199,255,199)">-1.64%</td>
<td style="padding:5px 5px 5px 8px">54.0801</td>
<td style="padding:5px 5px 5px 8px">53.1930</td>
<td style="padding:5px 5px 5px 8px">-</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2" target="_blank">External/SPEC/CINT2006/400.<wbr>perlbench/400.perlbench</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(205,255,205)">-1.25%</td>
<td style="padding:5px 5px 5px 8px">19.1481</td>
<td style="padding:5px 5px 5px 8px">18.9091</td>
<td style="padding:5px 5px 5px 8px">-</td>
</tr>
<tr>
<td class="m_-5665328138797978423benchmark-name" style="padding:5px 5px 5px 8px"><a href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2" target="_blank">External/SPEC/CINT2006/445.<wbr>gobmk/445.gobmk</a></td>
<td style="padding:5px 5px 5px 8px;background-color:rgb(210,255,210)">-1.01%</td>
<td style="padding:5px 5px 5px 8px">15.2819</td>
<td style="padding:5px 5px 5px 8px">15.1274</td>
<td style="padding:5px 5px 5px 8px">-</td>
</tr>
</tbody>
</table>
<div><br>
</div>
<div><br>
</div>
<div>Do such changes make sense? The
patch doesn't change O3, but it does change Os
and could potentially change performance there
(though I didn't see any changes in my tests).</div>
</div>
<div><br>
</div>
<div>The patch is attached for
reference; if we decide to go for it, I'll upload
it to phab:</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
<span id="m_-5665328138797978423cid:2A77C0D9-EE12-4C99-99A0-7A0CF5DF758A@apple.com">&lt;0001-InstCombine-Move-some-<wbr>infrequent-patterns-under-if-<wbr>E.patch&gt;</span>
<div style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div dir="auto" style="word-wrap:break-word">
<div><br>
</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Michael</div>
<div><br>
</div>
<div>
<div>[1]: <a href="http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html" target="_blank">http://lists.llvm.org/<wbr>pipermail/llvm-dev/2016-<wbr>December/108279.html</a></div>
</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
______________________________<wbr>_________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a class="m_-5665328138797978423moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
</div>
</blockquote>
</div>
<br>
<br>
</blockquote>
<br>
</div></div><span class="HOEnZb"><font color="#888888"><pre class="m_-5665328138797978423moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</font></span></div>
<br></blockquote></div><br></div>