<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 03/17/2017 08:12 PM, David Majnemer
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAL7bZ_dvK5OigbEa1pPmNc3hz3uHP3c1rObofZe7i01bO6tLfw@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr">Honestly, I'm not a huge fan of this change as-is.
        The set of transforms that were added behind ExpensiveChecks
        seems awfully strange and many would not lead the reader to
        believe that they are expensive at all
        (the SimplifyDemandedInstructionBits and foldICmpUsingKnownBits
        calls being the obvious expensive routines).
        <div><br>
        </div>
        <div>The purpose of many of InstCombine's xforms is to
          canonicalize the IR to make life easier for downstream passes
          and analyses.</div>
        <div><br>
        </div>
        <div>InstCombine internally relies on knowing what transforms it
          may or may not perform. This is important: canonicalizations
          may duel endlessly if we get this wrong; the order of the
          combines is also important for exactly the same reason
          (SelectionDAG deals with this problem in a different way with
          its pattern complexity field).</div>
        <div><br>
        </div>
        <div>Another concern with moving seemingly arbitrary combines
          under ExpensiveCombines is that it will make it that much
          harder to understand what is and is not canonical at a given
          point during the execution of the optimizer.</div>
      </div>
    </blockquote>
    <br>
    I agree with this up to a point. If we have these kinds of
    canonicalization dependencies that depend on ValueTracking's depth,
    this seems very fragile. Even if we introduce caching, thus making
    the depth often infinite, if it will still be finite in the face of
    updates then we still need to be careful (plus, if we're worried
    about being able to understand the canonical form, then depending on
    known-bits analysis can make defining this form subtle).<br>
    <blockquote
cite="mid:CAL7bZ_dvK5OigbEa1pPmNc3hz3uHP3c1rObofZe7i01bO6tLfw@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>I'd be much more interested in a patch which caches the
          result of frequently called ValueTracking functionality like
          ComputeKnownBits, ComputeSignBit, etc. which often doesn't
          change but is not intelligently reused. I imagine that the
          performance win might be quite comparable. Such a patch would
          have the benefit of keeping the set of available transforms
          constant throughout the pipeline while bringing execution time
          down; I wouldn't be at all surprised if caching the
          ValueTracking functions resulted in a bigger time savings.</div>
      </div>
    </blockquote>
    <br>
    I'd started working on this a few months ago; I didn't finish the
    patch (mostly because I discovered that there's also a need to
    invalidate the cache whenever to perform a transformation that drops
    nsw/nuw flags and I've never got to that part), but I'm happy to
    provide my work-in-progress to anyone interested. cc'ing Davide who
    had also expressed an interest in this.<br>
    <br>
     -Hal<br>
    <br>
    <blockquote
cite="mid:CAL7bZ_dvK5OigbEa1pPmNc3hz3uHP3c1rObofZe7i01bO6tLfw@mail.gmail.com"
      type="cite">
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Fri, Mar 17, 2017 at 5:49 PM, Hal
          Finkel via llvm-dev <span dir="ltr"><<a
              moz-do-not-send="true"
              href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"><span class="">
                <p><br>
                </p>
                <div class="m_-5665328138797978423moz-cite-prefix">On
                  03/17/2017 04:30 PM, Mehdi Amini via llvm-dev wrote:<br>
                </div>
                <blockquote type="cite"> <br>
                  <div>
                    <blockquote type="cite">
                      <div>On Mar 17, 2017, at 11:50 AM, Mikhail
                        Zolotukhin via llvm-dev <<a
                          moz-do-not-send="true"
                          href="mailto:llvm-dev@lists.llvm.org"
                          target="_blank">llvm-dev@lists.llvm.org</a>>
                        wrote:</div>
                      <br
                        class="m_-5665328138797978423Apple-interchange-newline">
                      <div>
                        <div style="word-wrap:break-word">
                          <div dir="auto" style="word-wrap:break-word">
                            <div dir="auto" style="word-wrap:break-word">
                              <div dir="auto"
                                style="word-wrap:break-word">
                                <div>Hi,</div>
                                <div><br>
                                </div>
                                <div>One of the most time-consuming
                                  passes in LLVM middle-end is
                                  InstCombine (see e.g. [1]). It is a
                                  very powerful pass capable of doing
                                  all the crazy stuff, and new patterns
                                  are being constantly introduced there.
                                  The problem is that we often use it
                                  just as a clean-up pass: it's
                                  scheduled 6 times in the current pass
                                  pipeline, and each time it's invoked
                                  it checks all known patterns. It
                                  sounds ok for O3, where we try to
                                  squeeze as much performance as
                                  possible, but it is too excessive for
                                  other opt-levels. InstCombine has an
                                  ExpensiveCombines parameter to address
                                  that - but I think it's underused at
                                  the moment.</div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                    <div><br>
                    </div>
                    <div>Yes, the “ExpensiveCombines” has been added
                      recently (4.0? 3.9?) but I believe has always been
                      intended to be extended the way you’re doing it.
                      So I support this effort :)</div>
                  </div>
                </blockquote>
                <br>
              </span> +1<br>
              <br>
              Also, did your profiling reveal why the other combines are
              expensive? Among other things, I'm curious if the
              expensive ones tend to spend a lot of time in
              ValueTracking (getting known bits and similar)?<br>
              <br>
               -Hal
              <div>
                <div class="h5"><br>
                  <br>
                  <blockquote type="cite">
                    <div>
                      <div><br>
                      </div>
                      <div>CC: David for the general direction on
                        InstCombine though.</div>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                      <div>— </div>
                      <div>Mehdi</div>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                      <br>
                      <blockquote type="cite">
                        <div>
                          <div style="word-wrap:break-word">
                            <div dir="auto" style="word-wrap:break-word">
                              <div dir="auto"
                                style="word-wrap:break-word">
                                <div dir="auto"
                                  style="word-wrap:break-word">
                                  <div><br>
                                  </div>
                                  <div>Trying to find out, which
                                    patterns are important, and which
                                    are rare, I profiled clang using
                                    CTMark and got the following
                                    coverage report:</div>
                                </div>
                              </div>
                            </div>
                          </div>
                          <span
id="m_-5665328138797978423cid:CEA3012D-A9E2-4318-AA90-C372667C70A9@apple.com"><InstCombine_covreport.html></span>
                          <div style="word-wrap:break-word">
                            <div dir="auto" style="word-wrap:break-word">
                              <div dir="auto"
                                style="word-wrap:break-word">
                                <div dir="auto"
                                  style="word-wrap:break-word">
                                  <div>
                                    <div>(beware, the file is ~6MB).</div>
                                  </div>
                                  <div><br>
                                  </div>
                                  <div>Guided by this profile I moved
                                    some patterns under the "if
                                    (ExpensiveCombines)" check, which
                                    expectedly happened to be neutral
                                    for runtime performance, but
                                    improved compile-time. The testing
                                    results are below (measured for Os).</div>
                                  <div><br>
                                  </div>
                                  <div>
                                    <table
style="font-family:Helvetica,sans-serif;font-size:9pt;border-spacing:0px;border:1px
                                      solid black">
                                      <thead><tr>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px;width:500px">Performance
                                            Improvements - Compile Time</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Δ </th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Previous</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Current</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">σ </th>
                                        </tr>
                                      </thead><tbody
                                        class="m_-5665328138797978423searchable">
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2"
                                              target="_blank">CTMark/sqlite3/sqlite3</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(200,255,200)">-1.55%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">6.8155</td>
                                          <td style="padding:5px 5px 5px
                                            8px">6.7102</td>
                                          <td style="padding:5px 5px 5px
                                            8px">0.0081</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.1=2"
                                              target="_blank">CTMark/mafft/pairlocalalign</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(209,255,209)">-1.05%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">8.0407</td>
                                          <td style="padding:5px 5px 5px
                                            8px">7.9559</td>
                                          <td style="padding:5px 5px 5px
                                            8px">0.0193</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2"
                                              target="_blank">CTMark/ClamAV/clamscan</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(210,255,210)">-1.02%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">11.3893</td>
                                          <td style="padding:5px 5px 5px
                                            8px">11.2734</td>
                                          <td style="padding:5px 5px 5px
                                            8px">0.0081</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.10=2"
                                              target="_blank">CTMark/lencod/lencod</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(210,255,210)">-1.01%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">12.8763</td>
                                          <td style="padding:5px 5px 5px
                                            8px">12.7461</td>
                                          <td style="padding:5px 5px 5px
                                            8px">0.0244</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.5=2"
                                              target="_blank">CTMark/SPASS/SPASS</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(210,255,210)">-1.01%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">12.5048</td>
                                          <td style="padding:5px 5px 5px
                                            8px">12.3791</td>
                                          <td style="padding:5px 5px 5px
                                            8px">0.0340</td>
                                        </tr>
                                      </tbody>
                                    </table>
                                    <div><br>
                                    </div>
                                    <table
style="font-family:Helvetica,sans-serif;font-size:9pt;border-spacing:0px;border:1px
                                      solid black">
                                      <thead><tr>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px;width:500px">Performance
                                            Improvements - Compile Time</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Δ </th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Previous</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">Current</th>
                                          <th
style="background-color:rgb(238,238,238);color:rgb(102,102,102);text-align:center;font-family:Verdana;padding:5px
                                            5px 5px 8px">σ </th>
                                        </tr>
                                      </thead><tbody
                                        class="m_-5665328138797978423searchable">
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2"
                                              target="_blank">External/SPEC/CINT2006/403.<wbr>gcc/403.gcc</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(199,255,199)">-1.64%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">54.0801</td>
                                          <td style="padding:5px 5px 5px
                                            8px">53.1930</td>
                                          <td style="padding:5px 5px 5px
                                            8px">-</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2"
                                              target="_blank">External/SPEC/CINT2006/400.<wbr>perlbench/400.perlbench</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(205,255,205)">-1.25%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">19.1481</td>
                                          <td style="padding:5px 5px 5px
                                            8px">18.9091</td>
                                          <td style="padding:5px 5px 5px
                                            8px">-</td>
                                        </tr>
                                        <tr>
                                          <td
                                            class="m_-5665328138797978423benchmark-name"
                                            style="padding:5px 5px 5px
                                            8px"><a
                                              moz-do-not-send="true"
                                              href="http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2"
                                              target="_blank">External/SPEC/CINT2006/445.<wbr>gobmk/445.gobmk</a></td>
                                          <td style="padding:5px 5px 5px
8px;background-color:rgb(210,255,210)">-1.01%</td>
                                          <td style="padding:5px 5px 5px
                                            8px">15.2819</td>
                                          <td style="padding:5px 5px 5px
                                            8px">15.1274</td>
                                          <td style="padding:5px 5px 5px
                                            8px">-</td>
                                        </tr>
                                      </tbody>
                                    </table>
                                    <div><br>
                                    </div>
                                    <div><br>
                                    </div>
                                    <div>Do such changes make sense? The
                                      patch doesn't change O3, but it
                                      does change Os and potentially can
                                      change performance there (though I
                                      didn't see any changes in my
                                      tests).</div>
                                  </div>
                                  <div><br>
                                  </div>
                                  <div>The patch is attached for the
                                    reference, if we decide to go for
                                    it, I'll upload it to phab:</div>
                                  <div><br>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                          <span
id="m_-5665328138797978423cid:2A77C0D9-EE12-4C99-99A0-7A0CF5DF758A@apple.com"><0001-InstCombine-Move-some-<wbr>infrequent-patterns-under-if-<wbr>E.patch></span>
                          <div style="word-wrap:break-word">
                            <div dir="auto" style="word-wrap:break-word">
                              <div dir="auto"
                                style="word-wrap:break-word">
                                <div dir="auto"
                                  style="word-wrap:break-word">
                                  <div><br>
                                  </div>
                                  <div><br>
                                  </div>
                                  <div>Thanks,</div>
                                  <div>Michael</div>
                                  <div><br>
                                  </div>
                                  <div>
                                    <div>[1]: <a moz-do-not-send="true"
href="http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html"
                                        target="_blank">http://lists.llvm.org/<wbr>pipermail/llvm-dev/2016-<wbr>December/108279.html</a></div>
                                  </div>
                                  <div><br>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                          ______________________________<wbr>_________________<br>
                          LLVM Developers mailing list<br>
                          <a moz-do-not-send="true"
                            href="mailto:llvm-dev@lists.llvm.org"
                            target="_blank">llvm-dev@lists.llvm.org</a><br>
                          <a moz-do-not-send="true"
                            class="m_-5665328138797978423moz-txt-link-freetext"
href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                            target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
                        </div>
                      </blockquote>
                    </div>
                    <br>
                    <br>
                    <fieldset
                      class="m_-5665328138797978423mimeAttachmentHeader"></fieldset>
                    <br>
                    <pre>______________________________<wbr>_________________
LLVM Developers mailing list
<a moz-do-not-send="true" class="m_-5665328138797978423moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a moz-do-not-send="true" class="m_-5665328138797978423moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a>
</pre>
    </blockquote>
    

    </div></div><span class="HOEnZb"><font color="#888888"><pre class="m_-5665328138797978423moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
  </font></span></div>


______________________________<wbr>_________________

LLVM Developers mailing list

<a moz-do-not-send="true" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>

<a moz-do-not-send="true" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a>


</blockquote></div>
</div>



</blockquote>
<pre class="moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre></body></html>