<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 08/17/2018 09:02 AM, Amara Emerson
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:996BD3D4-EA0C-49AF-BB84-B159580F8129@apple.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <br class="">
      <div>
        <blockquote type="cite" class="">
          <div class="">On Aug 15, 2018, at 10:57 PM, Hal Finkel via
            llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org"
              class="" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
            wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <meta http-equiv="Content-Type" content="text/html;
              charset=utf-8" class="">
            <div text="#000000" bgcolor="#FFFFFF" class=""> <br
                class="">
              <div class="moz-cite-prefix">On 08/15/2018 02:38 PM,
                Philip Reames via llvm-dev wrote:<br class="">
              </div>
              <blockquote type="cite"
                cite="mid:7b7d457d-dbf4-d64b-428d-7e47f84c32ef@philipreames.com"
                class="">
                <meta http-equiv="Content-Type" content="text/html;
                  charset=utf-8" class="">
                <p class="">I'm concerned that we're focusing on one
                  side of this.  Let me point out a few concerns
                  w/changing the canonical form here:</p>
                <ol class="">
                  <li class="">LICM does not know how to hoist or sink
                    regions.  It does know how to hoist and sink
                    selects.</li>
                </ol>
              </blockquote>
            </div>
          </div>
        </blockquote>
        I think we should teach LICM to do this eventually.<br class="">
      </div>
    </blockquote>
    Agreed.  I just don't currently see patches to do so.  Once LICM
    supports region hoisting, much of my concern disappears.<br>
    <blockquote type="cite"
      cite="mid:996BD3D4-EA0C-49AF-BB84-B159580F8129@apple.com">
      <div>
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <blockquote type="cite"
                cite="mid:7b7d457d-dbf4-d64b-428d-7e47f84c32ef@philipreames.com"
                class="">
                <ol class="" start="2">
                  <li class="">InstCombine has limited support for
                    triangles/diamonds, but fairly extensive support for
                    selects.</li>
                  <li class="">EarlyCSE and GVN do not know how to
                    eliminate fully redundant triangles/diamonds.  PRE
                    *may* get some of these cases, but that is by no
                    means guaranteed or likely to be robust.</li>
                </ol>
              </blockquote>
            </div>
          </div>
        </blockquote>
        Agreed, we’ll need a plan to deal with these issues.<br class="">
      </div>
    </blockquote>
    Again, I'd love to see these issues fixed.  Once that's done, the
    convert about phi vs select as canonical form is much less
    relevant.  <br>
    <blockquote type="cite"
      cite="mid:996BD3D4-EA0C-49AF-BB84-B159580F8129@apple.com">
      <div>
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <blockquote type="cite"
                cite="mid:7b7d457d-dbf4-d64b-428d-7e47f84c32ef@philipreames.com"
                class="">
                <ol class="" start="2">
                </ol>
                <p class="">My personal opinion is that selects are the
                  appropriate canonical form.  </p>
                <p class="">For the one of the specific cases mentioned,
                  teaching GVN to do FRE and PRE for loads from selects
                  of pointers just doesn't seem that painful.  I would
                  be really tempted to do that instead.  Similarly,
                  walking facts back from select uses in CVP seems
                  generally useful since we have use dependent facts in
                  a bunch of contexts, not just selects.  (Call
                  arguments for instance,  non-null from unconditional
                  deref, etc..)<br class="">
                </p>
                <p class="">To be clear, I am raising concerns, not
                  actively objecting to this.  If you want to move
                  forward and commit to work through all of the issues
                  identified I wont actively stand in the way.</p>
              </blockquote>
              <br class="">
              As I've expressed in the past, I think that not using
              select as part of our canonical form is potentially a
              superior design. However, it would be a major change. In
              addition to the issues that Philip mentions, there's also
              the fact that we'll just have more basic blocks and that,
              in itself, could lead to an increase in compile time.
              However, working through these issues will likely leave us
              with a more-robust optimizer.<br class="">
              <br class="">
              See also: <a
                href="https://bugs.llvm.org/show_bug.cgi?id=34603#c19"
                class="" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=34603#c19</a><br
                class="">
            </div>
          </div>
        </blockquote>
        Canonicalizing to phis is my preference too.<br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class=""> <br
                class="">
               -Hal<br class="">
              <br class="">
              <blockquote type="cite"
                cite="mid:7b7d457d-dbf4-d64b-428d-7e47f84c32ef@philipreames.com"
                class="">
                <p class="">Philip<br class="">
                </p>
                <br class="">
                <div class="moz-cite-prefix">On 08/14/2018 12:39 PM,
                  Sanjay Patel via llvm-dev wrote:<br class="">
                </div>
                <blockquote type="cite"
cite="mid:CA+wODitU1d+Fme24EhhwNkX6tcQzpiRnrKCpeej11ELgt5heig@mail.gmail.com"
                  class="">
                  <div dir="ltr" class="">
                    <div class="">I didn't look closely at the new
                      patch, but this appears to be a small extension
                      to:</div>
                    <div class=""><a
                        href="https://reviews.llvm.org/D38566"
                        moz-do-not-send="true" class="">https://reviews.llvm.org/D38566</a></div>
                    <div class="">...and the GVN-based reasons for
                      delaying transformation to 'select' are discussed
                      in detail in the motivating bug for that patch:</div>
                    <div class=""><a
                        href="https://bugs.llvm.org/show_bug.cgi?id=34603"
                        moz-do-not-send="true" class="">https://bugs.llvm.org/show_bug.cgi?id=34603</a></div>
                    <div class=""><br class="">
                    </div>
                    <div class="">So this sounds like the right
                      direction to me. Note that there was objection to
                      the implementation (a pile of pass options vs.
                      uniquely named passes).<br class="">
                    </div>
                    <div class=""><br class="">
                    </div>
                    <div class="">Here's another motivating bug where
                      early transform to select prevents optimization:</div>
                    <div class=""><a
                        href="https://bugs.llvm.org/show_bug.cgi?id=36760"
                        moz-do-not-send="true" class="">https://bugs.llvm.org/show_bug.cgi?id=36760</a><br
                        class="">
                    </div>
                    <div class=""><br class="">
                    </div>
                    <div class="">Is that case affected by this patch?<br
                        class="">
                    </div>
                    <div class=""><br class="">
                    </div>
                  </div>
                  <div class="gmail_extra"><br class="">
                    <div class="gmail_quote">On Tue, Aug 14, 2018 at
                      11:17 AM, John Brawn via llvm-dev <span dir="ltr"
                        class=""><<a
                          href="mailto:llvm-dev@lists.llvm.org"
                          target="_blank" moz-do-not-send="true"
                          class="">llvm-dev@lists.llvm.org</a>></span>
                      wrote:<br class="">
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">Summary<br class="">
                        =======<br class="">
                        <br class="">
                        I'm planning on adjusting SimplifyCFG so that it
                        doesn't turn two-entry phi<br class="">
                        nodes into selects until later in the pass
                        pipeline, to give passes which can<br class="">
                        understand phis but not selects more opportunity
                        to optimize. The thing I'm<br class="">
                        trying to do which made me think of doing this
                        is described below, but from the<br class="">
                        benchmarking I've done it looks like this is
                        overall a good idea regardless of<br class="">
                        if I manage to get that done or not.<br class="">
                        <br class="">
                        Motivation<br class="">
                        ==========<br class="">
                        <br class="">
                        My goal is to get clang to optimize some code
                        containing a call to<br class="">
                        std::min_element which is dereferenced, so
                        something like:<br class="">
                        <br class="">
                          float min_element_example(float *data, int
                        size)<br class="">
                          {<br class="">
                            return *std::min_element(data, data+size);<br
                          class="">
                          }<br class="">
                        <br class="">
                        which, after inlining a specialization, looks
                        like:<br class="">
                        <br class="">
                          float min_element_example_inlined(<wbr
                          class="">float *first, float *last)<br
                          class="">
                          {<br class="">
                            for (float *p = first; p != last; ++p)<br
                          class="">
                            {<br class="">
                              if (*p < *first)<br class="">
                                first = p;<br class="">
                            }<br class="">
                            return *first;<br class="">
                          }<br class="">
                        <br class="">
                        There are two loads in the loop, *p and *first,
                        but actually the load *p can be<br class="">
                        eliminated by using either the previous load *p
                        or the previous *first,<br class="">
                        depending on if the if-condition was taken or
                        not. However the<br class="">
                        "if (*p < *first) first = p" gets turned by
                        simplifycfg into a select and this<br class="">
                        makes optimizing this a lot harder because you
                        no longer have distinct paths<br class="">
                        through the CFG.<br class="">
                        <br class="">
                        I have some ideas on how to do the optimization
                        (see my previous RFC "Making GVN<br class="">
                        able to visit the same block more than once"
                        posted in April, though I've<br class="">
                        decided that the specific idea presented there
                        isn't the right way to do it),<br class="">
                        but I think the first step is to make sure we
                        don't have a select when we try<br class="">
                        to optimise this.<br class="">
                        <br class="">
                        Approach<br class="">
                        ========<br class="">
                        <br class="">
                        I've posted a patch to <a
                          href="https://reviews.llvm.org/D50723"
                          rel="noreferrer" target="_blank"
                          moz-do-not-send="true" class="">https://reviews.llvm.org/<wbr
                            class="">D50723</a> showing what I'm<br
                          class="">
                        intending to do. An extra parameter is added to
                        SimplifyCFG to control whether<br class="">
                        two-entry phi nodes are converted into select,
                        and this is set to false in all<br class="">
                        instances before the end of module
                        simplification. At the end of module<br class="">
                        simplification we do SimplifyCFG, then
                        Instcombine to optimise the selects that<br
                          class="">
                        are introduced, then EarlyCSE to eliminate
                        common subexpressions introduced by<br class="">
                        instcombine.<br class="">
                      </blockquote>
                    </div>
                  </div>
                </blockquote>
              </blockquote>
            </div>
          </div>
        </blockquote>
        Is scheduling another simplifycfg, instcombine + earlycse pass
        really necessary? I’m concerned about the compile time impact.</div>
      <div><br class="">
      </div>
      <div>Cheers,</div>
      <div>Amara<br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <blockquote type="cite"
                cite="mid:7b7d457d-dbf4-d64b-428d-7e47f84c32ef@philipreames.com"
                class="">
                <blockquote type="cite"
cite="mid:CA+wODitU1d+Fme24EhhwNkX6tcQzpiRnrKCpeej11ELgt5heig@mail.gmail.com"
                  class="">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex"> <br class="">
                        Benchmark Results<br class="">
                        =================<br class="">
                        <br class="">
                        These are performance differences reported by
                        LNT when running llvm-test-suite,<br class="">
                        spec2000, and spec2006 at -O3 with and without
                        the patch linked above (using<br class="">
                        trunk llvm from a week or so ago).<br class="">
                        <br class="">
                        AArch64 results on ARM Cortex-A72:<br class="">
                        <br class="">
                        Performance Regressions - execution_time       
                                              Change<br class="">
                        SingleSource/Benchmarks/<wbr class="">Shootout/Shootout-ary3 
                                               9.48%<br class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">Packing-flt/Packing-flt 
                                          3.79%<br class="">
                        SingleSource/Benchmarks/<wbr class="">CoyoteBench/huffbench 
                                                1.40%<br class="">
                        <br class="">
                        Performance Improvements - execution_time       
                                             Change<br class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">Searching-dbl/Searching-dbl 
                                    -23.74%<br class="">
                        External/SPEC/CINT2000/256.<wbr class="">bzip2/256.bzip2 
                                                  -9.82%<br class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">Searching-flt/Searching-flt 
                                     -9.57%<br class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">Equivalencing-flt/<wbr
                          class="">Equivalencing-flt       -4.38%<br
                          class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">LinearDependence-flt/<wbr
                          class="">LinearDependence-flt -3.94%<br
                          class="">
                        MultiSource/Benchmarks/TSVC/<wbr class="">Packing-dbl/Packing-dbl 
                                         -3.44%<br class="">
                        External/SPEC/CFP2006/453.<wbr class="">povray/453.povray 
                                                 -2.50%<br class="">
                        SingleSource/Benchmarks/Adobe-<wbr class="">C++/stepanov_vector 
                                           -1.49%<br class="">
                        <br class="">
                        X86_64 results on Intel Xeon E5-2690:<br
                          class="">
                        <br class="">
                        Performance Regressions - execution_time       
                           Change<br class="">
                        MultiSource/Benchmarks/<wbr class="">Ptrdist/yacr2/yacr2 
                                5.62%<br class="">
                        <br class="">
                        Performance Improvements - execution_time       
                          Change<br class="">
                        SingleSource/Benchmarks/Misc-<wbr class="">C++/Large/sphereflake
                        -4.43%<br class="">
                        External/SPEC/CINT2006/456.<wbr class="">hmmer/456.hmmer 
                               -2.50%<br class="">
                        External/SPEC/CINT2006/464.<wbr class="">h264ref/464.h264ref 
                           -1.60%<br class="">
                        MultiSource/Benchmarks/nbench/<wbr class="">nbench 
                                     -1.19%<br class="">
                        SingleSource/Benchmarks/Adobe-<wbr class="">C++/functionobjects 
                        -1.07%<br class="">
                        <br class="">
                        I had a brief look at the regressions and they
                        all look to be caused by<br class="">
                        getting bad luck with branch mispredictions: I
                        looked into the Shootout-ary3 and<br class="">
                        yacr2 cases and in both the hot code path was
                        the same with and without the<br class="">
                        patch, but with more mispredictions probably
                        caused by changes elsewhere.<br class="">
                        <br class="">
                        John<br class="">
                        <br class="">
                        ______________________________<wbr class="">_________________<br
                          class="">
                        LLVM Developers mailing list<br class="">
                        <a href="mailto:llvm-dev@lists.llvm.org"
                          moz-do-not-send="true" class="">llvm-dev@lists.llvm.org</a><br
                          class="">
                        <a
                          href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                          rel="noreferrer" target="_blank"
                          moz-do-not-send="true" class="">http://lists.llvm.org/cgi-bin/<wbr
                            class="">mailman/listinfo/llvm-dev</a><br
                          class="">
                      </blockquote>
                    </div>
                    <br class="">
                  </div>
                  <br class="">
                  <fieldset class="mimeAttachmentHeader"></fieldset>
                  <br class="">
                  <pre class="" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
                </blockquote>
                <br class="">
                <!--'"--><br class="">
                <fieldset class="mimeAttachmentHeader"></fieldset>
                <br class="">
                <pre class="" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
              </blockquote>
              <br class="">
              <pre class="moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
            </div>
            _______________________________________________<br class="">
            LLVM Developers mailing list<br class="">
            <a href="mailto:llvm-dev@lists.llvm.org" class=""
              moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br
              class="">
            <a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br
              class="">
          </div>
        </blockquote>
      </div>
      <br class="">
    </blockquote>
    <br>
  </body>
</html>