<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 01/03/2017 10:01 AM, Daniel Berlin

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAF4BwTUmHL=1TW5XQ16yTWCSCU=f2WtWZ1yZwv=Y4Hro9qPLPQ@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <div dir="ltr"><br>

        <div class="gmail_extra"><br>

          <div class="gmail_quote">On Tue, Jan 3, 2017 at 5:43 AM, Hal

            Finkel via llvm-commits <span dir="ltr"><<a

                moz-do-not-send="true"

                href="mailto:llvm-commits@lists.llvm.org"

                target="_blank">llvm-commits@lists.llvm.org</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi

              Philip,<br>

              <br>

              First, I'm planning on rewriting this change to be more

              targeted, specifically to take the alternative approach we

              discussed briefly on the review thread and keep the list

              of affected variables in the assumption cache.</blockquote>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <br>

              Second, regarding motivation, the problem is not the

              basic-block scan (with the assumption cache, we actually

              do very little of that anyway). The core problem is that,

              in ValueTracking, which is used heavily by InstCombine and

              other passes, checks all assumptions in each function for

              every variable being processed to determine known bits,

              etc. These checks are pattern matches, and while they

              normally fail quickly, as the number of assumptions

              increases, this is a large problem: it makes known-bits

              determination O(#assumptions in the function), for every

              variable, in the common case, and, because the assumption

              processing is part of the recursive known-bits procedure,

              the constant factor can be large too given certain kinds

              of inputs. What we need for ValueTracking (and also for

              LVI and maybe others) is a way to cache which variables

              are (usefully) affected by which assumptions.<br>

              <br>

            </blockquote>

            <div><br>

            </div>

            <div>When you say affected, do you mean "what uses"?<br>

              Because your operand bundles change, does not really do

              that.</div>

            <div>For a given use I still have to find the assume (which

              is in the middle of a block), then go to the bundle, then

              look at the uses of the things in that bundle</div>

            <div>Or am i missing something?<br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    You're missing something. The "affected uses" are added by

    InstCombine by looking for patterns useful to ValueTracking, etc.

    So, for example, if we have:<br>

    <br>

    <pre wrap="">   %cmp1 = icmp eq i8 %a, 5

   call void @llvm.assume(i1 %cmp1)

</pre>

    Then InstCombine would come along and add the operand bundle like

    this:<br>

    <pre wrap="">

   call void @llvm.assume(i1 %cmp1) [ "affected"(i8 %a) ]</pre>

    So, if we're looking at %a, and want to know if there's some

    assumption that might be relevant to it, we just need to look at

    %a's users (instead of looking through all assumptions). <br>

    <br>

    It's certainly true that, as discussed below, once we find an assume

    in the users of %a, we need to verify the dominance relationship

    between the assume and %a, but this is still rare and has not yet

    presented itself as a problem as far as I can tell. The first

    problem is that checking whether some variable, %x, is potentially

    affected by an assume cannot be O(#assumes in the function). That's

    what ValueTracking, LVI, etc. do now and, for obvious reasons,

    causes problems as we make more-extensive use of assumes (or don't

    because of this performance problem).<br>

    <br>

    <blockquote

cite="mid:CAF4BwTUmHL=1TW5XQ16yTWCSCU=f2WtWZ1yZwv=Y4Hro9qPLPQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              Regarding your suggestions below, I can definitely see how

              we might run into a problem with the dominance checking if

              we had a large number of assumptions which affected a

              variable -- it is only when we have an assumption that

              potentially affects a variable that we do any of the

              dominance checking -- but I've not run into any such cases

              yet.<br>

            </blockquote>

            <div><br>

            </div>

            <div>Note: I have precisely the same problem with newgvn's

              predicate handling:</div>

            <div><br>

            </div>

            <div>I need to be able to tell, for a given instruction,

              what predicates dominate it (including assumes)</div>

            <div><br>

            </div>

            <div>It's pretty fast for most predicates, because they

              occur at the end of blocks, but could still be sped up.</div>

            <div>For assumes, not so much, because they can occur

              wherever.</div>

            <div><br>

            </div>

            <div>I have a prototype where i've implemented e-ssa to

              solve both problems. It requires the introduction of one

              intrinsic, and some use renaming.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    Where do we introduce the intrinsic? Do we maintain it throughout

    the pipeline?<br>

    <br>

    <blockquote

cite="mid:CAF4BwTUmHL=1TW5XQ16yTWCSCU=f2WtWZ1yZwv=Y4Hro9qPLPQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>IE</div>

            <div>if (foo == 50) {</div>

            <div>} else { }</div>

            <div><br>

            </div>

            <div>becomes</div>

            <div><br>

            </div>

            <div>if (foo == 50) {</div>

            <div>foo1 = predicateinfo(foo, the icmp of foo )</div>

            <div>(use foo1)</div>

            <div>} else {</div>

            <div>foo2 = predicateinfo(foo, the icmp of foo)</div>

            <div>(use foo2)</div>

            <div>}<br>

            </div>

            <br>

            <div>Once that's done, finding the set of predicates that

              usefully affect a given use is trivial, as you can just

              walk the def-use chains.</div>

            <div><br>

            </div>

            <div>I haven't implemented it for assume, but it should work

              just as well for that.</div>

            <div><br>

            </div>

            <div>I'm somewhat against caching solutions and hacky here,I

              generally prefer to solve infrastructury problems with

              better infrastructure.  Caching tends to just kick the can

              down the road, and often makes it harder to build better

              infrastructure.  It makes sense when the change is

              significantly harder to get right, and not worth the cost.

              Here, honestly, we can do better. </div>

            <div>We have plenty of cases where CVP, etc, will be

              significantly sped up by the above (i can hand them to

              you) but not by your change.</div>

            <div><br>

            </div>

            <div>Note also that it improves a large set of passes.</div>

            <div>For example, the variable split enables SCCP to do

              correlated value propagation (trivially for equality

              comparison, other things with a bit of work)</div>

            <div><br>

            </div>

            <div>Before, it could only do something if "foo" turned out

              the same everywhere.</div>

            <div>After, you can propagate  "foo == 50" to the foo1

              branch.</div>

            <div><br>

              Early measurements come out to a few percent more

              constants found total.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    How will we know when to add the predicateinfo intrinsic, or will we

    do it for all values used in the conditionally-executed region? This

    patch did the equivalent thing by pattern-matching the condition.<br>

    <br>

    Generally, I'm in favor of going in this direction.<br>

    <br>

    For assumes, we'd still need to pattern match, right? If so, this is

    still a cache in a sense ;) -- I suspect we'd still need the

    assumption cache too in order to efficiently compute ephemeral

    values (so that the inline cost analysis, etc. still have the right

    complexity bounds).<br>

    <br>

    Thanks again,<br>

    Hal<br>

     <br>

    <blockquote

cite="mid:CAF4BwTUmHL=1TW5XQ16yTWCSCU=f2WtWZ1yZwv=Y4Hro9qPLPQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div><br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </body>

</html>