<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 09/14/2017 10:43 AM, Daniel Neilson

      wrote:<br>

    </div>

    <blockquote cite="mid:4312CF89-E38D-45E6-9BE6-5AB153884A8F@azul.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <div class="">Thank you for your thoughts, Hal. More information

        below...</div>

      <div class="">

        <div style="word-wrap: break-word; -webkit-nbsp-mode: space;

          -webkit-line-break: after-white-space;" class="">

          <div style="color: rgb(0, 0, 0); font-family: Helvetica;

            font-size: 12px; font-style: normal; font-variant-caps:

            normal; font-weight: normal; letter-spacing: normal;

            text-align: start; text-indent: 0px; text-transform: none;

            white-space: normal; word-spacing: 0px;

            -webkit-text-stroke-width: 0px;" class="">

            <br class="">

          </div>

        </div>

      </div>

      <div>

        <blockquote type="cite" class="">

          <div class="">On Sep 13, 2017, at 5:43 PM, Hal Finkel <<a

              moz-do-not-send="true" href="mailto:hfinkel@anl.gov"

              class="">hfinkel@anl.gov</a>> wrote:</div>

          <div class="">

            <div bgcolor="#FFFFFF" text="#000000" class="">

              <div class="moz-cite-prefix">On 09/13/2017 01:01 PM,

                Daniel Neilson via llvm-dev wrote:<br class="">

              </div>

            </div>

          </div>

        </blockquote>

        <div><br class="">

        </div>

        <div>… snip</div>

        <div><br class="">

        </div>

        <blockquote type="cite" class="">

          <div bgcolor="#FFFFFF" text="#000000" class="">

            <blockquote

              cite="mid:0E250E9D-303A-4C3C-93E0-0473B97DD17F@azul.com"

              type="cite" class="">

              <div class=""> For example, the following IR will produce

                different sets of IV users if either:</div>

              <div class="">i) The order of the PHI nodes in the %loop

                block are reordered; or</div>

              <div class="">ii) The uselistorder directive is

                uncommented</div>

              <div class=""><br class="">

              </div>

              <div class="">---</div>

              <div class="">target datalayout =

                "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"<br class="">

                target triple = "x86_64-unknown-linux-gnu"<br class="">

                <br class="">

                define void @test(i64 %v1, i32 %v2, i64* %addr) {<br

                  class="">

                entry:<br class="">

                 br label %loop<br class="">

                 <br class="">

                loop:<br class="">

                 %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]<br

                  class="">

                 %iv2 = phi i32 [%v2, %entry], [%5, %loop]<br class="">

                 %0 = trunc i64 %iv to i32<br class="">

                 %1 = sub i32 %iv2, %0<br class="">

                 %2 = sitofp i32 %1 to double<br class="">

                 %3 = sub i64 0, %iv<br class="">

                 %4 = trunc i64 %3 to i32<br class="">

                 %5 = sub i32 %1, %4<br class="">

                 %iv.inc = add i64 %iv, 1<br class="">

                 store i64 %iv.inc, i64* %addr, align 8<br class="">

                 br i1 undef, label %loop, label %exit<br class="">

                 <br class="">

                exit:<br class="">

                 ret void<br class="">

                <br class="">

                ; uselistorder directives ----<br class="">

                ; uselistorder i64 %iv, {2, 1, 0}<br class="">

                }<br class="">

                —</div>

            </blockquote>

          </div>

        </blockquote>

        <div class="">

          <div bgcolor="#FFFFFF" text="#000000" class=""><br class="">

          </div>

        </div>

        <div bgcolor="#FFFFFF" text="#000000" class="">… snip</div>

        <div bgcolor="#FFFFFF" text="#000000" class=""><br class="">

        </div>

        <blockquote type="cite" class="">

          <div class="">

            <div bgcolor="#FFFFFF" text="#000000" class="">

              <blockquote

                cite="mid:0E250E9D-303A-4C3C-93E0-0473B97DD17F@azul.com"

                type="cite" class="">

                <div class=""> So, to get back to the original

                  questions:</div>

                <div class="">1) What exactly is IVUsers supposed to be

                  finding? For instance, in the example above, what

                  would be the ideal/correct set of IVUsers that the

                  analysis should be finding?</div>

              </blockquote>

              <br class="">

              I don't have a good answer to this question, other than

              the obvious one (that it's supposed to find all values

              that have interesting SCEV expressions making use of the

              induction variable). Interesting here means that it's an

              affine addrec or has a starting value that is one

              (recursively). It also helps support post-inc

              transformations.<br class="">

              <br class="">

              The problem here is that we're trying to avoid calling

              getSCEV on all instructions just to see if they end up

              being an addrec of the given loop. Maybe we should do this

              in two steps? First, walk the users to find the

              instruction on which to call getSCEV. Then, go through the

              instructions in BB order, calling getSCEV on those

              identified instructions.<br class="">

              <br class="">

            </div>

          </div>

        </blockquote>

        <div><br class="">

        </div>

        <div>I’m not sure, either. What it looks like, just from reading

          and interpreting the implementation, is that it’s looking for

          the SCEVs of instructions that terminate a def-use chain that

          starts at a loop-header phi. There are are a few criteria for

          what constitutes “terminating” a def-use chain, but the most

          fundamental one (I think) appears to be that a user of the

          SCEV isn’t itself an “interesting” SCEV. </div>

        <br class="">

        <blockquote type="cite" class="">

          <div class="">

            <div bgcolor="#FFFFFF" text="#000000" class="">

              <blockquote

                cite="mid:0E250E9D-303A-4C3C-93E0-0473B97DD17F@azul.com"

                type="cite" class="">

                <div class=""><br class="">

                </div>

                <div class="">2) Is it acceptable that there is this

                  sort of difference in the IVUsers analysis results

                  based on nothing more than instruction or use list

                  ordering? I personally hope not; it has been mildly

                  infuriating having to narrow down on a bug with this

                  difference in place.</div>

              </blockquote>

              <br class="">

              Technically speaking, the dependency on the order of the

              phis is okay (i.e., it's possible they'll be no good way

              to avoid that). Not having it is clearly better. The

              dependency on the use-list ordering is highly discouraged.

              As you've noticed, this makes problems very hard to track

              down.<br class="">

              <br class="">

              Part of the problem here may be that, because what SCEV

              proves, and thus returns, is dependent on what SCEV's have

              been previously constructed (unfortunately), it's not hard

              to develop these kinds of processing-order dependencies

              with analyses that use SCEV.<br class="">

              <br class="">

               -Hal<br class="">

              <br class="">

            </div>

          </div>

        </blockquote>

        <div><br class="">

        </div>

        <div> Thankfully, it looks like the SCEVs that are produced for

          each instruction in the DFT that I’m looking at are

          consistent; there doesn’t seem to be any affect on the SCEVs

          themselves as a result of the traversal orders. </div>

        <div><br class="">

        </div>

        <div> It looks to me that there are two pieces of the

          implementation of IVUsers that are leading to the fragility

          (i.e. dependence on input ordering) that I am seeing.</div>

        <div>A) The inclusion of Processed.count(User) as a condition of

          the two if statements at approx lines 235 & 241

          in IVUsers::AddUsersImpl() of lib/Analysis/IVUsers.cpp.</div>

        <div>B) It doesn’t terminate a DFT if it encounters a phi node

          in the loop-header that it hasn’t seen before.</div>

        <div><br class="">

        </div>

        <div> To help illustrate these, here are the relevant DFT traces

          from the sample IR that I provided in the original post. (1)

          is the IR as-is, and (2) is with the uselistorder instruction

          active.</div>

        <div><br class="">

        </div>

        <div>DFT (1) </div>

        <div>%iv = phi [%v1], [%iv.inc]  (ret true)</div>

        <div>SCEV: {%v1,+,1}<%loop><br class="">

             * User: %iv.inc = add %iv, 1 (ret true)<br class="">

                     SCEV: {(1 + %v1),+,1}<%loop><br class="">

                * User: store %iv.inc, %addr  (ret false) **** ADD as

          user of %iv.inc def ****<br class="">

                * User: %iv = phi [%v1], [%iv.inc] (PHI already

          processed)<br class="">

             *  User: %3 = sub 0, %iv (ret true)<br class="">

                      SCEV: {(-1 * %v1),+,-1}<%loop><br class="">

                * User: %4 = trunc %3 (ret true)<br class="">

                        SCEV: {(trunc i64 (-1 * %v1) to

          i32),+,-1}<%loop><br class="">

                   * User: %5 = sub %1, %4 (ret true)<br class="">

                           SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32))

          + (-1 * (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1

          * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop><br

            class="">

                      * User: %iv2 = phi [%v2], [%5] (ret true)<br

            class="">

                              SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1)

          to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop><br

            class="">

                         * User: %1 = sub %iv2, %0 (ret true)<br

            class="">

                                 SCEV: {((-1 * (trunc i64 %v1 to i32)) +

          %v2),+,(-1 + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *

          (trunc i64 %v1 to i32)))}<%loop><br class="">

                            * User: %5 = sub %1, %4 (already processed)

          **** ADD as user of %1 def ****<br class="">

                            * User: %2 = sitofp %1 (ret false) **** ADD

          as user of %1 def ****<br class="">

             * User: %0 = trunc %iv (ret true)<br class="">

                     SCEV: {(trunc i64 %v1 to i32),+,1}<%loop><br

            class="">

                * User: %1 = sub %iv2, %0 (already processed) **** ADD

          as user of %0 def ****<br class="">

          <br class="">

          IV Users for loop %loop:<br class="">

            %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64

          %iv.inc, i64* %addr, align 8<br class="">

            %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 *

          (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to

          i32)))}<%loop> in    %5 = sub i32 %1, %4<br class="">

            %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 *

          (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to

          i32)))}<%loop> in    %2 = sitofp i32 %1 to double<br

            class="">

            %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 =

          sub i32 %iv2, %0<br class="">

          <br class="">

          DFT (2)</div>

        <div>%iv = phi [%v1], [%iv.inc] (ret true)<br class="">

          SCEV: {%v1,+,1}<%loop><br class="">

             * User: %0 = trunc %iv (ret true)<br class="">

                     SCEV: {(trunc i64 %v1 to i32),+,1}<%loop><br

            class="">

                * User: %1 = sub %iv2, %0 (ret true)<br class="">

                        SCEV: {((-1 * (trunc i64 %v1 to i32)) +

          %v2),+,(-1 + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *

          (trunc i64 %v1 to i32)))}<%loop><br class="">

                   * User: %5 = sub %1, %4 (ret true)<br class="">

                           SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32))

          + (-1 * (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1

          * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop><br

            class="">

                      * User: %iv2 = phi [%v2], [%5] (ret true)<br

            class="">

                              SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1)

          to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop><br

            class="">

                         * User: %1 = sub %iv2, %0 (already processed)

          **** ADD as user of %iv2 def ****<br class="">

                   * User: %2 = sitofp %1 (ret false) **** ADD as user

          of %1 def ****<br class="">

             * User: %3 = sub 0, %iv (ret true)<br class="">

                     SCEV: {(-1 * %v1),+,-1}<%loop><br class="">

                * User: %4 = trunc %3 (ret true)<br class="">

                        SCEV: {(trunc i64 (-1 * %v1) to

          i32),+,-1}<%loop><br class="">

                   * User: %5 = sub %1, %4 (already processed) **** ADD

          as user of %4 def ****<br class="">

             * User: %iv.inc = add %iv, 1 (ret true)<br class="">

                     SCEV: {(1 + %v1),+,1}<%loop><br class="">

                * User: store %iv.inc, %addr (ret false) **** ADD as

          user of %iv.inc ****<br class="">

                * User: %iv = phi [%v1], [%iv.inc] (PHI already

          processed)<br class="">

          <br class="">

          IV Users for loop %loop:<br class="">

            %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *

          (trunc i64 %v1 to i32)))}<%loop> in    %1 = sub i32

          %iv2, %0<br class="">

            %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 *

          (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to

          i32)))}<%loop> in    %2 = sitofp i32 %1 to double<br

            class="">

            %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in  

           %5 = sub i32 %1, %4<br class="">

            %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64

          %iv.inc, i64* %addr, align 8</div>

        <blockquote type="cite" class="">

          <div class="">

          </div>

        </blockquote>

      </div>

      <div class=""><br class="">

      </div>

      <div class=""><br class="">

      </div>

      <div class=""> For point (A), first take a close look at what the

        DFT does when it encounters the def of %1 as a User in DFT (1).

        The first time that %1’s def is encountered is as a user of

        %iv2. We continue traversing through the def of %1 at this point

        to process the users of it because we haven’t encountered the

        def of %1 yet (i.e. Processed.count(%1) is 0). However, we

        encounter the def of %1 again in this same DFT later as a user

        of the def of %0. This second time that we encounter %1 we don’t

        traverse its users because we’ve seen %1 before (i.e.

        Processed.count(%1) is 1), and so we add the SCEV of %0 to the

        IVUsers set. In DFT (2) we encounter the def of %1 first as a

        user of %0 — which allows the DFT to process the users of %1

        instead of adding the SCEV of %0 to the IVUsers set.</div>

      <div class=""><br class="">

      </div>

      <div class=""> For point (B), take a look at the traversal order

        in DFT (1). You’ll see the chain %iv -> %3 -> %4 -> %5

        -> %iv2 -> %1 -> %5. The def of %5 appears twice in the

        same def-use chain, and the second time we see it will be the

        same situation as in point (A). The reason that we revisit %5 in

        the same def-use chain is that we allow the DFT to continue

        through %iv2 (%iv2 is a phi node in the loop-header) that will,

        in effect, allow the DFT to loop-around and revisit. The result

        is adding the def of %5 as a user of %1’s SCEV in DFT (1). We

        don’t do the same in DFT (2) because we don’t visit the def of

        %5 in the same way. Instead, in DFT (2) the def of %1 ends up

        being the value that appears twice in a def-use chain.</div>

      <div class=""><br class="">

      </div>

      <div class=""> What I *think* would be proper for IVUsers is to

        have IVUsers::AddUserImpl() :</div>

      <div class=""><br class="">

      </div>

      <div class="">A) Memoize its result for each given instruction

        instead of having the Processed.count(User) condition at lines

        235 & 241. The presence of Processed.count(User) in the

        conditions at lines 235 & 241 basically has the effect of

        changing the return value of AddUsersImpl() for “interesting”

        instructions from true to false, which results in different

        behaviour depending on the order in which instructions are

        visited. Interestingly, if we don’t memoize the result of

        AddUsersImpl() at all, but instead just remove the

        Processed.count(User) parts of these two if statements, then the

        return value of AddUsersImpl() will *always* be true and we

        would have consistent results for recursive calls on

        “interesting” SCEVS, but inconsistent results for recursive

        calls on instructions that aren’t interesting.</div>

      <div class=""><br class="">

      </div>

      <div class="">B) Similar to the condition at line 212, don’t let

        the DFT continue into Users that are phi nodes in the

        loop-header. We’re going to visit every phi node in the

        loop-header as the root of a DFT, in turn, anyways, so this just

        prevents the possibility of revisiting the same instruction

        multiple times in the same def-use chain.</div>

    </blockquote>

    <br>

    These seems sensible.<br>

    <br>

    <blockquote cite="mid:4312CF89-E38D-45E6-9BE6-5AB153884A8F@azul.com"

      type="cite">

      <div class=""><br class="">

      </div>

      <div class=""> If I mock these two changes up, the IVUsers sets in

        all three of my original situations (IR as-is, IR with

        header-phi’s rearranged, and IR with use-list ordering) all

        produce the exact same set of IVUsers. However, the set of

        IVUsers ends up being a subset of what we previously had. I

        don’t know if this is correct because the IVUsers analysis

        doesn’t appear to be well-defined with respect to what it should

        be generating. Specifically, with this test we end up with

        slight variations of this DFT:</div>

      <div class=""><br class="">

      </div>

      <div class="">%iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]<br

          class="">

        SCEV: {%v1,+,1}<%loop><br class="">

           * User: %iv.inc = add i64 %iv, 1 (ret true)<br class="">

                   SCEV: {(1 + %v1),+,1}<%loop><br class="">

              * User: store i64 %iv.inc, i64* %addr, align 8. (ret

        false) **** ADD as user of %iv.inc def ****<br class="">

              * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]

        (already processed PHI, skip)<br class="">

           * User: %3 = sub i64 0, %iv (ret true)<br class="">

                   SCEV: {(-1 * %v1),+,-1}<%loop><br class="">

              * User: %4 = trunc i64 %3 to i32 (ret true)<br class="">

                      SCEV: {(trunc i64 (-1 * %v1) to

        i32),+,-1}<%loop><br class="">

                 * User: %5 = sub i32 %1, %4 (ret true)<br class="">

                         SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) +

        (-1 * (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 *

        %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop><br

          class="">

                    * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop

        ] (PHI in loop header, skip)<br class="">

           * User: %0 = trunc i64 %iv to i32 (ret true)<br class="">

                   SCEV: {(trunc i64 %v1 to i32),+,1}<%loop><br

          class="">

              * User: %1 = sub i32 %iv2, %0<br class="">

                      SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1

        + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to

        i32)))}<%loop><br class="">

                 * User: %5 = sub i32 %1, %4 (memoized - ret true)<br

          class="">

                 * User: %2 = sitofp i32 %1 to double (ret false) ****

        ADD as user of %1 def ****<br class="">

        %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ]<br class="">

        SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *

        (trunc i64 %v1 to i32)))}<%loop><br class="">

           * User: %1 = sub i32 %iv2, %0 (memoized - ret true)<br

          class="">

        <br class="">

        IVUsers:<br class="">

          %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64

        %iv.inc, i64* %addr, align 8<br class="">

          %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 *

        (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to

        i32)))}<%loop> in    %2 = sitofp i32 %1 to double<br

          class="">

        <br class="">

      </div>

      <div class=""><br class="">

      </div>

      <div class=""> This looks sensible to me, but, again, I don’t

        really have a good sense of what the “proper” IVUsers results

        are expected to be.</div>

    </blockquote>

    <br>

    Compared to the previous results, this is missing:<br>

    <br>

      %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32

    %iv2, %0<br>

      %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in    %5 =

    sub i32 %1, %4<br>

      %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc

    i64 %v1 to i32)))}<%loop> in    %1 = sub i32 %iv2, %0<br>

    <br>

    Can you comment on why?<br>

    <br>

    Thanks again,<br>

    Hal<br>

    <br>

    <blockquote cite="mid:4312CF89-E38D-45E6-9BE6-5AB153884A8F@azul.com"

      type="cite">

      <div class=""><br class="">

      </div>

      <div class=""><br class="">

      </div>

      <div class=""> Thoughts anyone?</div>

      <div class=""><br class="">

      </div>

      <div class="">Thanks,</div>

      <div class=""> Daniel</div>

      <br class="">

      <div class="">

        <div>---</div>

        <div>Daniel Neilson, Ph.D.</div>

        <div>Azul Systems</div>

      </div>

      <div class=""><br class="">

      </div>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </body>

</html>