<div dir="ltr"><div>Is the issue that nothing knows that n isn't 0? In which case the loop would run close to 2^64-1 iterations. Is the -2 just a failure to print 2^64 - 2 as a positive number?</div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Aug 16, 2018 at 4:54 PM Alexandre Isoard via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">The loop exits iff 

{2,+,1}<nuw><%for.body> == 

(zext i32 %n to i64)<div><br></div><div>The nuw marking on the "induction variable" should be sufficient to deduce a max loop trip count of 2^32.</div><div>But I do not know how we compute it (we build a database and it is contrived to follow, at least to me).</div><div><br></div><div>I saw that we annotate it with <nsw> (which is correct and can be (and probably has been) deduced from the ranges) and following our discussion, we can't annotate it with <nuw> as per limitation of our unification.</div><div><br></div><div>So, I am kind of in a bind here...</div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Aug 16, 2018 at 4:34 PM Friedman, Eli <<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <div class="m_1410296790029585731m_-645509329432923076moz-cite-prefix">In general, the backedge-taken count is
      an unsigned value; it's possible to write a loop with a trip count
      of 2^64 using a 64-bit induction variable.  To prove your loop has
      a "small" trip count, you have to use either the guard or the
      nsw/nuw markings on the induction variable.<br>
      <br>
      -Eli<br>
      <br>
      On 8/16/2018 4:09 PM, Alexandre Isoard wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">Ok.
        <div><br>
        </div>
        <div>To go back to the original issue, would it be meaningful to
          add a SCEVUMax(0, BTC) on the final BTC computed by SCEV?</div>
        <div><br>
        </div>
        <div>So that it does not use "negative values"?</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr">On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli
          <<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div text="#000000" bgcolor="#FFFFFF">
            <div class="m_1410296790029585731m_-645509329432923076m_6547919799618152574moz-cite-prefix">On
              8/15/2018 2:27 PM, Alexandre Isoard wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">I'm not sure I understand the
                poison/undef/UB distinctions.
                <div><br>
                </div>
                <div>But on this example:</div>
                <div><br>
                </div>
                <div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">define i32
                    @func(i1 zeroext %b, i32 %x, i32 %y) {<br>
                    entry:<br>
                      %adds = add nsw i32 %x, %y<br>
                      %addu = add nuw i32 %x, %y<br>
                      %cond = select i1 %b, i32 %adds, i32 %addu<br>
                      ret i32 %cond<br>
                    }</blockquote>
                </div>
                <div><br>
                </div>
                <div>It is important to not propagate the nsw/nuw
                  between the two SCEV expressions (which unification
                  would do today, can I consider that a bug or is it a
                  feature?).</div>
              </div>
            </blockquote>
            <br>
            It's an intentional design choice.<br>
            <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div>So we work-around it by not informing SCEV of the
                  flags:</div>
                <div><br>
                </div>
                <div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Printing analysis
                    'Scalar Evolution Analysis' for function 'func':<br>
                    Classifying expressions for: @func<br>
                      %adds = add nsw i32 %x, %y<br>
                      -->  (%x + %y) U: full-set S: full-set<br>
                      %addu = add nuw i32 %x, %y<br>
                      -->  (%x + %y) U: full-set S: full-set<br>
                      %cond = select i1 %b, i32 %adds, i32 %addu<br>
                      -->  %cond U: full-set S: full-set<br>
                    Determining loop execution counts for: @func</blockquote>
                </div>
                <div><br>
                </div>
                <div>Would there be problems if we properly considered
                  nuw/nsw flags when unifying SCEVs?</div>
              </div>
            </blockquote>
            <br>
            There would be other consequences.  For example, `(%x +
            %y)<nsw>` and `(%x + %y)<nuw>` wouldn't compare
            equal for other simplifications, and all the places that
            call setNoWrapFlags would have to be rewritten.  It's
            probably possible to come up with some workable design, but
            nobody has actually tried it, so it's not clear how much
            work it would be to implement, or whether it would improve
            the generated code overall.<br>
            <br>
            -Eli<br>
            <br>
            <blockquote type="cite"><br>
              <div class="gmail_quote">
                <div dir="ltr">On Wed, Aug 15, 2018 at 1:59 PM Friedman,
                  Eli <<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div text="#000000" bgcolor="#FFFFFF">
                    <div class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531moz-cite-prefix">On
                      8/15/2018 1:31 PM, Alexandre Isoard wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">Is that why we do not deduce
                        +<nsw> from "add nsw" either?</div>
                    </blockquote>
                    <br>
                    Essentially, yes.<br>
                    <br>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div>Is that an intrinsic limitation of creating
                          a context-invariant expressions from a Value*
                          or is that a limitation of our implementation
                          (our unification not considering the nsw
                          flags)?</div>
                      </div>
                    </blockquote>
                    <br>
                    It's a consequence of unification not considering
                    nsw.  (nsw on an instruction is naturally invariant
                    because violating nsw produces poison, not UB.)<br>
                    <br>
                    -Eli<br>
                    <br>
                    <blockquote type="cite"><br>
                      <div class="gmail_quote">
                        <div dir="ltr">On Wed, Aug 15, 2018 at 12:39 PM
                          Friedman, Eli <<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>>
                          wrote:<br>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531m_-6181515349790066455m_-6236410721431341120moz-cite-prefix">On
                              8/15/2018 12:21 PM, Alexandre Isoard via
                              llvm-dev wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="ltr">Hello,
                                <div><br>
                                </div>
                                <div>If I run clang on the following
                                  code:</div>
                                <div><br>
                                </div>
                                <div>
                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">void
                                    func(unsigned n) {<br>
                                      for (unsigned long x = 1; x <
                                    n; ++x)<br>
                                        dummy(x);<br>
                                    }</blockquote>
                                  <div><br>
                                  </div>
                                  <div>I get the following llvm ir:</div>
                                  <div><br>
                                  </div>
                                  <div>
                                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">define
                                      void @func(i32 %n) {<br>
                                      entry:<br>
                                        %conv = zext i32 %n to i64<br>
                                        %cmp5 = icmp ugt i32 %n, 1<br>
                                        br i1 %cmp5, label %for.body,
                                      label %for.cond.cleanup<br>
                                      for.cond.cleanup:                 
                                                     ; preds =
                                      %for.body, %entry<br>
                                        ret void<br>
                                      for.body:                         
                                                     ; preds = %entry,
                                      %for.body<br>
                                        %x.06 = phi i64 [ %inc,
                                      %for.body ], [ 1, %entry ]<br>
                                        tail call void @dummy(i64 %x.06)
                                      #2<br>
                                        %inc = add nuw nsw i64 %x.06, 1<br>
                                        %exitcond = icmp eq i64 %inc,
                                      %conv<br>
                                        br i1 %exitcond, label
                                      %for.cond.cleanup, label %for.body<br>
                                      }</blockquote>
                                  </div>
                                  <div><br>
                                  </div>
                                  <div>Over which, SCEV will provide the
                                    following analysis:</div>
                                  <div><br>
                                  </div>
                                  <div>
                                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Printing
                                      analysis 'Scalar Evolution
                                      Analysis' for function 'func':<br>
                                      Classifying expressions for: @func<br>
                                        %conv = zext i32 %n to i64<br>
                                        -->  (zext i32 %n to i64) U:
                                      [0,4294967296) S: [0,4294967296)<br>
                                        %x.06 = phi i64 [ %inc,
                                      %for.body ], [ 1, %entry ]<br>
                                        --> 
                                      {1,+,1}<nuw><nsw><%for.body>
                                      U: [1,-9223372036854775808) S:
                                      [1,-9223372036854775808)<span style="white-space:pre-wrap">               </span>Exits:
                                      (-1 + (zext i32 %n to i64))<span style="white-space:pre-wrap">            </span>LoopDispositions:
                                      { %for.body: Computable }<br>
                                        %inc = add nuw nsw i64 %x.06, 1<br>
                                        --> 
                                      {2,+,1}<nuw><%for.body>
                                      U: [2,0) S: [2,0)<span style="white-space:pre-wrap">              </span>Exits:
                                      (zext i32 %n to i64)<span style="white-space:pre-wrap">           </span>LoopDispositions:
                                      { %for.body: Computable }<br>
                                      Determining loop execution counts
                                      for: @func<br>
                                      Loop %for.body: backedge-taken
                                      count is (-2 + (zext i32 %n to
                                      i64))<nsw><br>
                                      Loop %for.body: max backedge-taken
                                      count is -2<br>
                                      Loop %for.body: Predicated
                                      backedge-taken count is (-2 +
                                      (zext i32 %n to i64))<nsw><br>
                                       Predicates:<br>
                                      Loop %for.body: Trip multiple is 1</blockquote>
                                  </div>
                                  <div><br>
                                  </div>
                                  <div>Now, I was surprised by the max
                                    backedge-taken count being -2, and I
                                    suspect it is due to the
                                    backedge-taken count being marked as
                                    <nsw> instead of <nuw>.</div>
                                  <div><br>
                                  </div>
                                  <div>Is that on purpose, is that a
                                    bug, or is my analysis incorrect? I
                                    am not sure where to fix that issue.</div>
                                </div>
                              </div>
                            </blockquote>
                            <br>
                            The backedge-taken count isn't nuw because
                            nsw/nuw markings aren't flow-sensitive:
                            there isn't any way to mark the trip count
                            as nuw without marking every computation of
                            `(long)n-2` as nuw.<br>
                            <br>
                            There's some code in
                            ScalarEvolution::howFarToZero to try to
                            refine the max backedge-taken count in some
                            cases, but it isn't very general.  See <a class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531m_-6181515349790066455m_-6236410721431341120moz-txt-link-freetext" href="https://reviews.llvm.org/D28536" target="_blank">https://reviews.llvm.org/D28536</a>
                            .<br>
                            <br>
                            -Eli<br>
                            <pre class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531m_-6181515349790066455m_-6236410721431341120moz-signature" cols="72">-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
                          </div>
                        </blockquote>
                      </div>
                      <br clear="all">
                      <div><br>
                      </div>
                      -- <br>
                      <div dir="ltr" class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531m_-6181515349790066455gmail_signature" data-smartmail="gmail_signature">
                        <div dir="ltr"><b>Alexandre Isoard</b><br>
                        </div>
                      </div>
                    </blockquote>
                    <p><br>
                    </p>
                    <pre class="m_1410296790029585731m_-645509329432923076m_6547919799618152574m_-5001341794010270531moz-signature" cols="72">-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
                  </div>
                </blockquote>
              </div>
              <br clear="all">
              <div><br>
              </div>
              -- <br>
              <div dir="ltr" class="m_1410296790029585731m_-645509329432923076m_6547919799618152574gmail_signature" data-smartmail="gmail_signature">
                <div dir="ltr"><b>Alexandre Isoard</b><br>
                </div>
              </div>
            </blockquote>
            <p><br>
            </p>
            <pre class="m_1410296790029585731m_-645509329432923076m_6547919799618152574moz-signature" cols="72">-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
          </div>
        </blockquote>
      </div>
      <br clear="all">
      <div><br>
      </div>
      -- <br>
      <div dir="ltr" class="m_1410296790029585731m_-645509329432923076gmail_signature" data-smartmail="gmail_signature">
        <div dir="ltr"><b>Alexandre Isoard</b><br>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <pre class="m_1410296790029585731m_-645509329432923076moz-signature" cols="72">-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
  </div>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="m_1410296790029585731gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><b>Alexandre Isoard</b><br></div></div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div>