<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Sep 3, 2014 at 7:33 PM, Andrew Trick <span dir="ltr"><<a href="mailto:atrick@apple.com" target="_blank">atrick@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I'll say LGTM because fixing a bug is an improvement. This is unfortunate though because LFTR does not care about overflow and we don't want to pessimize code just because SCEV was able to infer NSW. The problem is that we can't express that NSW applies to some uses an not others.<br>

<br>

The fix is also not totally robust because SCEV could, in theory, drop NSW flags even though they exist in IR. The induction variable was not necessarily produced by SCEVExpander. I think making it totally robust would involve getting a new expression for CmpIndVar by applying non-NSW AddExpr's (similar to genLoopLimit). That is of course much more complexity, hence risk, so maybe best as a FIXME.<br></blockquote><div><br></div><div>genLoopLimit assumes that the IndVar has a SCEV, when could the induction variable not be produced by SCEVExpander?</div><div> </div><div>I quickly hacked the following together, it seems to restore the old behavior in at least one case.  It tries to see if signed or unsigned overflow can occur by looking at the initial value and the number of backedges.</div><div><br></div><div><div>diff --git a/lib/Transforms/Scalar/IndVarSimplify.cpp b/lib/Transforms/Scalar/IndVarSimplify.cpp</div><div>index 27e5ec8..ccc906d 100644</div><div>--- a/lib/Transforms/Scalar/IndVarSimplify.cpp</div><div>+++ b/lib/Transforms/Scalar/IndVarSimplify.cpp</div><div>@@ -1655,8 +1655,31 @@ LinearFunctionTestReplace(Loop *L,</div><div>     // FIXME: In theory, SCEV could drop flags even though they exist in IR.</div><div>     // A more robust solution would involve getting a new expression for</div><div>     // CmpIndVar by applying non-NSW/NUW AddExprs.</div><div>+    auto WrappingFlags =</div><div>+        ScalarEvolution::setFlags(SCEV::FlagNUW, SCEV::FlagNSW);</div><div>+    const SCEV *IVInit = IncrementedIndvarSCEV->getStart();</div><div>+    if (SE->getTypeSizeInBits(IVInit->getType()) ></div><div>+        SE->getTypeSizeInBits(IVCount->getType()))</div><div>+      IVInit = SE->getTruncateExpr(IVInit, IVCount->getType());</div><div>+    if (const auto *IVInitCst = dyn_cast<SCEVConstant>(IVInit)) {</div><div>+      if (const auto *BackedgeTakenCountCst =</div><div>+              dyn_cast<SCEVConstant>(BackedgeTakenCount)) {</div><div>+        const APInt &IVInitVal = IVInitCst->getValue()->getValue();</div><div>+        APInt BackedgeTakenCountVal =</div><div>+            BackedgeTakenCountCst->getValue()->getValue();</div><div>+        bool SignedOverflow, UnsignedOverflow;</div><div>+        IVInitVal.uadd_ov(BackedgeTakenCountVal, UnsignedOverflow);</div><div>+        IVInitVal.sadd_ov(BackedgeTakenCountVal, SignedOverflow);</div><div>+        if (!UnsignedOverflow)</div><div>+          WrappingFlags =</div><div>+              ScalarEvolution::clearFlags(WrappingFlags, SCEV::FlagNUW);</div><div>+        if (!SignedOverflow)</div><div>+          WrappingFlags =</div><div>+              ScalarEvolution::clearFlags(WrappingFlags, SCEV::FlagNSW);</div><div>+      }</div><div>+    }</div><div>     if (!ScalarEvolution::maskFlags(IncrementedIndvarSCEV->getNoWrapFlags(),</div><div>-                                    SCEV::FlagNUW | SCEV::FlagNSW)) {</div><div>+                                    WrappingFlags)) {</div><div>       // Add one to the "backedge-taken" count to get the trip count.</div><div>       // This addition may overflow, which is valid as long as the comparison is</div><div>       // truncated to BackedgeTakenCount->getType().</div></div><div><br></div><div>Does it seem like a reasonable approach?</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br>

Out of curiosity, can you explain how we came to generate bad code for this case (so I don't have to debug the test case)? Was there an optimization after LFTR that assumed NSW at the compare? Do we end up widening the loop test and failing to exit?<br>

<br>

FWIW: The check against FlagAnyWrap was a little confusing to me, as opposed to simply:<br>

if (maskFlags(Expr->getNoWrapFlags(), SCEV::NSW | SCEV::NUW))<br>

<div class=""><div class="h5"><br>

REPOSITORY<br>

  rL LLVM<br>

<br>

<a href="http://reviews.llvm.org/D5174" target="_blank">http://reviews.llvm.org/D5174</a><br>

<br>

<br>

</div></div></blockquote></div><br></div></div>