[llvm] [SCEV] Consolidate code for proving wrap flags of controlling finite IVs (PR #101404)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 31 13:19:01 PDT 2024
https://github.com/preames created https://github.com/llvm/llvm-project/pull/101404
The canAssumeNoSelfWrap routine in howManyLessThans was doing two subtly interrelated things. First, it was proving no-self-wrap. This exactly duplicates the existing logic in the caller. Second, it was establishing the precondition for the nw->nsw/nuw inference. Specifically, we need to know that *this* exit must be taken for the inference to be sound. Otherwise, another (possibly abnormal) exit could be taken in the iteration where this IV would become poison.
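As a standalone illustration (not part of the patch, and not LLVM code), the no-self-wrap argument can be sketched numerically: when the stride is a power of two it evenly divides the 2^bits iteration space, so once the IV self-wraps it revisits exactly the same value sequence; a loop-invariant exit test that never fired before the wrap can therefore never fire after it, making the sole exit dynamically dead and the loop infinite (UB under mustprogress).

```python
# Hypothetical sketch: an 8-bit wrapping IV {0,+,8}. The stride 8 is a
# power of two, so it evenly divides the 2^8 iteration space and the
# IV's value sequence repeats exactly after a self-wrap.

BITS = 8
MASK = (1 << BITS) - 1

def iv_sequence(start, step, n):
    """First n values of the wrapping IV {start,+,step} modulo 2^BITS."""
    return [(start + i * step) & MASK for i in range(n)]

period = (1 << BITS) // 8          # 32 distinct values before the wrap
first_lap = iv_sequence(0, 8, period)
second_lap = iv_sequence(0, 8, 2 * period)[period:]

# After the self-wrap, the IV repeats the identical values, so an
# invariant test that never exited on the first lap never exits at all.
assert first_lap == second_lap
```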
This change moves all of that logic into the caller, and caches the resulting nuw/nsw flags in the AddRec. This centralizes the logic in one place, and makes it clear that it all depends on controlling the sole exit.
This change is mostly NFC, but we do lose a couple of cases with SCEV predication. Specifically, if SCEV predication was able to convert e.g. zext(addrec) into an addrec(zext) using predication, but didn't record the nuw fact on the new addrec, then the consuming code can no longer fix this up. I don't think this case particularly matters; curious if reviewers agree.
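To see why the recorded nuw fact matters for that rewrite, here is a small numeric sketch (again illustrative only, with made-up widths and coefficients): zext of a narrow addrec agrees with the widened addrec exactly as long as the narrow IV does not unsigned-wrap.

```python
# Sketch of zext(addrec) vs. addrec(zext) for an i8 {2,+,2} widened to
# i16. The two expressions agree only while the i8 IV holds <nuw>.

NARROW_MASK = (1 << 8) - 1          # i8
WIDE_MASK = (1 << 16) - 1           # i16

def zext_of_addrec(i):
    """zext i8 {2,+,2} to i16: evaluate narrow (wrapping), then extend."""
    return ((2 + 2 * i) & NARROW_MASK) & WIDE_MASK

def addrec_of_zext(i):
    """i16 {2,+,2}: evaluate wide, so no i8 wrap ever occurs."""
    return (2 + 2 * i) & WIDE_MASK

# They agree while the i8 value stays below 256 (the <nuw> regime)...
assert all(zext_of_addrec(i) == addrec_of_zext(i) for i in range(127))
# ...and diverge on the first i8 unsigned wrap (2 + 2*127 == 256).
assert zext_of_addrec(127) != addrec_of_zext(127)
```

Without the nuw fact recorded on the new addrec, a consumer has no license to treat the two forms as interchangeable past the wrap point.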
From c494305d4427fa8e8a3efd29a831858cf455f1d3 Mon Sep 17 00:00:00 2001
From: Philip Reames <preames at rivosinc.com>
Date: Wed, 31 Jul 2024 10:38:43 -0700
Subject: [PATCH] [SCEV] Consolidate code for proving wrap flags of controlling
finite IVs
The canAssumeNoSelfWrap routine in howManyLessThans was doing two
subtly inter-related things. First, it was proving no-self-wrap. This
exactly duplicates the existing logic in the caller. Second, it was
establishing the precondition for the nw->nsw/nuw inference.
Specifically, we need to know that *this* exit must be taken for the
inference to be sound. Otherwise, another (possibly abnormal) exit
could be taken in the iteration where this IV would become poison.
This change moves all of that logic into the caller, and caches the
resulting nuw/nsw flags in the AddRec. This centralizes the logic
in one place, and makes it clear that it all depends on controlling
the sole exit.
This change is mostly NFC, but we do lose a couple of cases with SCEV
predication. Specifically, if SCEV predication was able to convert
e.g. zext(addrec) into an addrec(zext) using predication, but didn't
record the nuw fact on the new addrec, then the consuming code can
no longer fix this up. I don't think this case particularly matters;
curious if reviewers agree.
---
llvm/lib/Analysis/ScalarEvolution.cpp | 74 +++++++------------
.../trip-count-implied-addrec.ll | 6 --
2 files changed, 28 insertions(+), 52 deletions(-)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index fb56d5d436653..acfadfa3e7d5f 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -9146,11 +9146,13 @@ ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromICmp(
}
// If this loop must exit based on this condition (or execute undefined
- // behaviour), and we can prove the test sequence produced must repeat
- // the same values on self-wrap of the IV, then we can infer that IV
- // doesn't self wrap because if it did, we'd have an infinite (undefined)
- // loop.
+ // behaviour), see if we can improve wrap flags. This is essentially
+ // a must execute style proof.
if (ControllingFiniteLoop && isLoopInvariant(RHS, L)) {
+ // If we can prove the test sequence produced must repeat
+ // the same values on self-wrap of the IV, then we can infer that IV
+ // doesn't self wrap because if it did, we'd have an infinite (undefined)
+ // loop.
// TODO: We can peel off any functions which are invertible *in L*. Loop
// invariant terms are effectively constants for our purposes here.
auto *InnerLHS = LHS;
@@ -9167,6 +9169,25 @@ ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromICmp(
setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
}
}
+
+ // For a slt/ult condition with a positive step, can we prove nsw/nuw?
+ // From no-self-wrap, this follows trivially from the fact that every
+ // (un)signed-wrapped, but not self-wrapped, value must be less than
+ // the last value before the (un)signed wrap. Since we know that last
+ // value didn't exit, neither will any smaller one.
+ if (Pred == ICmpInst::ICMP_SLT || Pred == ICmpInst::ICMP_ULT) {
+ auto WrapType = Pred == ICmpInst::ICMP_SLT ? SCEV::FlagNSW : SCEV::FlagNUW;
+ if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(LHS);
+ AR && AR->getLoop() == L && AR->isAffine() &&
+ !AR->getNoWrapFlags(WrapType) && AR->hasNoSelfWrap() &&
+ isKnownPositive(AR->getStepRecurrence(*this))) {
+ auto Flags = AR->getNoWrapFlags();
+ Flags = setFlags(Flags, WrapType);
+ SmallVector<const SCEV*> Operands{AR->operands()};
+ Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
+ setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
+ }
+ }
}
switch (Pred) {
@@ -12757,34 +12778,6 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);
bool PredicatedIV = false;
- auto canAssumeNoSelfWrap = [&](const SCEVAddRecExpr *AR) {
- // Can we prove this loop *must* be UB if overflow of IV occurs?
- // Reasoning goes as follows:
- // * Suppose the IV did self wrap.
- // * If Stride evenly divides the iteration space, then once wrap
- // occurs, the loop must revisit the same values.
- // * We know that RHS is invariant, and that none of those values
- // caused this exit to be taken previously. Thus, this exit is
- // dynamically dead.
- // * If this is the sole exit, then a dead exit implies the loop
- // must be infinite if there are no abnormal exits.
- // * If the loop were infinite, then it must either not be mustprogress
- // or have side effects. Otherwise, it must be UB.
- // * It can't (by assumption), be UB so we have contradicted our
- // premise and can conclude the IV did not in fact self-wrap.
- if (!isLoopInvariant(RHS, L))
- return false;
-
- auto *StrideC = dyn_cast<SCEVConstant>(AR->getStepRecurrence(*this));
- if (!StrideC || !StrideC->getAPInt().isPowerOf2())
- return false;
-
- if (!ControlsOnlyExit || !loopHasNoAbnormalExits(L))
- return false;
-
- return loopIsFiniteByAssumption(L);
- };
-
if (!IV) {
if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS)) {
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(ZExt->getOperand());
@@ -12935,21 +12928,10 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
Stride = getUMaxExpr(Stride, getOne(Stride->getType()));
}
}
- } else if (!Stride->isOne() && !NoWrap) {
- auto isUBOnWrap = [&]() {
- // From no-self-wrap, we need to then prove no-(un)signed-wrap. This
- // follows trivially from the fact that every (un)signed-wrapped, but
- // not self-wrapped value must be LT than the last value before
- // (un)signed wrap. Since we know that last value didn't exit, nor
- // will any smaller one.
- return canAssumeNoSelfWrap(IV);
- };
-
+ } else if (!NoWrap) {
// Avoid proven overflow cases: this will ensure that the backedge taken
- // count will not generate any unsigned overflow. Relaxed no-overflow
- // conditions exploit NoWrapFlags, allowing to optimize in presence of
- // undefined behaviors like the case of C language.
- if (canIVOverflowOnLT(RHS, Stride, IsSigned) && !isUBOnWrap())
+ // count will not generate any unsigned overflow.
+ if (canIVOverflowOnLT(RHS, Stride, IsSigned))
return getCouldNotCompute();
}
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll
index 64306ac28cf27..b313842ad5e1a 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll
@@ -239,12 +239,6 @@ define void @neg_rhs_wrong_range(i16 %n.raw) mustprogress {
; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((-1 + (2 umax (-1 + (zext i8 (trunc i16 %n.raw to i8) to i16))<nsw>)) /u 2)
-; CHECK-NEXT: Predicates:
-; CHECK-NEXT: {2,+,2}<nw><%for.body> Added Flags: <nusw>
-; CHECK-NEXT: Loop %for.body: Predicated symbolic max backedge-taken count is ((-1 + (2 umax (-1 + (zext i8 (trunc i16 %n.raw to i8) to i16))<nsw>)) /u 2)
-; CHECK-NEXT: Predicates:
-; CHECK-NEXT: {2,+,2}<nw><%for.body> Added Flags: <nusw>
;
entry:
%n.and = and i16 %n.raw, 255