[llvm] [LV] Use SCEV to check if IV overflow check is known (PR #115705)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 11 23:17:42 PST 2024


================
@@ -2491,17 +2491,19 @@ void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) {
     Value *LHS = Builder.CreateSub(MaxUIntTripCount, Count);
 
     Value *Step = CreateStep();
-#ifndef NDEBUG
     ScalarEvolution &SE = *PSE.getSE();
     const SCEV *TC2OverflowSCEV = SE.applyLoopGuards(SE.getSCEV(LHS), OrigLoop);
-    assert(
-        !isIndvarOverflowCheckKnownFalse(Cost, VF * UF) &&
-        !SE.isKnownPredicate(CmpInst::getInversePredicate(ICmpInst::ICMP_ULT),
-                             TC2OverflowSCEV, SE.getSCEV(Step)) &&
-        "unexpectedly proved overflow check to be known");
-#endif
-    // Don't execute the vector loop if (UMax - n) < (VF * UF).
-    CheckMinIters = Builder.CreateICmp(ICmpInst::ICMP_ULT, LHS, Step);
+    const SCEV *StepSCEV = SE.getSCEV(Step);
+
+    // Check if (UMax - n) < (VF * UF).
+    if (SE.isKnownPredicate(ICmpInst::ICMP_ULT, TC2OverflowSCEV, StepSCEV)) {
----------------
lukel97 wrote:

I took a look at teaching `isIndvarOverflowCheckKnownFalse` about this case, but as @fhahn said, I think it's already doing the correct thing. `getConstantMaxBackedgeTakenCount` was returning 0xFFFFFFFFFFFFFFFF due to the overflow case at `%tc == 0`, but `getConstantTripCount` throws that away because it only considers trip counts that fit into 32 bits.
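To make the 32-bit point concrete, here is a rough standalone model of that behaviour. It is not the LLVM implementation: `modelConstantTripCount` is a hypothetical name, and using `std::optional` for "unknown" is an assumption made for the sketch.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical model (not LLVM code): turn a constant max backedge-taken
// count into a trip count, but only when the trip count fits in 32 bits.
std::optional<std::uint32_t> modelConstantTripCount(std::uint64_t MaxBTC) {
  // The trip count is MaxBTC + 1, so MaxBTC must be strictly below
  // UINT32_MAX for the result to be representable in 32 bits.
  if (MaxBTC >= std::numeric_limits<std::uint32_t>::max())
    return std::nullopt; // Too large: treated as "no known max trip count".
  return static_cast<std::uint32_t>(MaxBTC) + 1;
}
```

With this model, a max backedge-taken count of 0xFFFFFFFFFFFFFFFF yields no trip count at all, whereas a small bound such as 1024 yields 1025.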

The SCEV assertion uses the full 64 bits of the TC type though. In combination with what I think is a separate, coincidental issue with `(UMax - %tc) < (VF * UF)` when `%tc == 0`, this means that the predicate is always true.

I couldn't think of an easy way of updating the assertion, so I've reworked this PR to just remove the assert and nothing else for now.

I've added some new cases to show things that I think could be split off from this PR:

- `@overflow_at_0` shows how the overflow check seems to let `%tc == 0` slip through, which maybe is something that needs to be fixed in a separate PR?
- `@no_overflow_at_0` shows a case where `isIndvarOverflowCheckKnownFalse` works and knows that the maximum trip count is 1025, so nothing needs to be done here
- `@trip_count_max_1024` shows a case where we currently can't calculate the max trip count, i.e. the debug output doesn't contain `LV: Found maximum trip count: N`, and so `isIndvarOverflowCheckKnownFalse` also returns false. I think in a separate PR we can teach `computeMaxBECountForLT` to apply the loop guards, which seems to fix this.
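
For reference, the arithmetic of the `(UMax - n) < (VF * UF)` check from the diff can be sketched in isolation. This is only an illustration on plain 64-bit integers: `overflowCheckFires` is a hypothetical helper, not something in the vectorizer, and `Step` stands in for `VF * UF`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the emitted runtime check (not LLVM code).
// Returns true when (UMax - TC) < Step, i.e. when the vector loop
// would be bypassed by the check.
bool overflowCheckFires(std::uint64_t TC, std::uint64_t Step) {
  const std::uint64_t UMax = std::numeric_limits<std::uint64_t>::max();
  return (UMax - TC) < Step;
}
```

With `Step == 8`, the check fires for `TC == UMax` (since `0 < 8`) but not for `TC == 0`, where `UMax - 0` is `UMax` and is never less than the step. That is the `%tc == 0` case from `@overflow_at_0` slipping through.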


https://github.com/llvm/llvm-project/pull/115705

