[llvm] [Loads] Support dereference for non-constant offset (PR #149551)

Fri Aug 22 08:22:38 PDT 2025

================
@@ -361,29 +361,29 @@ bool llvm::isDereferenceableAndAlignedInLoop(
     AccessSize = MaxPtrDiff;
     AccessSizeSCEV = PtrDiff;
   } else if (auto *MinAdd = dyn_cast<SCEVAddExpr>(AccessStart)) {
-    if (MinAdd->getNumOperands() != 2)
-      return false;
+    const auto *NewBase = dyn_cast<SCEVUnknown>(SE.getPointerBase(MinAdd));
+    const auto *OffsetSCEV = SE.removePointerBase(MinAdd);
 
-    const auto *Offset = dyn_cast<SCEVConstant>(MinAdd->getOperand(0));
-    const auto *NewBase = dyn_cast<SCEVUnknown>(MinAdd->getOperand(1));
-    if (!Offset || !NewBase)
+    if (!OffsetSCEV || !NewBase)
       return false;
 
-    // The following code below assumes the offset is unsigned, but GEP
-    // offsets are treated as signed so we can end up with a signed value
-    // here too. For example, suppose the initial PHI value is (i8 255),
-    // the offset will be treated as (i8 -1) and sign-extended to (i64 -1).
-    if (Offset->getAPInt().isNegative())
+    if (!SE.isKnownNonNegative(OffsetSCEV))
       return false;
 
     // For the moment, restrict ourselves to the case where the offset is a
     // multiple of the requested alignment and the base is aligned.
     // TODO: generalize if a case found which warrants
-    if (Offset->getAPInt().urem(Alignment.value()) != 0)
+    auto *OffsetSCEVTy = OffsetSCEV->getType();
+    if (!SE.isKnownPredicate(
+            ICmpInst::ICMP_EQ,
+            SE.getURemExpr(OffsetSCEV,
+                           SE.getConstant(OffsetSCEVTy, Alignment.value())),
+            SE.getZero(OffsetSCEVTy)))
       return false;
-
-    AccessSize = MaxPtrDiff + Offset->getAPInt();
-    AccessSizeSCEV = SE.getAddExpr(PtrDiff, Offset);
+    AccessSizeSCEV = SE.getAddExpr(PtrDiff, OffsetSCEV);
+    const auto *Offset = dyn_cast<SCEVConstant>(OffsetSCEV);
+    AccessSize = MaxPtrDiff + (Offset ? Offset->getAPInt()
+                                      : SE.getUnsignedRangeMax(OffsetSCEV));
----------------
annamthomas wrote:

Okay, looking at all the uses of `getStartAndEndForAccess`, none of the callers check whether the access start overflows (see `areAccessesCompletelyBeforeOrAfter` in LoopAccessAnalysis.cpp). It looks like wrapping is already handled before that point, and a comment in the code says so:
```
        // Evaluating AR at an exact BTC is safe:  LAA separately checks that
        // accesses cannot wrap in the loop. If evaluating AR at BTC wraps, then
        // the loop either triggers UB when executing a memory access with a
        // poison pointer or the wrapping/poisoned pointer is not used.
```

The code before this patch did not have an explicit overflow check here either.
That is, the value returned by `getStartAndEndForAccess` implies that the AR wraps at neither `Start` nor `End`.
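
For readers less familiar with the wrapping concern, here is a minimal sketch in plain 64-bit integers rather than SCEV. It models an affine access with a non-negative step evaluated at the backedge-taken count and reports when any intermediate value would wrap. `AccessRange` and `evalAtBTC` are hypothetical names for illustration only; they are not LLVM APIs and do not reproduce what `getStartAndEndForAccess` actually computes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model, not LLVM code: the byte range touched by an affine
// access {Start,+,Step} over BTC backedges, in 64-bit unsigned arithmetic.
struct AccessRange {
  uint64_t Start; // first byte accessed
  uint64_t End;   // one past the last byte accessed
};

// Returns std::nullopt when any intermediate value would wrap. This is the
// situation the quoted comment says is ruled out separately.
std::optional<AccessRange> evalAtBTC(uint64_t Start, uint64_t Step,
                                     uint64_t BTC, uint64_t EltSize) {
  assert(EltSize != 0 && "model assumes a non-empty access");
  if (Step != 0 && BTC > UINT64_MAX / Step)
    return std::nullopt; // Step * BTC wraps
  uint64_t Span = Step * BTC;
  if (Span > UINT64_MAX - Start)
    return std::nullopt; // Start + Span wraps
  uint64_t Last = Start + Span;
  if (EltSize > UINT64_MAX - Last)
    return std::nullopt; // Last + EltSize wraps
  return AccessRange{Start, Last + EltSize};
}
```

In this model, a caller that receives a range can rely on `Start <= End` without re-checking for overflow, which is the kind of guarantee the callers appear to assume.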

SCEV has some limitations in proving `isKnownPredicate(uge, AccessStart, NewBase)`, but I think that is orthogonal to this patch.
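
As a rough illustration of what that predicate expresses, the sketch below uses plain 64-bit integers instead of SCEV expressions. It assumes `AccessStart` is `NewBase` plus an offset that is non-negative as a signed value, mirroring the `isKnownNonNegative` check in the hunk above. The helper names `accessStart` and `startIsUGEBase` are hypothetical, and the sketch says nothing about which cases SCEV can or cannot prove.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not LLVM code: AccessStart = NewBase + Offset computed
// modulo 2^64, where Offset is non-negative when read as a signed value.
uint64_t accessStart(uint64_t NewBase, uint64_t Offset) {
  assert(Offset <= static_cast<uint64_t>(INT64_MAX) &&
         "model assumes a non-negative offset");
  return NewBase + Offset; // unsigned addition wraps silently
}

// The unsigned comparison `AccessStart uge NewBase` holds exactly when the
// addition above did not wrap.
bool startIsUGEBase(uint64_t NewBase, uint64_t Offset) {
  return accessStart(NewBase, Offset) >= NewBase;
}
```

So even with a non-negative offset, the `uge` relation is a no-wrap fact about the addition rather than something that follows from the sign of the offset alone.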

https://github.com/llvm/llvm-project/pull/149551