[llvm] [Loads] Support dereference for non-constant offset (PR #149551)

Sun Aug 10 05:00:36 PDT 2025

================
@@ -361,29 +361,29 @@ bool llvm::isDereferenceableAndAlignedInLoop(
     AccessSize = MaxPtrDiff;
     AccessSizeSCEV = PtrDiff;
   } else if (auto *MinAdd = dyn_cast<SCEVAddExpr>(AccessStart)) {
-    if (MinAdd->getNumOperands() != 2)
-      return false;
+    const auto *NewBase = dyn_cast<SCEVUnknown>(SE.getPointerBase(MinAdd));
+    const auto *OffsetSCEV = SE.removePointerBase(MinAdd);
 
-    const auto *Offset = dyn_cast<SCEVConstant>(MinAdd->getOperand(0));
-    const auto *NewBase = dyn_cast<SCEVUnknown>(MinAdd->getOperand(1));
-    if (!Offset || !NewBase)
+    if (!OffsetSCEV || !NewBase)
       return false;
 
-    // The following code below assumes the offset is unsigned, but GEP
-    // offsets are treated as signed so we can end up with a signed value
-    // here too. For example, suppose the initial PHI value is (i8 255),
-    // the offset will be treated as (i8 -1) and sign-extended to (i64 -1).
-    if (Offset->getAPInt().isNegative())
+    if (!SE.isKnownNonNegative(OffsetSCEV))
       return false;
 
     // For the moment, restrict ourselves to the case where the offset is a
     // multiple of the requested alignment and the base is aligned.
     // TODO: generalize if a case found which warrants
-    if (Offset->getAPInt().urem(Alignment.value()) != 0)
+    auto *OffsetSCEVTy = OffsetSCEV->getType();
+    if (!SE.isKnownPredicate(
+            ICmpInst::ICMP_EQ,
+            SE.getURemExpr(OffsetSCEV,
+                           SE.getConstant(OffsetSCEVTy, Alignment.value())),
+            SE.getZero(OffsetSCEVTy)))
       return false;
-
-    AccessSize = MaxPtrDiff + Offset->getAPInt();
-    AccessSizeSCEV = SE.getAddExpr(PtrDiff, Offset);
+    AccessSizeSCEV = SE.getAddExpr(PtrDiff, OffsetSCEV);
+    const auto *Offset = dyn_cast<SCEVConstant>(OffsetSCEV);
+    AccessSize = MaxPtrDiff + (Offset ? Offset->getAPInt()
+                                      : SE.getUnsignedRangeMax(OffsetSCEV));
----------------
annamthomas wrote:

Unfortunately, it doesn't. I also tried `isKnownPredicateAt` with the terminator of the loop's predecessor as `CtxI` (in case the assumes help).

```
define void @deref_assumption_loop_access_start_variable(i8 %v, ptr noundef %P, i64 range(i64 0, 2000) %N, ptr noalias %b, ptr noalias %c, i64 range(i64 0, 2000) %iv.start) nofree nosync {
  entry:
    %a = getelementptr i8, ptr %P, i64 16
    %cmp = icmp slt i64 %iv.start, %N
    call void @llvm.assume(i1 %cmp)
    %mul = mul i64 %N, 4
    %add = add i64 %mul, 16
    call void @llvm.assume(i1 true) [ "dereferenceable"(ptr %P, i64 %add) ]
    br label %loop
    ...
  ```
NewBase is `%P`. AccessStart is `(16 + (4 * %iv.start)<nuw><nsw> + %P)`. Note the nuw/nsw flags on `4 * %iv.start`; they are present because of the range on `%iv.start`.

From the above we should be able to conclude that the offset computation itself does not wrap; this should be known from the range on `%iv.start`,
i.e. `(16 + (4 * %iv.start)<nuw><nsw>)<nuw>`.

Isn't that enough to prove `AccessStart` `uge` `NewBase`? Something is missing in the logic, though, because we don't derive the range of this arithmetic expression even though the range of `%iv.start` is `[0, 2000)`.
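
To make the expected range concrete, here is a minimal standalone sketch of the interval arithmetic for the offset `16 + 4 * %iv.start`. It is plain C++ and does not use SCEV or `ConstantRange`; `URange`, `mulConst`, `addConst`, `offsetRange` and `isNonNegativeAsSigned` are hypothetical names for illustration only, and it assumes `range(i64 0, 2000)` means the half-open interval `[0, 2000)`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical closed unsigned interval [Lo, Hi]; not an LLVM type.
struct URange {
  uint64_t Lo;
  uint64_t Hi;
};

// Multiply an interval by a constant. Returns std::nullopt if the upper
// bound would wrap in 64 bits (i.e. the <nuw> property would not hold).
std::optional<URange> mulConst(URange R, uint64_t C) {
  if (C != 0 && R.Hi > UINT64_MAX / C)
    return std::nullopt;
  return URange{R.Lo * C, R.Hi * C};
}

// Add a constant to an interval, with the same unsigned-wrap check.
std::optional<URange> addConst(URange R, uint64_t C) {
  if (R.Hi > UINT64_MAX - C)
    return std::nullopt;
  return URange{R.Lo + C, R.Hi + C};
}

// Range of the offset 16 + 4 * iv.start from the IR in this comment.
std::optional<URange> offsetRange(URange IVStart) {
  std::optional<URange> Scaled = mulConst(IVStart, 4);
  if (!Scaled)
    return std::nullopt;
  return addConst(*Scaled, 16);
}

// True if every value in the interval is non-negative when reinterpreted
// as a signed 64-bit integer.
bool isNonNegativeAsSigned(URange R) {
  return R.Hi <= static_cast<uint64_t>(INT64_MAX);
}
```

With `%iv.start` in `[0, 1999]`, `offsetRange` yields `[16, 8012]`: no unsigned wrap and non-negative as a signed value, which is the bound one would hope the non-negativity query on the offset could establish.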


https://github.com/llvm/llvm-project/pull/149551