[llvm] [DA] do not handle array accesses of different offsets (PR #123436)

Michael Kruse via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 10 07:51:00 PDT 2025


================
@@ -3569,6 +3569,123 @@ bool DependenceInfo::invalidate(Function &F, const PreservedAnalyses &PA,
          Inv.invalidate<LoopAnalysis>(F, PA);
 }
 
+// Check that memory access offsets in V are multiples of array element size
+// EltSize. Param records the first parametric expression. If the scalar
+// evolution V contains two or more parameters, we check that the subsequent
+// parametric expressions are multiples of the first parametric expression
+// Param.
+static bool checkOffsets(ScalarEvolution *SE, const SCEV *V, const SCEV *&Param,
+                         uint64_t EltSize) {
+  if (auto *AddRec = dyn_cast<SCEVAddRecExpr>(V)) {
+    if (!checkOffsets(SE, AddRec->getStart(), Param, EltSize))
+      return false;
+    return checkOffsets(SE, AddRec->getStepRecurrence(*SE), Param, EltSize);
+  }
+  if (auto *Cst = dyn_cast<SCEVConstant>(V)) {
+    APInt C = Cst->getAPInt();
+
+    // For example, alias_with_different_offsets in
+    // test/Analysis/DependenceAnalysis/DifferentOffsets.ll accesses "%A + 2":
+    //   %arrayidx = getelementptr inbounds i8, ptr %A, i64 2
+    //   store i32 42, ptr %arrayidx, align 1
+    // which is writing an i32, i.e., EltSize = 4 bytes, with an offset C = 2.
+    // checkOffsets returns false, as the offset C=2 is not a multiple of 4.
+    return C.srem(EltSize) == 0;
+  }
+
+  // Use a lambda helper function to check V for parametric expressions.
+  // Param records the first parametric expression. If the scalar evolution V
+  // contains two or more parameters, we check that the subsequent parametric
+  // expressions are multiples of the first parametric expression Param.
+  auto checkParamsMultipleOfSize = [&](const SCEV *V,
+                                       const SCEV *&Param) -> bool {
+    if (EltSize == 1)
+      return true;
+    if (!Param) {
+      Param = V;
+      return true;
----------------
Meinersbur wrote:

> > Some index expressions may also use %n + %n, %n + 1, or no %n at all in which case they may have different relative offsets already.
> 
> Correct. This patch tries to prove at compile time that if an array offset is using %n, then all other array offsets will use %n in multiples of the array element size.

But it does not check for the same offsets; it checks for the same parameter.

For instance,
```c
*(int*)((char*)BasePtr)
*(int*)((char*)BasePtr + n)
```
The first one should pass trivially by https://github.com/sebpop/llvm-project/blob/c7b5c98204ab580443f45d148994bc93827b5258/llvm/lib/Analysis/ScalarEvolution.cpp#L10992

The second should pass by https://github.com/sebpop/llvm-project/blob/c7b5c98204ab580443f45d148994bc93827b5258/llvm/lib/Analysis/ScalarEvolution.cpp#L11013 because `Param` has not been set yet. However, nothing ensures that `n` is a multiple of `sizeof(int)`, so the second access can be a partial/unaligned access relative to the first.
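
To make the concern concrete, here is a minimal standalone sketch (not LLVM code) of the two accesses above. The helper names `partialOverlap` and `firstAfterBothStores` are hypothetical and exist only for this illustration; `memcpy` stands in for the unaligned `int` stores so the example stays well-defined C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the byte ranges [a, a+size) and
// [b, b+size) share some bytes without being the same range.
bool partialOverlap(std::size_t a, std::size_t b, std::size_t size) {
  if (a == b)
    return false; // identical ranges: a full overlap, not a partial one
  std::size_t lo = a < b ? a : b;
  std::size_t hi = a < b ? b : a;
  return hi - lo < size;
}

// Hypothetical helper modelling the two accesses from the example:
//   *(int*)((char*)BasePtr)     = first
//   *(int*)((char*)BasePtr + n) = second
// It returns the value read back from offset 0 after both stores.
std::uint32_t firstAfterBothStores(std::size_t n) {
  unsigned char buf[16] = {0};
  const std::uint32_t first = 0x11111111u;
  const std::uint32_t second = 0x22222222u;
  assert(n + sizeof(second) <= sizeof(buf) && "offset out of range");

  std::memcpy(buf, &first, sizeof(first));       // access at BasePtr
  std::memcpy(buf + n, &second, sizeof(second)); // access at BasePtr + n

  std::uint32_t readBack;
  std::memcpy(&readBack, buf, sizeof(readBack));
  return readBack;
}
```

With `n == 4` the second store lands on the next element and the first value is untouched. With `n == 2` the second store overwrites half of the first element, which is the partial-overlap case that a check on the parameter alone would not rule out.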


> > If %n (or any other subexpression) is invariant to all loops and occurs in every index, I think it could be considered part of the base pointer.
> 
> Correct.

But currently it is not?! The base pointer is determined by `ScalarEvolution::getPointerBase`, which does not know what is considered to be a parameter.

https://github.com/llvm/llvm-project/pull/123436

