[PATCH] D108699: [LAA] Analyze pointers forked by a select

Sun Apr 10 14:59:12 PDT 2022

fhahn added a comment.

Thanks for the update!

I think it would be good to split this up to have a first patch that just adds very limited support in `findForkedSCEVs` and then gradually add support for more cases separately. This also makes it easier to make sure all code paths in `findForkedSCEVs` are covered by the unit tests.

I went ahead and rebased D114487 <https://reviews.llvm.org/D114487> and stripped the fork analysis to the bare minimum. I'd be happy to land the scaffolding in D114487 <https://reviews.llvm.org/D114487> separately and this patch and following could add the sophisticated fork analysis.

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:132
+static cl::opt<unsigned> MaxForkedSCEVDepth(
+    "max-forked-scev-depth", cl::Hidden,
+    cl::desc("Maximum recursion depth when finding forked SCEVs (default = 5)"),
----------------
It would be good to add a test for the option, e.g. a case that requires 2 or 3 recursions and set test with `max-forked-scev-depth=2/3`

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:378

-  DenseMap<Value *, unsigned> PositionMap;
-  for (unsigned Index = 0; Index < Pointers.size(); ++Index)
-    PositionMap[Pointers[Index].PointerValue] = Index;
+  DenseMap<Value *, SmallVector<unsigned, 2>> PositionMap;
+  for (unsigned Index = 0; Index < Pointers.size(); ++Index) {
----------------
This scheme would need documenting, i.e. why we can have multiple expressions for a pointer.

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:383
+  }
+  // PositionMap[Pointers[Index].PointerValue] = Index;

----------------
should be removed?

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:643
+                                const SCEV *PtrScev, Loop *L, bool Assume) {
+  //  const SCEV *PtrScev = replaceSymbolicStrideSCEV(PSE, Strides, Ptr);

----------------
should be removed?

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:748
+  case Instruction::BitCast:
+    findForkedSCEVs(SE, L, I->getOperand(0), ScevList, Depth);
+    break;
----------------
missing test for this?

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:766
+    // Not dealing with preexisting gathers yet, so no vectors.
+    if (I->getNumOperands() != 2 || SourceTy->isVectorTy()) {
+      ScevList.push_back(Scev);
----------------
It looks like tests for those conditions are missing?

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:779
+    Type *IntPtrTy = SE->getEffectiveSCEVType(BaseExpr->getType());
+    SCEV::NoWrapFlags Wrap =
+        GEP->isInBounds() ? SCEV::FlagNSW : SCEV::FlagAnyWrap;
----------------
I don't think we can use the inbounds info here, unless we prove that the program is undefined if GEP is poison.

Consider something like below. If `%c` is always false, the GEP index could be out-of-bounds (and the GEP poison). Adding a runtime check based on the SCEV expression may introduce a branch on poison unconditionally.

```
define dso_local void @forked_ptrs_different_base_same_offset(float* nocapture readonly %Base1, float* nocapture readonly %Base2, float* nocapture %Dest, i32* nocapture readonly %Preds, i1 %c) {
entry:
  br label %for.body

for.cond.cleanup:
  ret void

for.body:
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %latch ]
  %arrayidx = getelementptr inbounds i32, i32* %Preds, i64 %indvars.iv
  %0 = load i32, i32* %arrayidx, align 4
  %cmp1.not = icmp eq i32 %0, 0
  %spec.select = select i1 %cmp1.not, float* %Base2, float* %Base1
  %.sink.in = getelementptr inbounds float, float* %spec.select, i64 %indvars.iv
  %.sink = load float, float* %.sink.in, align 4
  %1 = getelementptr inbounds float, float* %Dest, i64 %indvars.iv
  br i1 %c, label %then, label %latch

then:
  store float %.sink, float* %1, align 4
  br label %latch

latch:
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond.not = icmp eq i64 %indvars.iv.next, 100
  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
}
```

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:786
+
+    if (OffsetScevs.size() == 2 && BaseScevs.size() == 1) {
+      const SCEV *Off1 = SE->getTruncateOrSignExtend(OffsetScevs[0], IntPtrTy);
----------------
It might be a bit simpler to just add a duplicate to BaseScevs/OffsetScevs and have one common code path to compute the SCEVs to add

================
Comment at: llvm/lib/Analysis/LoopAccessAnalysis.cpp:901
+    }
+    // Is this needed? I think the method above forced it, so can be skipped?
+    if (TranslatedPtrs.size() == 1)
----------------
Needs resolving.

It should be needed, because the above code may have added assumptions, which make Ptr an AddRec, See comment in D114487.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108699/new/

https://reviews.llvm.org/D108699