[PATCH] D17080: [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 12 16:10:23 PDT 2017


Ayal added a comment.

This looks ok to me, added only minor comments, but it should be approved by someone who understands all this better...

IIUC the current scheme originally computes CanDoRT and NeedRTCheck "independently", or rather intertwinedly, checking first if CanDoRT without Assume in parallel to determining if NeedRTCheck, and if the latter is true revisit if CanDoRT with Assume as needed by Retries. A simpler albeit possibly slower approach is to first determine separately if NeedRTCheck, returning true if not; else check if CanDoRT with Assume. But Need[RT]Check seems to depend on CanDo[AliasSet]RT, and we may return false if !NeedRTCheck when there are incomparable pointers of distinct address spaces. The use of "isDependencyCheckNeeded()" also raises an eyebrow, or two.



================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:643
   int64_t Stride = getPtrStride(PSE, Ptr, L, Strides);
-  return Stride == 1;
+  if (Stride == 1 || PSE.hasNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW))
+    return true;
----------------
The tests added below check this other added condition, namely PSE.hasNoOverflow(), right?


================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:656
+                                          bool Assume) {
+  bool IsDepCheckNeeded = isDependencyCheckNeeded();
+  Value *Ptr = Access.getPointer();
----------------
IsDepCheckNeeded was originally taken out of the AS : AST loop.
Here you can simply ask "if(isDependencyCheckNeeded())" later.


================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:658
+  Value *Ptr = Access.getPointer();
+  bool IsWrite = Access.getInt();
+
----------------
IsWrite is needed only later by RtCheck.insert(), can move it there.


================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:748
+      NeedsCheck = (NumWritePtrChecks >= 2 ||
+                    (NumReadPtrChecks >= 1 && NumWritePtrChecks >= 1));
+
----------------
This goes back to the original code:
it may be slightly easier to always bump ++NumTotalPtrChecks instead of ++NumReadPtrChecks and set

```
NeedsCheck = (NumTotalPtrChecks >= 2 && NumWritePtrChecks >= 1);
```


But not sure I follow the original logic here; it's equivalent to doing:

```
bool NeedsCheck = (NumTotalPtrChecks >= 2 && NumWritePtrChecks >= 1);
if (IsDepCheckNeeded && CanDoAliasSetRT && RunningDepId == 2)
  NeedsCheck = false;
```

Having CanDoAliasSetRT && RunningDepId == 2 implies that NumTotalPtrChecks == 2, but why set NeedsCheck to false when this happens and IsDepCheckNeeded? What if NumWritePtrChecks >= 1?


May want to rename NeedsCheck >> NeedsAliasSetRTCheck, analogous to the CanDo's.


================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:792
         DEBUG(dbgs() << "LAA: Runtime check would require comparison between"
                        " different address spaces\n");
         return false;
----------------
ok, but what if !NeedRTCheck?


================
Comment at: test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll:5
+
+; When len has a type larger than i, i can oveflow in the following kernel:
+;
----------------
sbaranga wrote:
> Ayal wrote:
> > "i can ove[r]flow": better clarify which i can overflow - in the test below the induction variable of the loop i is a signed 64bit (as is the bound 'len') whose bump has an nsw, so it is free of overflow concerns. The indices of A[i] and A[i+1] are (signed or unsigned?) 32bit idx and idx.inc, whose zext-add-trunc bump has no nuw nor nsw and are therefore subject to overflow concerns. Is it clear why the Added "SCEV assumptions" Flags for these should be <nssw>, rather than <nusw>? Starting from 0 and 1, nssw is more conservative.
> > 
> > "When len has a type": these assumptions are needed regardless of the type of len.
> Thanks for the observations. There is a discrepancy here between the C code and the IR being tested (and the text is misleading). We are actually looking at the expressions for %inc.ptr0 and %inc.ptr1.
> 
> The sign extension comes from the getelementptr instruction. Because the SCEV expression that we are trying to linearize is something like (sext i32 {0,+,1}<%for.body> to i64), we can only add the nssw flag. 
> 
> I'll update the text to clarify things.
Thanks! This looks fine to me.


================
Comment at: test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll:24
+;  i64 {(4 + %a),+,4}<%for.body>
+;  i64 {(4 + %a),+,4}<%for.body>
+
----------------
the second one should be:
;  i64 {(%b),+,4}<%for.body>


================
Comment at: test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll:75
+
+; i can oveflow in the following kernel:
+; void test2(unsigned long long x, int *a) {
----------------
ove[r]flow


https://reviews.llvm.org/D17080





More information about the llvm-commits mailing list