[llvm-dev] valid BasicAA behavior?

Thu Mar 26 08:52:02 PDT 2020

To recap:

1. The result returned by BasicAA is correct in respect to the
AliasAnalysis specification.
2. We agree that the result return by DependenceInfo is incorrect,
including a pre-existing bug report.

I am puzzled how you conclude that we should fix BasicAA.

However, you are free to upload a patch to Phabricator so we get
experience about the consequences of your proposed change.

Michael

Am Do., 19. März 2020 um 02:04 Uhr schrieb Chawla, Pankaj
<pankaj.chawla at intel.com>:
>
> Hi Michael,
>
> Thank you for the thoughtful reply and discussion!
>
> >> I don't think we can get any meaningful dependence when the base pointer changes between loop iterations. In your case, it's done by a PHINode, but it could also be loaded from memory each iteration (and some metadata indicates that two loaded pointers does not alias IN THE SAME ITERATION).
>
> >>  for (int i = 0; i < S->size; ++i) {
> >>    S->Data[i] = S->Data[i-1];
> >>    do_something(S, ...);
> >>  }
>
> The loop invariance issue caused by PHIs is fundamentally different than the one caused my memory accesses. The latter is much easier for DI to deal with. AA will return MayAlias in this case which is the conservatively correct answer. After that it is up to DI to form a conservative direction vector by checking whether the pointer is loop invariant.
>
> >> Either AA or DI must ensure the invariance of underlying objects. IMHO it would make more sense if DI did it. AA is typically control-flow agnostics and loop-invariance out of its scope.
>
> Most of AA is control-flow agnostic but BasicAA is not. In fact, in the pointer swapping case aliasPHI() is refining the result based on control flow by querying corresponding PHI operands.
>
> Here's an example where the base ptr PHIs are not loop-invariant but the NoAlias result returned by BasicAA for p[i] and q[i] is still valid for DI-
>
> int A[100];
> int B[100];
> int C[100];
> int D[100];
>
> void foo() {
>
>   int i, *p, *q;
>   for(i=0; i<50; i++) {
>     if (i > 5) {
>       p = A;
>       q = C;
>     }
>     else {
>       p = B;
>       q = D;
>     }
>     p[i] = q[i];
>   }
> }
>
> The basic problem is that in some cases BasicAA is not returning the conservatively correct answer for purposes of dependence analysis.
> Not all control-flow based analysis is bad, just the one which doesn't correctly take into account loops/cycles in the CFG.
>
> For correctness, what we want is a more conservative cousin of BasicAA which DI can rely upon. The workaround for UnknownSize I proposed was for precisely this reason.
> All analysis which care about 'loop-carried' analysis pass in UnknownSize to alias analysis.
>
> Having said that, a separate alias interface which has conservatively correct behavior in the presence of loops/cycles would be a much better solution.
>
> -Pankaj
>
> -----Original Message-----
> From: Michael Kruse <llvmdev at meinersbur.de>
> Sent: Wednesday, March 18, 2020 6:02 PM
> To: Chawla, Pankaj <pankaj.chawla at intel.com>
> Cc: Michael Kruse <llvmdev at meinersbur.de>; Kruse, Michael <michael.kruse at anl.gov>; Finkel, Hal J. <hfinkel at anl.gov>; Hiroshi Yamauchi <yamauchi at google.com>; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] valid BasicAA behavior?
>
> Am Mi., 18. März 2020 um 18:15 Uhr schrieb Chawla, Pankaj
> <pankaj.chawla at intel.com>:
> >
> > >> DependenceInfo is not using the AA interface correctly. Either DI has to be fixed, or another method added to AA that gives additional guarantees. Please see the bug report for details.
> >
> > Thanks for updating the bug report but GetUnderlyingObject() doesn't help in this case. The underlying object of the phi is phi itself. I don't think any amount of change on the client side will help because AA interface doesn't give the guarantee DI needs.
>
> For a meaningful analysis, the underling object must be the same in the analysis' scop. In case of DependenceInfo, this translates into loop-invariance of the base base pointer. I don't think we can get any meaningful dependence when the base pointer changes between loop iterations. In your case, it's done by a PHINode, but it could also be loaded from memory each iteration (and some metadata indicates that two loaded pointers does not alias IN THE SAME ITERATION).
>
> DI tries to approximate this:
>
> switch (underlyingObjectsAlias(AA, F->getParent()->getDataLayout(),
>                                  MemoryLocation::get(Dst),
>                                  MemoryLocation::get(Src))) {
>   case MayAlias:
>   case PartialAlias:
>     // cannot analyse objects if we don't understand their aliasing.
>     LLVM_DEBUG(dbgs() << "can't analyze may or partial alias\n");
>     return std::make_unique<Dependence>(Src, Dst);
>   case NoAlias:
>     // If the objects noalias, they are distinct, accesses are independent.
>     LLVM_DEBUG(dbgs() << "no alias\n");
>     return nullptr;
>   case MustAlias:
>     break; // The underlying objects alias; test accesses for dependence.
>   }
>
> Unfortunately, this does not verify that the base pointer is invariant. It ensures aliasing within one iterations, but it it is obviously wrong to conclude that pointers from different iterations also do/do not alias.
>
> I haven't looked at all of DependenceInfo, so there might be some other place where the invariance is checked. One problem we had to face for ensuring correctness of Polly was to ensure the the loop-invariance of base pointers. A typical pattern is the following:
>
>   for (int i = 0; i < S->size; ++i) {
>      S->Data[i] = S->Data[i-1];
>      do_something(S, ...);
>   }
>
> This looks like a simple flow dependence [i -> i+1]; The IR generated by clang re-loads the pointer S->Data again in ever iteration, but do_something might actually change the value of S->Data when called.
> That is, we cannot analyze the dependency in this loop unless the load of S->Data is hoisted out of the loop (e.g. by LICM). Polly has mechisms to either infer conditions under which S->Data is loop-invariant or to bail out if it cannot.
>
>
> > >> As far as the AA interface specification is concerned, the NoAlias result is correct. Such a patch would be a pessimization without giving any additional guarantees.
> >
> > Ok, fair point.
> > So the only option left is to introduce a new AA interface which does provide the guarantees needed by DI, correct?
>
> Either AA or DI must ensure the invariance of underlying objects. IMHO it would make more sense if DI did it. AA is typically control-flow agnostics and loop-invariance out of its scope.
>
> Michael