[PATCH] D44564: [BasicAA] Use PhiValuesAnalysis if available when handling phi alias

John Brawn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 20 07:37:03 PDT 2018


john.brawn added a comment.

In https://reviews.llvm.org/D44564#1162264, @hfinkel wrote:

> We have a check in aliasSameBasePointerGEPs to avoid PR32314 (which I'll glibly summarize as: we need to be careful about looking through PHIs so that we don't, without realizing it, end up comparing values from different loop iterations). The fact that this looks back through an unknown number of PHIs makes me concerned that we'll get into a similar situation.
>
> I apologize for ignoring this review for so long, but I think that this is really important. Even with the PR32314 check currently there, I suspect that the fundamental algorithm for looking back through PHIs in BasicAA is still not sound, and we need to be careful going forward. As such, I'd like to see this patch updated with additional comments explaining:
>
> 1. Can we get back values from different loop iterations from PhiValues.
> 2. If so, is there special care that must be taken in handling them.
> 3. Are we currently doing what's necessary to handle them correctly.
>
>   Thanks!


I've been looking looking around basicaa for a while and how it's handling this kind of thing, and I think the short answer here is that this patch means we're doing more of the same thing that we're already doing so it doesn't introduce any new bugs but could possibly make already-existing bugs (if there are any) reveal themselves in more situations, but if so the fix would be elsewhere not here.

The long answer (which I've written mostly to make sure I have my own thoughts in order, so possibly not especially coherent):

The general structure of BasicAAResult::aliasCheck, the main alias-checking function, seems to be:

- See if, by looking at V1 and V2 as atomic things, we can determine if V1 and V2 definitely point to either the same or different things.
- If not, and V1 or V2 is a gep, phi, or select, try breaking up V1 or V2 into component parts and comparing it with the other.
- If that doesn't give us MustAlias or NoAlias, go check the other alias analyses.

What we have to be careful about then is that we don't have a situation where we break up V1 into individual components, see that there's no alias with V2, then conclude that there's no alias between V1 and V2 as a whole but actually there is.

Going and looking at PR32314 specifically, a simplified version of that is

  loop:
   %phi1 = phi [ undef, %entry ], [ %gep2, %loop ]
   %phi2 = phi [ 1, %entry ], [ %inc, %loop ]
   %inc = add %phi2, 1
   %sub = add %phi2, -1
   %gep1 = getelementptr %arr, %sub
   %gep2 = getelementptr %arr, %phi2
   store %foo, %gep1
   %bar = load %phi1

The problem here is that for the same value of %phi2, %gep1 and %gep2 point to different memory locations, but when comparing %phi1 and %gep1 we can't just say "given that %phi's only value is %gep2, we can check if %phi1 and %gep1 alias by checking if %gep1 and %gep2 alias" because the value of %phi2 in the context of the value of %gep2 on the loop back-edge is different to the value of %phi2 in the %gep1 not on that back-edge.

The question then is: is it the responsibility of the phi node handling when it's doing the recursive aliasCheck to do something special to handle this, or is it the responsibility of aliasCheck to realise that the same sub-expression in V1 and V2 may have different values? Currently, it looks like it's the responsibility of aliasCheck, or at least the fix for PR32314 was done by making a specific part of aliasSameBasePointerGEPs aware that the same phi in a subexpression of the indices of two geps could have different values.

So, getting back to this patch, what's changed? Instead of just looking at the direct incoming values of a phi, and giving up if one of those is a phi, we get the total set of non-phi values gotten by traversing the phi tree. From the perspective of avoiding path-sensitive value difference there's no change to before: it's up to the recursive aliasCheck call to get things right, and doing things differently in this way doesn't fundamentally change things as far as I can tell: we either get it right, or we get it wrong, and if we get it wrong there would have been some other situation where originally we would have got it wrong as well.

So to answer your specific questions:

> Can we get back values from different loop iterations from PhiValues.

Yes, but that's no change to how things were before.

> If so, is there special care that must be taken in handling them.

Nothing that we didn't already need to do.

> Are we currently doing what's necessary to handle them correctly.

I assume so, but the necessary handling would be in the recursive aliasCheck call, not here.


https://reviews.llvm.org/D44564





More information about the llvm-commits mailing list