[PATCH] D38662: [BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)

Wed Oct 11 12:15:40 PDT 2017

efriedma added inline comments.

================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:519
+      // relevant values of V, (C2*Scale) can overflow. In that case, we cannot
+      // decompose the expression in this way.
+      APInt WideScaledOffset = IndexOffset.sextOrTrunc(MaxPointerSize*2) *
----------------
hfinkel wrote:
> efriedma wrote:
> > hfinkel wrote:
> > > efriedma wrote:
> > > > I'm not sure I understand this comment.
> > > > 
> > > > It's true that C2*Scale can overflow, but I'm not sure it makes sense to try to address that here.  The multiply by the scale can overflow no matter where the scale comes from (assuming the GEP isn't marked inbounds).
> > > Can you please elaborate?
> > > 
> > > If this overflows, then the offset here isn't an offset, and the reasoning done later won't be correct.
> > > 
> > > In the test case, the GEPs are marked as inbounds. I don't think that's relevant here.
> > Consider the following testcase:
> > 
> > ```
> > target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
> > define i32* @b1(i32 *%a) {
> > %r = getelementptr i32, i32* %a, i32 -2147483648
> > ret i32* %r
> > }
> > define i32* @b2(i32 *%a) {
> > %o = add i32 -2147483648, 0
> > %r = getelementptr i32, i32* %a, i32 %o
> > ret i32* %r
> > }
> > ```
> > 
> > In both cases, %a and %r should be mustalias.  If you run "-aa-eval", we somehow conclude that the pointers are mustalias in b1, and noalias in b2.   If you're trying to make overflow well-behaved, it doesn't make sense to treat the overflow introduced by GetLinearExpression differently from the overflow inherent in scaling.
> I agree, however, as I attempted to explain in the comment:
> 
>   1. The problem is not in GetLinearExpression itself, it is with how the results of GetLinearExpression are being used here.
>   2. What's happening here, where we're multiplying by the scale (which by itself is always fine), and applying the distributive property, can introduce an overflow in cases where an overflow might not have existed in the original program.
> 
> I believe that the test cases demonstrate this: there are certainly values for the function parameters for which they'll be no overflow in the address calculations.
I think aliasGEP needs to be changed to accept that decompsed GEP arithmetic will overflow, or somehow adopt much more restrictive overflow checking (in many more places than this patch does).

----

Suppose we want to adopt more restrictive overflow checking.  First, so we have an expression `(C1*V+C2)*Scale`.  If we treat `(C1*V+C2)` as opaque, there's one place this can overflow: the multiply by Scale.  Currently, we completely ignore this possibility.

So now, instead of treating the inner expression as opaque, we want to decompose it to something like `(C1*Scale)*V+C2*Scale`.  There are four arithmetic operations here.  All four of them can potentially overflow.  And two of them can't be overflow-checked here because V isn't known at compile-time.

Given that, I don't see how overflow-checking `C2 * Scale` accomplishes anything except hiding the problem for your exact testcase.

https://reviews.llvm.org/D38662