[PATCH] D16343: [BasicAA] Fix for missing must alias information

Fri Jan 29 15:17:39 PST 2016

Aside from the suspected UB (which I think Philip was right about,
see my comments inline below), this seems pretty obviously correct.

A have a couple of other nitpicks below, but after that LGTM.
Philip, did you have anything else on your side?

> On 2016-Jan-27, at 19:55, Gerolf Hoflehner via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> Gerolf updated this revision to Diff 46214.
> Gerolf added a comment.
> 
> Philip, I removed the wrap for Scale as you suggested.

It would be a shame not to fix `Scale` while you're poking around.

I think the most incremental approach would be to outline the
`Scale` math into `adjustToPointerSize()` (creating reusable API
and fixing the UB in the same hopefully-NFC commit), and then add
the change to `BaseOffs` as a second step once Philip signs off.

> Please check if you agree with my comments. Thank you!
> 
> 
> http://reviews.llvm.org/D16343
> 
> Files:
>  lib/Analysis/BasicAliasAnalysis.cpp
>  test/Analysis/BasicAA/noalias-wraparound-bug.ll
> 
> Index: test/Analysis/BasicAA/noalias-wraparound-bug.ll
> ===================================================================
> --- /dev/null
> +++ test/Analysis/BasicAA/noalias-wraparound-bug.ll
> @@ -0,0 +1,24 @@
> +; RUN: opt -S -basicaa -gvn < %s | FileCheck %s
> +
> +target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
> +target triple = "i386-apple-macosx10.6.0"
> +
> +; We incorrectly returned noalias in the example below for "tmp5" and
> +; "tmp12" returning i32 32, since basicaa converted the offsets to 64b
> +; and missed the wrap-around
> +
> +define i32 @foo(i8* %buffer) {
> +entry:
> +  %tmp2 = getelementptr i8, i8* %buffer, i32 -2071408432
> +  %tmp3 = bitcast i8* %tmp2 to i32*
> +  %tmp4 = getelementptr i8, i8* %buffer, i32 128
> +  %tmp5 = bitcast i8* %tmp4 to i32*
> +  store i32 32, i32* %tmp5, align 4
> +  %tmp12 = getelementptr i32, i32* %tmp3, i32 -1629631508
> +  store i32 28, i32* %tmp12, align 4
> +  %tmp13 = getelementptr i8, i8* %buffer, i32 128
> +  %tmp14 = bitcast i8* %tmp13 to i32*
> +  %tmp2083 = load i32, i32* %tmp14, align 4
> +; CHECK: ret i32 28
> +  ret i32 %tmp2083
> +}
> Index: lib/Analysis/BasicAliasAnalysis.cpp
> ===================================================================
> --- lib/Analysis/BasicAliasAnalysis.cpp
> +++ lib/Analysis/BasicAliasAnalysis.cpp
> @@ -42,11 +42,10 @@
> /// Enable analysis of recursive PHI nodes.
> static cl::opt<bool> EnableRecPhiAnalysis("basicaa-recphi", cl::Hidden,
>                                           cl::init(false));
> -
> +#define DEBUG_TYPE "basicaa"
> /// SearchLimitReached / SearchTimes shows how often the limit of
> /// to decompose GEPs is reached. It will affect the precision
> /// of basic alias analysis.
> -#define DEBUG_TYPE "basicaa"

IMO, this cleanup should be separated into a prep NFC commit.
Might as well move the DEBUG_TYPE right under the #includes.

> STATISTIC(SearchLimitReached, "Number of times the limit to "
>                               "decompose GEPs is reached");
> STATISTIC(SearchTimes, "Number of times a GEP is decomposed");
> @@ -319,6 +318,17 @@
>   return V;
> }
> 
> +/// To ensure a pointer offset fits in an integer of size PointerSize
> +/// (in bits) when that size is smaller than 64. This is an issue in
> +/// particular for 32b programs with negative indices that rely on two's
> +/// complement wrap-arounds for correct alias information.
> +static int64_t adjustToPointerSize(int64_t Offset, unsigned PointerSize) {
> +  assert(PointerSize <= 64 && "Invalide PointerSize!");
> +  unsigned ShiftBits = 64 - PointerSize;
> +  Offset <<= ShiftBits;
> +  return Offset >> ShiftBits;

I think Philip might have been right about shifting of signed values
being iffy.  By my reading of C++11 5.8.2, the behaviour here is at
best implementation defined.

Switching to C11 6.5.7, point 4 says:
--
If E1 has a signed type and nonnegative value, and E1 × 2E2 is
representable in the result type, then that is the resulting value;
otherwise, the behavior is undefined.
--
By C11, this is definitely UB.

You could either create a local uint64_t:
--
return uint64_t(Offset) << ShiftBits >> ShiftBits;
--
or just use a mask:
--
assert(PointerSize > 0);
return Offset & UINT64_MAX << ShiftBits;
--

> +}
> +
> /// If V is a symbolic pointer expression, decompose it into a base pointer
> /// with a constant offset and a number of scaled symbolic offsets.
> ///
> @@ -387,6 +397,7 @@
>     unsigned AS = GEPOp->getPointerAddressSpace();
>     // Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
>     gep_type_iterator GTI = gep_type_begin(GEPOp);
> +    unsigned PointerSize = DL.getPointerSizeInBits(AS);
>     for (User::const_op_iterator I = GEPOp->op_begin() + 1, E = GEPOp->op_end();
>          I != E; ++I) {
>       const Value *Index = *I;
> @@ -415,7 +426,6 @@
>       // If the integer type is smaller than the pointer size, it is implicitly
>       // sign extended to pointer size.
>       unsigned Width = Index->getType()->getIntegerBitWidth();
> -      unsigned PointerSize = DL.getPointerSizeInBits(AS);
>      if (PointerSize > Width)
>         SExtBits += PointerSize - Width;
> 
> @@ -457,6 +467,8 @@
>       }
>     }
> 
> +    // Take care of wrap-arounds
> +    BaseOffs = adjustToPointerSize(BaseOffs, PointerSize);
>     // Analyze the base pointer next.
>     V = GEPOp->getOperand(0);
>   } while (--MaxLookup);
> 
> 
> <D16343.46214.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits