[llvm] r223093 - Simplify pointer comparisons involving memory allocation functions

Richard Smith richard at metafoo.co.uk
Wed Dec 3 16:32:05 PST 2014


On Mon, Dec 1, 2014 at 3:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> Author: hfinkel
> Date: Mon Dec  1 17:38:06 2014
> New Revision: 223093
>
> URL: http://llvm.org/viewvc/llvm-project?rev=223093&view=rev
> Log:
> Simplify pointer comparisons involving memory allocation functions
>
> System memory allocation functions, which are identified at the IR level
> by the
> noalias attribute on the return value, must return a pointer into a memory
> region
> disjoint from any other memory accessible to the caller. We can use this
> property to simplify pointer comparisons between allocated memory and local
> stack addresses and the addresses of global variables. Neither the stack
> nor
> global variables can overlap with the region used by the memory allocator.
>

I don't believe this change is correct (at least, in theory) as it stands.
Here are some examples:

1) I allocate memory with malloc, then use that memory as the stack for a
thread. Per POSIX application-managed stack semantics (
http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html#tag_02_09_08),
the application isn't allowed to use the memory allocated with malloc any
more (except through the thread's stack), but there doesn't seem to be any
prohibition against comparing pointers to it. I could compare an alloca
pointer to the malloc'd buffer to determine whether I'm in that thread, for
instance.

1b) I allocate memory with malloc, then I free it, then I perform a dynamic
alloca. On my platform, that alloca is lowered to a malloc call (either
because my backend feels like doing so, or because I'm using cactus stacks,
or whatever).

2) I allocate memory with malloc. Then I free it. Then I compare it to a
global. That global lives in a lazily-loaded DSO, and the DSO gets loaded
into the memory I just freed. The pointers are now equal.

For this to be correct, I think you need to prove that both allocations are
live at the same time.

For an alloca, it seems to suffice for the alloca to dominate the
allocation call, but it's not sufficient for the allocation call to
dominate the alloca unless you can prove the allocation hasn't been freed.
Simple fix: require the alloca to be a static alloca.

For global variables, this seems harder to fix, but the optimization seems
correct if the global is defined in the current module or is marked
unnamed_addr.

Fixes PR21556.
>
> Added:
>     llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll
> Modified:
>     llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>
> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=223093&r1=223092&r2=223093&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original)
> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Mon Dec  1 17:38:06
> 2014
> @@ -20,6 +20,7 @@
>  #include "llvm/Analysis/InstructionSimplify.h"
>  #include "llvm/ADT/SetVector.h"
>  #include "llvm/ADT/Statistic.h"
> +#include "llvm/Analysis/AliasAnalysis.h"
>  #include "llvm/Analysis/ConstantFolding.h"
>  #include "llvm/Analysis/MemoryBuiltins.h"
>  #include "llvm/Analysis/ValueTracking.h"
> @@ -31,6 +32,7 @@
>  #include "llvm/IR/Operator.h"
>  #include "llvm/IR/PatternMatch.h"
>  #include "llvm/IR/ValueHandle.h"
> +#include <algorithm>
>  using namespace llvm;
>  using namespace llvm::PatternMatch;
>
> @@ -2007,6 +2009,39 @@ static Constant *computePointerICmp(cons
>        return ConstantExpr::getICmp(Pred,
>                                     ConstantExpr::getAdd(LHSOffset,
> LHSNoBound),
>                                     ConstantExpr::getAdd(RHSOffset,
> RHSNoBound));
> +
> +    // If one side of the equality comparison must come from a noalias
> call
> +    // (meaning a system memory allocation function), and the other side
> must
> +    // come from a pointer that cannot overlap with dynamically-allocated
> +    // memory within the lifetime of the current function (allocas, byval
> +    // arguments, globals), then determine the comparison result here.
> +    SmallVector<Value *, 8> LHSUObjs, RHSUObjs;
> +    GetUnderlyingObjects(LHS, LHSUObjs, DL);
> +    GetUnderlyingObjects(RHS, RHSUObjs, DL);
> +
> +    // Is the set of underlying objects all noalias calls?
> +    auto IsNAC = [](SmallVectorImpl<Value *> &Objects) {
> +      return std::all_of(Objects.begin(), Objects.end(),
> +                         [](Value *V){ return isNoAliasCall(V); });
> +    };
> +
> +    // Is the set of underlying objects all things which must be disjoint
> from
> +    // noalias calls.
> +    auto IsAllocDisjoint = [](SmallVectorImpl<Value *> &Objects) {
> +      return std::all_of(Objects.begin(), Objects.end(),
> +                         [](Value *V){
> +                           if (isa<AllocaInst>(V) || isa<GlobalValue>(V))
> +                             return true;
> +                           if (const Argument *A = dyn_cast<Argument>(V))
> +                             return A->hasByValAttr();
> +                           return false;
> +                         });
> +    };
> +
> +    if ((IsNAC(LHSUObjs) && IsAllocDisjoint(RHSUObjs)) ||
> +        (IsNAC(RHSUObjs) && IsAllocDisjoint(LHSUObjs)))
> +        return ConstantInt::get(GetCompareTy(LHS),
> +                                !CmpInst::isTrueWhenEqual(Pred));
>    }
>
>    // Otherwise, fail.
>
> Added: llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll?rev=223093&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll (added)
> +++ llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll Mon Dec  1
> 17:38:06 2014
> @@ -0,0 +1,108 @@
> +; RUN: opt -instsimplify -S < %s | FileCheck %s
> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> +target triple = "x86_64-unknown-linux-gnu"
> +
> + at g1 = global i32 0, align 4
> +
> +; Make sure we can simplify away a pointer comparison between
> +; dynamically-allocated memory and a local stack allocation.
> +;   void p()
> +;   {
> +;     int *mData;
> +;     int mStackData[10];
> +;     mData = new int[12];
> +;     if (mData != mStackData) {
> +;       delete[] mData;
> +;     }
> +;   }
> +
> +define void @_Z2p1v() #0 {
> +  %mStackData = alloca [10 x i32], align 16
> +  %1 = bitcast [10 x i32]* %mStackData to i8*
> +  %2 = tail call noalias i8* @_Znam(i64 48) #4
> +  %3 = bitcast i8* %2 to i32*
> +  %4 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> +  %5 = icmp eq i32* %3, %4
> +  br i1 %5, label %7, label %6
> +
> +; CHECK-LABEL: @_Z2p1v
> +; CHECK-NOT: icmp
> +; CHECK: ret void
> +
> +; <label>:6                                       ; preds = %0
> +  call void @_ZdaPv(i8* %2) #5
> +  br label %7
> +
> +; <label>:7                                       ; preds = %0, %6
> +  ret void
> +}
> +
> +; Also check a more-complicated case with multiple underlying objects.
> +
> +define void @_Z2p2bb(i1 zeroext %b1, i1 zeroext %b2) #0 {
> +  %mStackData = alloca [10 x i32], align 16
> +  %1 = bitcast [10 x i32]* %mStackData to i8*
> +  %2 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> +  %3 = select i1 %b1, i32* %2, i32* @g1
> +  %4 = tail call noalias i8* @_Znam(i64 48) #4
> +  %5 = tail call noalias i8* @_Znam(i64 48) #4
> +  %.v = select i1 %b2, i8* %4, i8* %5
> +  %6 = bitcast i8* %.v to i32*
> +  %7 = icmp eq i32* %6, %3
> +  br i1 %7, label %9, label %8
> +
> +; CHECK-LABEL: @_Z2p2bb
> +; CHECK-NOT: icmp
> +; CHECK: ret void
> +
> +; <label>:8                                       ; preds = %0
> +  call void @_ZdaPv(i8* %4) #5
> +  call void @_ZdaPv(i8* %5) #5
> +  br label %9
> +
> +; <label>:9                                       ; preds = %0, %8
> +  ret void
> +}
> +
> +; Here's another case involving multiple underlying objects, but this
> time we
> +; must keep the comparison (it might involve a regular pointer-typed
> function
> +; argument).
> +
> +define void @_Z4nopebbPi(i1 zeroext %b1, i1 zeroext %b2, i32* readnone
> %q) #0 {
> +  %mStackData = alloca [10 x i32], align 16
> +  %1 = bitcast [10 x i32]* %mStackData to i8*
> +  %2 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> +  %3 = select i1 %b1, i32* %2, i32* %q
> +  %4 = tail call noalias i8* @_Znam(i64 48) #4
> +  %5 = tail call noalias i8* @_Znam(i64 48) #4
> +  %.v = select i1 %b2, i8* %4, i8* %5
> +  %6 = bitcast i8* %.v to i32*
> +  %7 = icmp eq i32* %6, %3
> +  br i1 %7, label %9, label %8
> +
> +; CHECK-LABEL: @_Z4nopebbPi
> +; CHECK: icmp
> +; CHECK: ret void
> +
> +; <label>:8                                       ; preds = %0
> +  call void @_ZdaPv(i8* %4) #5
> +  call void @_ZdaPv(i8* %5) #5
> +  br label %9
> +
> +; <label>:9                                       ; preds = %0, %8
> +  ret void
> +}
> +
> +; Function Attrs: nobuiltin
> +declare noalias i8* @_Znam(i64) #2
> +
> +; Function Attrs: nobuiltin nounwind
> +declare void @_ZdaPv(i8*) #3
> +
> +attributes #0 = { uwtable }
> +attributes #1 = { nounwind }
> +attributes #2 = { nobuiltin }
> +attributes #3 = { nobuiltin nounwind }
> +attributes #4 = { builtin }
> +attributes #5 = { builtin nounwind }
> +
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141203/0204d3db/attachment.html>


More information about the llvm-commits mailing list