[llvm] r223093 - Simplify pointer comparisons involving memory allocation functions

Richard Smith richard at metafoo.co.uk
Wed Dec 3 17:52:02 PST 2014


On Wed, Dec 3, 2014 at 5:35 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> > From: "Richard Smith" <richard at metafoo.co.uk>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> > Sent: Wednesday, December 3, 2014 6:32:05 PM
> > Subject: Re: [llvm] r223093 - Simplify pointer comparisons involving
> memory allocation functions
> >
> > On Mon, Dec 1, 2014 at 3:38 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> >
> >
> > Author: hfinkel
> > Date: Mon Dec 1 17:38:06 2014
> > New Revision: 223093
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=223093&view=rev
> > Log:
> > Simplify pointer comparisons involving memory allocation functions
> >
> > System memory allocation functions, which are identified at the IR
> > level by the
> > noalias attribute on the return value, must return a pointer into a
> > memory region
> > disjoint from any other memory accessible to the caller. We can use
> > this
> > property to simplify pointer comparisons between allocated memory and
> > local
> > stack addresses and the addresses of global variables. Neither the
> > stack nor
> > global variables can overlap with the region used by the memory
> > allocator.
> >
> >
> >
> > I don't believe this change is correct (at least, in theory) as it
> > stands. Here are some examples:
> >
> >
> > 1) I allocate memory with malloc, then use that memory as the stack
> > for a thread. Per POSIX application-managed stack semantics (
> >
> http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html#tag_02_09_08
> > ), the application isn't allowed to use the memory allocated with
> > malloc any more (except through the thread's stack), but there
> > doesn't seem to be any prohibition against comparing pointers to it.
> > I could compare an alloca pointer to the malloc'd buffer to
> > determine whether I'm in that thread, for instance.
> >
> >
> > 1b) I allocate memory with malloc, then I free it, then I perform a
> > dynamic alloca. On my platform, that alloca is lowered to a malloc
> > call (either because my backend feels like doing so, or because I'm
> > using cactus stacks, or whatever).
> >
> >
> > 2) I allocate memory with malloc. Then I free it. Then I compare it
> > to a global. That global lives in a lazily-loaded DSO, and the DSO
> > gets loaded into the memory I just freed. The pointers are now
> > equal.
> >
> >
> > For this to be correct, I think you need to prove that both
> > allocations are live at the same time.
>
> I agree.
>
> >
> >
> > For an alloca, it seems to suffice for the alloca to dominate the
> > allocation call, but it's not sufficient for the allocation call to
> > dominate the alloca unless you can prove the allocation hasn't been
> > freed. Simple fix: require the alloca to be a static alloca.
>
> Yep, restricting to static allocs should work.
>
> >
> >
> > For global variables, this seems harder to fix, but the optimization
> > seems correct if the global is defined in the current module or is
> > marked unnamed_addr.
>
> Does this mean restricting to when isLocalLinkage() || hasUnnamedAddr()?
>

Those cases both seem safe to me (although perhaps we already have a more
general optimization for address comparisons against unnamed_addr symbols).
The optimization also seems valid for symbols with hidden or protected
visibility, since they can never resolve to an entity in another DSO and
thus are live whenever the current function is, but perhaps that knowledge
is too system-specific to be encoding in this optimization.

One other case I was concerned about is this one:

Suppose your malloc implementation has a global buffer that it sometimes
allocates memory from (to handle emergency low-memory situations), and it
provides a function to determine if the memory comes from that buffer. It
was previously valid to mark such a malloc function as
__attribute__((malloc)) / noalias return. Is it still correct with the
recent update to the LangRef? With LTO against the malloc implementation,
you become able to reference the returned memory through another mechanism.
Perhaps it's reasonable to put that down to user error (either your malloc
isn't really __attribute__((malloc)) or you shouldn't include it in your
LTO).

Thanks again,
> Hal
>
> >
> >
> >
> > Fixes PR21556.
> >
> > Added:
> > llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll
> > Modified:
> > llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> >
> > Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> > URL:
> >
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=223093&r1=223092&r2=223093&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original)
> > +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Mon Dec 1
> > 17:38:06 2014
> > @@ -20,6 +20,7 @@
> > #include "llvm/Analysis/InstructionSimplify.h"
> > #include "llvm/ADT/SetVector.h"
> > #include "llvm/ADT/Statistic.h"
> > +#include "llvm/Analysis/AliasAnalysis.h"
> > #include "llvm/Analysis/ConstantFolding.h"
> > #include "llvm/Analysis/MemoryBuiltins.h"
> > #include "llvm/Analysis/ValueTracking.h"
> > @@ -31,6 +32,7 @@
> > #include "llvm/IR/Operator.h"
> > #include "llvm/IR/PatternMatch.h"
> > #include "llvm/IR/ValueHandle.h"
> > +#include <algorithm>
> > using namespace llvm;
> > using namespace llvm::PatternMatch;
> >
> > @@ -2007,6 +2009,39 @@ static Constant *computePointerICmp(cons
> > return ConstantExpr::getICmp(Pred,
> > ConstantExpr::getAdd(LHSOffset, LHSNoBound),
> > ConstantExpr::getAdd(RHSOffset, RHSNoBound));
> > +
> > + // If one side of the equality comparison must come from a noalias
> > call
> > + // (meaning a system memory allocation function), and the other
> > side must
> > + // come from a pointer that cannot overlap with
> > dynamically-allocated
> > + // memory within the lifetime of the current function (allocas,
> > byval
> > + // arguments, globals), then determine the comparison result here.
> > + SmallVector<Value *, 8> LHSUObjs, RHSUObjs;
> > + GetUnderlyingObjects(LHS, LHSUObjs, DL);
> > + GetUnderlyingObjects(RHS, RHSUObjs, DL);
> > +
> > + // Is the set of underlying objects all noalias calls?
> > + auto IsNAC = [](SmallVectorImpl<Value *> &Objects) {
> > + return std::all_of(Objects.begin(), Objects.end(),
> > + [](Value *V){ return isNoAliasCall(V); });
> > + };
> > +
> > + // Is the set of underlying objects all things which must be
> > disjoint from
> > + // noalias calls.
> > + auto IsAllocDisjoint = [](SmallVectorImpl<Value *> &Objects) {
> > + return std::all_of(Objects.begin(), Objects.end(),
> > + [](Value *V){
> > + if (isa<AllocaInst>(V) || isa<GlobalValue>(V))
> > + return true;
> > + if (const Argument *A = dyn_cast<Argument>(V))
> > + return A->hasByValAttr();
> > + return false;
> > + });
> > + };
> > +
> > + if ((IsNAC(LHSUObjs) && IsAllocDisjoint(RHSUObjs)) ||
> > + (IsNAC(RHSUObjs) && IsAllocDisjoint(LHSUObjs)))
> > + return ConstantInt::get(GetCompareTy(LHS),
> > + !CmpInst::isTrueWhenEqual(Pred));
> > }
> >
> > // Otherwise, fail.
> >
> > Added: llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll
> > URL:
> >
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll?rev=223093&view=auto
> >
> ==============================================================================
> > --- llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll (added)
> > +++ llvm/trunk/test/Transforms/InstSimplify/noalias-ptr.ll Mon Dec 1
> > 17:38:06 2014
> > @@ -0,0 +1,108 @@
> > +; RUN: opt -instsimplify -S < %s | FileCheck %s
> > +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
> > +target triple = "x86_64-unknown-linux-gnu"
> > +
> > + at g1 = global i32 0, align 4
> > +
> > +; Make sure we can simplify away a pointer comparison between
> > +; dynamically-allocated memory and a local stack allocation.
> > +; void p()
> > +; {
> > +; int *mData;
> > +; int mStackData[10];
> > +; mData = new int[12];
> > +; if (mData != mStackData) {
> > +; delete[] mData;
> > +; }
> > +; }
> > +
> > +define void @_Z2p1v() #0 {
> > + %mStackData = alloca [10 x i32], align 16
> > + %1 = bitcast [10 x i32]* %mStackData to i8*
> > + %2 = tail call noalias i8* @_Znam(i64 48) #4
> > + %3 = bitcast i8* %2 to i32*
> > + %4 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> > + %5 = icmp eq i32* %3, %4
> > + br i1 %5, label %7, label %6
> > +
> > +; CHECK-LABEL: @_Z2p1v
> > +; CHECK-NOT: icmp
> > +; CHECK: ret void
> > +
> > +; <label>:6 ; preds = %0
> > + call void @_ZdaPv(i8* %2) #5
> > + br label %7
> > +
> > +; <label>:7 ; preds = %0, %6
> > + ret void
> > +}
> > +
> > +; Also check a more-complicated case with multiple underlying
> > objects.
> > +
> > +define void @_Z2p2bb(i1 zeroext %b1, i1 zeroext %b2) #0 {
> > + %mStackData = alloca [10 x i32], align 16
> > + %1 = bitcast [10 x i32]* %mStackData to i8*
> > + %2 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> > + %3 = select i1 %b1, i32* %2, i32* @g1
> > + %4 = tail call noalias i8* @_Znam(i64 48) #4
> > + %5 = tail call noalias i8* @_Znam(i64 48) #4
> > + %.v = select i1 %b2, i8* %4, i8* %5
> > + %6 = bitcast i8* %.v to i32*
> > + %7 = icmp eq i32* %6, %3
> > + br i1 %7, label %9, label %8
> > +
> > +; CHECK-LABEL: @_Z2p2bb
> > +; CHECK-NOT: icmp
> > +; CHECK: ret void
> > +
> > +; <label>:8 ; preds = %0
> > + call void @_ZdaPv(i8* %4) #5
> > + call void @_ZdaPv(i8* %5) #5
> > + br label %9
> > +
> > +; <label>:9 ; preds = %0, %8
> > + ret void
> > +}
> > +
> > +; Here's another case involving multiple underlying objects, but
> > this time we
> > +; must keep the comparison (it might involve a regular pointer-typed
> > function
> > +; argument).
> > +
> > +define void @_Z4nopebbPi(i1 zeroext %b1, i1 zeroext %b2, i32*
> > readnone %q) #0 {
> > + %mStackData = alloca [10 x i32], align 16
> > + %1 = bitcast [10 x i32]* %mStackData to i8*
> > + %2 = getelementptr inbounds [10 x i32]* %mStackData, i64 0, i64 0
> > + %3 = select i1 %b1, i32* %2, i32* %q
> > + %4 = tail call noalias i8* @_Znam(i64 48) #4
> > + %5 = tail call noalias i8* @_Znam(i64 48) #4
> > + %.v = select i1 %b2, i8* %4, i8* %5
> > + %6 = bitcast i8* %.v to i32*
> > + %7 = icmp eq i32* %6, %3
> > + br i1 %7, label %9, label %8
> > +
> > +; CHECK-LABEL: @_Z4nopebbPi
> > +; CHECK: icmp
> > +; CHECK: ret void
> > +
> > +; <label>:8 ; preds = %0
> > + call void @_ZdaPv(i8* %4) #5
> > + call void @_ZdaPv(i8* %5) #5
> > + br label %9
> > +
> > +; <label>:9 ; preds = %0, %8
> > + ret void
> > +}
> > +
> > +; Function Attrs: nobuiltin
> > +declare noalias i8* @_Znam(i64) #2
> > +
> > +; Function Attrs: nobuiltin nounwind
> > +declare void @_ZdaPv(i8*) #3
> > +
> > +attributes #0 = { uwtable }
> > +attributes #1 = { nounwind }
> > +attributes #2 = { nobuiltin }
> > +attributes #3 = { nobuiltin nounwind }
> > +attributes #4 = { builtin }
> > +attributes #5 = { builtin nounwind }
> > +
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141203/6a66f347/attachment.html>


More information about the llvm-commits mailing list