[llvm] r294786 - Fix invalid addrspacecast due to combining alloca with global var
Liu, Yaxun (Sam) via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 24 13:38:39 PST 2017
+ Tom
Hi Tom,
What's the process for cherry-picking a commit to a previous LLVM release? Thanks.
Sam
-----Original Message-----
From: James Price [mailto:jp8463 at bristol.ac.uk] On Behalf Of James Price
Sent: Friday, February 24, 2017 4:23 PM
To: hans at chromium.org; Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>
Cc: llvm-commits at lists.llvm.org
Subject: Re: [llvm] r294786 - Fix invalid addrspacecast due to combining alloca with global var
Hans, Sam,
This may be a long shot, but is there any chance that r294786 below, along with the minor bugfix in r296163, could be merged into the 4.0 release? Without these fixes, I experience some regressions in a downstream project with the 4.0 RC compared to 3.9, which I’d love to see fixed.
No worries if it’s too late in the release cycle!
James
> On 10 Feb 2017, at 21:46, Yaxun Liu via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>
> Author: yaxunl
> Date: Fri Feb 10 15:46:07 2017
> New Revision: 294786
>
> URL: http://llvm.org/viewvc/llvm-project?rev=294786&view=rev
> Log:
> Fix invalid addrspacecast due to combining alloca with global var
>
> For function-scope variables with large initialisation list, FE
> usually generates a global variable to hold the initializer, then
> generates memcpy intrinsic to initialize the alloca.
> InstCombiner::visitAllocaInst identifies such allocas which are
> accessed only by reading and replaces them with the global variable.
> This is done by casting the global variable to the type of the alloca and replacing all references.
>
> However, when the global variable is in a different address space
> which is disjoint with addr space 0 (e.g. for IR generated from
> OpenCL, global variable cannot be in private addr space i.e. addr
> space 0), casting the global variable to addr space 0 results in
> invalid IR for certain targets (e.g. amdgpu).
>
> To fix this issue, when the global variable is not in addr space 0,
> instead of casting it to addr space 0, this patch chases down the uses
> of alloca until reaching the load instructions, then replaces load
> from alloca with load from the global variable. If during the chasing
> bitcast and GEP are encountered, new bitcast and GEP based on the
> global variable are generated and used in the load instructions.
>
> Differential Revision: https://reviews.llvm.org/D27283
>
> Added:
> llvm/trunk/test/Transforms/InstCombine/memcpy-addrspace.ll
> Modified:
> llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>
> llvm/trunk/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
>
> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstComb
> ine/InstCombineInternal.h?rev=294786&r1=294785&r2=294786&view=diff
> ======================================================================
> ========
> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
> (original)
> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Fri
> +++ Feb 10 15:46:07 2017
> @@ -322,6 +322,11 @@ public:
> bool replacedSelectWithOperand(SelectInst *SI, const ICmpInst *Icmp,
> const unsigned SIOpd);
>
> + /// Try to replace instruction \p I with value \p V which are
> + pointers /// in different address space.
> + /// \return true if successful.
> + bool replacePointer(Instruction &I, Value *V);
> +
> private:
> bool shouldChangeType(unsigned FromBitWidth, unsigned ToBitWidth) const;
> bool shouldChangeType(Type *From, Type *To) const;
>
> Modified:
> llvm/trunk/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstComb
> ine/InstCombineLoadStoreAlloca.cpp?rev=294786&r1=294785&r2=294786&view
> =diff
> ======================================================================
> ========
> ---
> llvm/trunk/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
> (original)
> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.c
> +++ pp Fri Feb 10 15:46:07 2017
> @@ -12,14 +12,15 @@
> //===-----------------------------------------------------------------
> -----===//
>
> #include "InstCombineInternal.h"
> +#include "llvm/ADT/MapVector.h"
> #include "llvm/ADT/SmallString.h"
> #include "llvm/ADT/Statistic.h"
> #include "llvm/Analysis/Loads.h"
> #include "llvm/IR/ConstantRange.h"
> #include "llvm/IR/DataLayout.h"
> #include "llvm/IR/DebugInfo.h"
> -#include "llvm/IR/LLVMContext.h"
> #include "llvm/IR/IntrinsicInst.h"
> +#include "llvm/IR/LLVMContext.h"
> #include "llvm/IR/MDBuilder.h"
> #include "llvm/Transforms/Utils/BasicBlockUtils.h"
> #include "llvm/Transforms/Utils/Local.h"
> @@ -224,6 +225,103 @@ static Instruction *simplifyAllocaArrayS
> return nullptr;
> }
>
> +// If I and V are pointers in different address space, it is not
> +allowed to // use replaceAllUsesWith since I and V have different
> +types. A // non-target-specific transformation should not use
> +addrspacecast on V since // the two address space may be disjoint depending on target.
> +//
> +// This class chases down uses of the old pointer until reaching the
> +load // instructions, then replaces the old pointer in the load
> +instructions with // the new pointer. If during the chasing it sees
> +bitcast or GEP, it will // create new bitcast or GEP with the new
> +pointer and use them in the load // instruction.
> +class PointerReplacer {
> +public:
> + PointerReplacer(InstCombiner &IC) : IC(IC) {}
> + void replacePointer(Instruction &I, Value *V);
> +
> +private:
> + void findLoadAndReplace(Instruction &I);
> + void replace(Instruction *I);
> + Value *getReplacement(Value *I);
> +
> + SmallVector<Instruction *, 4> Path;
> + MapVector<Value *, Value *> WorkMap;
> + InstCombiner &IC;
> +};
> +
> +void PointerReplacer::findLoadAndReplace(Instruction &I) {
> + for (auto U : I.users()) {
> + auto *Inst = dyn_cast<Instruction>(&*U);
> + if (!Inst)
> + return;
> + DEBUG(dbgs() << "Found pointer user: " << *U << '\n');
> + if (isa<LoadInst>(Inst)) {
> + for (auto P : Path)
> + replace(P);
> + replace(Inst);
> + } else if (isa<GetElementPtrInst>(Inst) || isa<BitCastInst>(Inst)) {
> + Path.push_back(Inst);
> + findLoadAndReplace(*Inst);
> + Path.pop_back();
> + } else {
> + return;
> + }
> + }
> +}
> +
> +Value *PointerReplacer::getReplacement(Value *V) {
> + auto Loc = WorkMap.find(V);
> + if (Loc != WorkMap.end())
> + return Loc->second;
> + return nullptr;
> +}
> +
> +void PointerReplacer::replace(Instruction *I) {
> + if (getReplacement(I))
> + return;
> +
> + if (auto *LT = dyn_cast<LoadInst>(I)) {
> + auto *V = getReplacement(LT->getPointerOperand());
> + assert(V && "Operand not replaced");
> + auto *NewI = new LoadInst(V);
> + NewI->takeName(LT);
> + IC.InsertNewInstWith(NewI, *LT);
> + IC.replaceInstUsesWith(*LT, NewI);
> + WorkMap[LT] = NewI;
> + } else if (auto *GEP = dyn_cast<GetElementPtrInst>(I)) {
> + auto *V = getReplacement(GEP->getPointerOperand());
> + assert(V && "Operand not replaced");
> + SmallVector<Value *, 8> Indices;
> + Indices.append(GEP->idx_begin(), GEP->idx_end());
> + auto *NewI = GetElementPtrInst::Create(
> + V->getType()->getPointerElementType(), V, Indices);
> + IC.InsertNewInstWith(NewI, *GEP);
> + NewI->takeName(GEP);
> + WorkMap[GEP] = NewI;
> + } else if (auto *BC = dyn_cast<BitCastInst>(I)) {
> + auto *V = getReplacement(BC->getOperand(0));
> + assert(V && "Operand not replaced");
> + auto *NewT = PointerType::get(BC->getType()->getPointerElementType(),
> + V->getType()->getPointerAddressSpace());
> + auto *NewI = new BitCastInst(V, NewT);
> + IC.InsertNewInstWith(NewI, *BC);
> + NewI->takeName(BC);
> + WorkMap[GEP] = NewI;
> + } else {
> + llvm_unreachable("should never reach here");
> + }
> +}
> +
> +void PointerReplacer::replacePointer(Instruction &I, Value *V) {
> + auto *PT = cast<PointerType>(I.getType());
> + auto *NT = cast<PointerType>(V->getType());
> + assert(PT != NT && PT->getElementType() == NT->getElementType() &&
> + "Invalid usage");
> + WorkMap[&I] = V;
> + findLoadAndReplace(I);
> +}
> +
> Instruction *InstCombiner::visitAllocaInst(AllocaInst &AI) {
> if (auto *I = simplifyAllocaArraySize(*this, AI))
> return I;
> @@ -294,12 +392,22 @@ Instruction *InstCombiner::visitAllocaIn
> for (unsigned i = 0, e = ToDelete.size(); i != e; ++i)
> eraseInstFromFunction(*ToDelete[i]);
> Constant *TheSrc = cast<Constant>(Copy->getSource());
> - Constant *Cast
> - = ConstantExpr::getPointerBitCastOrAddrSpaceCast(TheSrc, AI.getType());
> - Instruction *NewI = replaceInstUsesWith(AI, Cast);
> - eraseInstFromFunction(*Copy);
> - ++NumGlobalCopies;
> - return NewI;
> + auto *SrcTy = TheSrc->getType();
> + auto *DestTy = PointerType::get(AI.getType()->getPointerElementType(),
> + SrcTy->getPointerAddressSpace());
> + Constant *Cast =
> + ConstantExpr::getPointerBitCastOrAddrSpaceCast(TheSrc, DestTy);
> + if (AI.getType()->getPointerAddressSpace() ==
> + SrcTy->getPointerAddressSpace()) {
> + Instruction *NewI = replaceInstUsesWith(AI, Cast);
> + eraseInstFromFunction(*Copy);
> + ++NumGlobalCopies;
> + return NewI;
> + } else {
> + PointerReplacer PtrReplacer(*this);
> + PtrReplacer.replacePointer(AI, Cast);
> + ++NumGlobalCopies;
> + }
> }
> }
> }
>
> Added: llvm/trunk/test/Transforms/InstCombine/memcpy-addrspace.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCom
> bine/memcpy-addrspace.ll?rev=294786&view=auto
> ======================================================================
> ========
> --- llvm/trunk/test/Transforms/InstCombine/memcpy-addrspace.ll (added)
> +++ llvm/trunk/test/Transforms/InstCombine/memcpy-addrspace.ll Fri Feb
> +++ 10 15:46:07 2017
> @@ -0,0 +1,65 @@
> +; RUN: opt < %s -instcombine -S | FileCheck %s
> +
> + at test.data = private unnamed_addr addrspace(2) constant [8 x i32]
> +[i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7], align 4
> +
> +; CHECK-LABEL: test_load
> +; CHECK: %[[GEP:.*]] = getelementptr [8 x i32], [8 x i32]
> +addrspace(2)* @test.data, i64 0, i64 %x ; CHECK: %{{.*}} = load i32,
> +i32 addrspace(2)* %[[GEP]] ; CHECK-NOT: alloca ; CHECK-NOT: call void
> + at llvm.memcpy.p0i8.p2i8.i64 ; CHECK-NOT: addrspacecast ; CHECK-NOT:
> +load i32, i32* define void @test_load(i32 addrspace(1)* %out, i64 %x)
> +{
> +entry:
> + %data = alloca [8 x i32], align 4
> + %0 = bitcast [8 x i32]* %data to i8*
> + call void @llvm.memcpy.p0i8.p2i8.i64(i8* %0, i8 addrspace(2)*
> +bitcast ([8 x i32] addrspace(2)* @test.data to i8 addrspace(2)*), i64
> +32, i32 4, i1 false)
> + %arrayidx = getelementptr inbounds [8 x i32], [8 x i32]* %data, i64
> +0, i64 %x
> + %1 = load i32, i32* %arrayidx, align 4
> + %arrayidx1 = getelementptr inbounds i32, i32 addrspace(1)* %out,
> +i64 %x
> + store i32 %1, i32 addrspace(1)* %arrayidx1, align 4
> + ret void
> +}
> +
> +; CHECK-LABEL: test_call
> +; CHECK: alloca
> +; CHECK: call void @llvm.memcpy.p0i8.p2i8.i64 ; CHECK-NOT:
> +addrspacecast ; CHECK: call i32 @foo(i32* %{{.*}}) define void
> + at test_call(i32 addrspace(1)* %out, i64 %x) {
> +entry:
> + %data = alloca [8 x i32], align 4
> + %0 = bitcast [8 x i32]* %data to i8*
> + call void @llvm.memcpy.p0i8.p2i8.i64(i8* %0, i8 addrspace(2)*
> +bitcast ([8 x i32] addrspace(2)* @test.data to i8 addrspace(2)*), i64
> +32, i32 4, i1 false)
> + %arrayidx = getelementptr inbounds [8 x i32], [8 x i32]* %data, i64
> +0, i64 %x
> + %1 = call i32 @foo(i32* %arrayidx)
> + %arrayidx1 = getelementptr inbounds i32, i32 addrspace(1)* %out,
> +i64 %x
> + store i32 %1, i32 addrspace(1)* %arrayidx1, align 4
> + ret void
> +}
> +
> +; CHECK-LABEL: test_load_and_call
> +; CHECK: alloca
> +; CHECK: call void @llvm.memcpy.p0i8.p2i8.i64 ; CHECK: load i32, i32*
> +%{{.*}} ; CHECK: call i32 @foo(i32* %{{.*}}) ; CHECK-NOT:
> +addrspacecast ; CHECK-NOT: load i32, i32 addrspace(2)* define void
> + at test_load_and_call(i32 addrspace(1)* %out, i64 %x, i64 %y) {
> +entry:
> + %data = alloca [8 x i32], align 4
> + %0 = bitcast [8 x i32]* %data to i8*
> + call void @llvm.memcpy.p0i8.p2i8.i64(i8* %0, i8 addrspace(2)*
> +bitcast ([8 x i32] addrspace(2)* @test.data to i8 addrspace(2)*), i64
> +32, i32 4, i1 false)
> + %arrayidx = getelementptr inbounds [8 x i32], [8 x i32]* %data, i64
> +0, i64 %x
> + %1 = load i32, i32* %arrayidx, align 4
> + %arrayidx1 = getelementptr inbounds i32, i32 addrspace(1)* %out,
> +i64 %x
> + store i32 %1, i32 addrspace(1)* %arrayidx1, align 4
> + %2 = call i32 @foo(i32* %arrayidx)
> + %arrayidx2 = getelementptr inbounds i32, i32 addrspace(1)* %out,
> +i64 %y
> + store i32 %2, i32 addrspace(1)* %arrayidx2, align 4
> + ret void
> +}
> +
> +
> +declare void @llvm.memcpy.p0i8.p2i8.i64(i8* nocapture writeonly, i8
> +addrspace(2)* nocapture readonly, i64, i32, i1) declare i32 @foo(i32*
> +%x)
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list