[llvm] r284311 - [GVN/PRE] Hoist global values outside of loops.
Davide Italiano via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 20 18:35:03 PDT 2016
I can revert this patch if you prefer.
On Oct 20, 2016 6:24 PM, "Philip Reames" <listmail at philipreames.com> wrote:
> Was there a review thread for this? If so, I missed it, but would have
> made the following comments.
>
> Overall, this patch mixes functional and stylistic changes. The later
> should have been strictly separate and committed separately. Trying to
> review this code now is more difficult than it should be.
>
> I'm unclear that this patch is overall a good idea. I get that it's
> *correct*, but it's a break from our current philosophy around hoisting and
> sinking that needs discussed. If this discussion has happened, please
> provide a reference so I can catch up on the thread.
>
>
> On 10/15/2016 02:35 PM, Davide Italiano via llvm-commits wrote:
>
>> Author: davide
>> Date: Sat Oct 15 16:35:23 2016
>> New Revision: 284311
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=284311&view=rev
>> Log:
>> [GVN/PRE] Hoist global values outside of loops.
>>
>> In theory this could be generalized to move anything where
>> we prove the operands are available, but that would require
>> rewriting PRE. As NewGVN will hopefully come soon, and we're
>> trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably
>> not worth spending too much time on it. Fix provided by
>> Daniel Berlin!
>>
>> Added:
>> llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>> Modified:
>> llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transform
>> s/Scalar/GVN.cpp?rev=284311&r1=284310&r2=284311&view=diff
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Sat Oct 15 16:35:23 2016
>> @@ -122,69 +122,84 @@ template <> struct DenseMapInfo<GVN::Exp
>> /// location of the instruction from which it was formed.
>> struct llvm::gvn::AvailableValue {
>> enum ValType {
>> - SimpleVal, // A simple offsetted value that is accessed.
>> - LoadVal, // A value produced by a load.
>> - MemIntrin, // A memory intrinsic which is loaded from.
>> - UndefVal // A UndefValue representing a value from dead block
>> (which
>> - // is not yet physically removed from the CFG).
>> + SimpleVal, // A simple offsetted value that is accessed.
>> + LoadVal, // A value produced by a load.
>> + MemIntrinVal, // A memory intrinsic which is loaded from.
>> + UndefVal, // A UndefValue representing a value from dead block
>> (which
>> + // is not yet physically removed from the CFG).
>> + CreateLoadVal // A duplicate load can be created higher up in the
>> CFG that
>> + // will eliminate this one.
>> };
>> /// V - The value that is live out of the block.
>> - PointerIntPair<Value *, 2, ValType> Val;
>> + std::pair<Value *, ValType> Val;
>>
> Why not use a PointerIntPair<Value *, 3>? I'm pretty sure our Value's are
> at least 8 byte aligned?
>
> Changing this, and the related code churn, should have been done
> separately.
>
>> /// Offset - The byte offset in Val that is interesting for the
>> load query.
>> unsigned Offset;
>> static AvailableValue get(Value *V, unsigned Offset = 0) {
>> AvailableValue Res;
>> - Res.Val.setPointer(V);
>> - Res.Val.setInt(SimpleVal);
>> + Res.Val.first = V;
>> + Res.Val.second = SimpleVal;
>> Res.Offset = Offset;
>> return Res;
>> }
>> static AvailableValue getMI(MemIntrinsic *MI, unsigned Offset = 0) {
>> AvailableValue Res;
>> - Res.Val.setPointer(MI);
>> - Res.Val.setInt(MemIntrin);
>> + Res.Val.first = MI;
>> + Res.Val.second = MemIntrinVal;
>> Res.Offset = Offset;
>> return Res;
>> }
>> + static AvailableValue getCreateLoad(LoadInst *LI) {
>> + AvailableValue Res;
>> + Res.Val.first = LI;
>> + Res.Val.second = CreateLoadVal;
>> + return Res;
>> + }
>> +
>> static AvailableValue getLoad(LoadInst *LI, unsigned Offset = 0) {
>> AvailableValue Res;
>> - Res.Val.setPointer(LI);
>> - Res.Val.setInt(LoadVal);
>> + Res.Val.first = LI;
>> + Res.Val.second = LoadVal;
>> Res.Offset = Offset;
>> return Res;
>> }
>> static AvailableValue getUndef() {
>> AvailableValue Res;
>> - Res.Val.setPointer(nullptr);
>> - Res.Val.setInt(UndefVal);
>> + Res.Val.first = nullptr;
>> + Res.Val.second = UndefVal;
>> Res.Offset = 0;
>> return Res;
>> }
>> - bool isSimpleValue() const { return Val.getInt() == SimpleVal; }
>> - bool isCoercedLoadValue() const { return Val.getInt() == LoadVal; }
>> - bool isMemIntrinValue() const { return Val.getInt() == MemIntrin; }
>> - bool isUndefValue() const { return Val.getInt() == UndefVal; }
>> + bool isSimpleValue() const { return Val.second == SimpleVal; }
>> + bool isCoercedLoadValue() const { return Val.second == LoadVal; }
>> + bool isMemIntrinValue() const { return Val.second == MemIntrinVal; }
>> + bool isUndefValue() const { return Val.second == UndefVal; }
>> + bool isCreateLoadValue() const { return Val.second == CreateLoadVal; }
>> +
>> + LoadInst *getCreateLoadValue() const {
>> + assert(isCreateLoadValue() && "Wrong accessor");
>> + return cast<LoadInst>(Val.first);
>> + }
>> Value *getSimpleValue() const {
>> assert(isSimpleValue() && "Wrong accessor");
>> - return Val.getPointer();
>> + return Val.first;
>> }
>> LoadInst *getCoercedLoadValue() const {
>> assert(isCoercedLoadValue() && "Wrong accessor");
>> - return cast<LoadInst>(Val.getPointer());
>> + return cast<LoadInst>(Val.first);
>> }
>> MemIntrinsic *getMemIntrinValue() const {
>> assert(isMemIntrinValue() && "Wrong accessor");
>> - return cast<MemIntrinsic>(Val.getPointer());
>> + return cast<MemIntrinsic>(Val.first);
>> }
>> /// Emit code at the specified insertion point to adjust the value
>> defined
>> @@ -1191,7 +1206,11 @@ Value *AvailableValue::MaterializeAdjust
>> Value *Res;
>> Type *LoadTy = LI->getType();
>> const DataLayout &DL = LI->getModule()->getDataLayout();
>> - if (isSimpleValue()) {
>> + if (isCreateLoadValue()) {
>> + Instruction *I = getCreateLoadValue()->clone();
>> + I->insertBefore(InsertPt);
>> + Res = I;
>>
> This case is rare, please leave the simpleValue case first and put this at
> the bottom of the chain.
>
>> + } else if (isSimpleValue()) {
>> Res = getSimpleValue();
>> if (Res->getType() != LoadTy) {
>> Res = GetStoreValueForLoad(Res, Offset, LoadTy, InsertPt, DL);
>> @@ -1379,7 +1398,7 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>> continue;
>> }
>> - if (!DepInfo.isDef() && !DepInfo.isClobber()) {
>> + if (!DepInfo.isDef() && !DepInfo.isClobber() &&
>> !DepInfo.isNonFuncLocal()) {
>> UnavailableBlocks.push_back(DepBB);
>> continue;
>> }
>> @@ -1390,12 +1409,25 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>> Value *Address = Deps[i].getAddress();
>> AvailableValue AV;
>> - if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>> + // TODO: We can use anything where the operands are available, and
>> we should
>> + // learn to recreate operands in other blocks if they are available.
>> + // Because we don't have the infrastructure in our PRE, we restrict
>> this to
>> + // global values, because we know the operands are always available.
>> + if (DepInfo.isNonFuncLocal()) {
>> + if (isSafeToSpeculativelyExecute(LI) &&
>> + isa<GlobalValue>(LI->getPointerOperand())) {
>>
> This is overly specific. This also works for function arguments, etc..
>
>> + AV = AvailableValue::getCreateLoad(LI);
>> + ValuesPerBlock.push_back(AvailableValueInBlock::get(
>> + &LI->getParent()->getParent()->getEntryBlock(),
>> std::move(AV)));
>>
> The entry block may be a horrible place to put this load. How was this
> heuristic evaluated?
>
>> + } else
>> + UnavailableBlocks.push_back(DepBB);
>> +
>> + } else if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>> // subtlety: because we know this was a non-local dependency, we
>> know
>> // it's safe to materialize anywhere between the instruction
>> within
>> // DepInfo and the end of it's block.
>> - ValuesPerBlock.push_back(AvailableValueInBlock::get(DepBB,
>> -
>> std::move(AV)));
>> + ValuesPerBlock.push_back(
>> + AvailableValueInBlock::get(DepBB, std::move(AV)));
>> } else {
>> UnavailableBlocks.push_back(DepBB);
>> }
>>
>> Added: llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transfor
>> ms/GVN/PRE/hoist-loads.ll?rev=284311&view=auto
>> ============================================================
>> ==================
>> --- llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll (added)
>> +++ llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll Sat Oct 15
>> 16:35:23 2016
>> @@ -0,0 +1,53 @@
>> +; RUN: opt -gvn %s -S -o - | FileCheck %s
>> +
>> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>> +target triple = "x86_64-unknown-linux-gnu"
>> +
>> +; Check that the loads of @aaa, @bbb and @ccc are hoisted.
>> +; CHECK-LABEL: define void @foo
>> +; CHECK-NEXT: %2 = load i32, i32* @ccc, align 4
>> +; CHECK-NEXT: %3 = load i32, i32* @bbb, align 4
>> +; CHECK-NEXT: %4 = load i32, i32* @aaa, align 4
>> +
>> + at aaa = local_unnamed_addr global i32 10, align 4
>> + at bbb = local_unnamed_addr global i32 20, align 4
>> + at ccc = local_unnamed_addr global i32 30, align 4
>> +
>> +define void @foo(i32* nocapture readonly) local_unnamed_addr {
>> + br label %2
>> +
>> + %.0 = phi i32* [ %0, %1 ], [ %3, %22 ]
>> + %3 = getelementptr inbounds i32, i32* %.0, i64 1
>> + %4 = load i32, i32* %.0, align 4
>> + %5 = load i32, i32* @ccc, align 4
>> + %6 = icmp sgt i32 %4, %5
>> + br i1 %6, label %7, label %10
>> +
>> + %8 = load i32, i32* @bbb, align 4
>> + %9 = add nsw i32 %8, %4
>> + store i32 %9, i32* @bbb, align 4
>> + br label %10
>> +
>> + %11 = load i32, i32* @bbb, align 4
>> + %12 = icmp sgt i32 %4, %11
>> + br i1 %12, label %13, label %16
>> +
>> + %14 = load i32, i32* @aaa, align 4
>> + %15 = add nsw i32 %14, %4
>> + store i32 %15, i32* @aaa, align 4
>> + br label %16
>> +
>> + %17 = load i32, i32* @aaa, align 4
>> + %18 = icmp sgt i32 %4, %17
>> + br i1 %18, label %19, label %22
>> +
>> + %20 = load i32, i32* @ccc, align 4
>> + %21 = add nsw i32 %20, %4
>> + store i32 %21, i32* @ccc, align 4
>> + br label %22
>> +
>> + %23 = icmp slt i32 %4, 0
>> + br i1 %23, label %24, label %2
>> +
>> + ret void
>> +}
>>
> Please add several more test cases which exercise the trivial case, a case
> where we *can't* hoist, and a case where we can hoist because the load is
> explicitly marked invariant.
>
>>
>> Modified: llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transfor
>> ms/GVN/PRE/pre-single-pred.ll?rev=284311&r1=284310&r2=284311&view=diff
>> ============================================================
>> ==================
>> --- llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll (original)
>> +++ llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll Sat Oct 15
>> 16:35:23 2016
>> @@ -1,16 +1,9 @@
>> ; RUN: opt < %s -gvn -enable-load-pre -S | FileCheck %s
>> -; This testcase assumed we'll PRE the load into %for.cond, but we don't
>> actually
>> -; verify that doing so is safe. If there didn't _happen_ to be a load in
>> -; %for.end, we would actually be lengthening the execution on some
>> paths, and
>> -; we were never actually checking that case. Now we actually do perform
>> some
>> -; conservative checking to make sure we don't make paths longer, but we
>> don't
>> -; currently get this case, which we got lucky on previously.
>> -;
>> -; Now that that faulty assumption is corrected, test that we DON'T
>> incorrectly
>> -; hoist the load. Doing the right thing for the wrong reasons is still
>> a bug.
>> @p = external global i32
>> define i32 @f(i32 %n) nounwind {
>> +; CHECK: entry:
>> +; CHECK-NEXT: %0 = load i32, i32* @p
>> entry:
>> br label %for.cond
>> @@ -22,8 +15,9 @@ for.cond: ; preds = %for.inc, %entry
>> for.cond.for.end_crit_edge: ; preds = %for.cond
>> br label %for.end
>> +; The load of @p should be hoisted into the entry block.
>> ; CHECK: for.body:
>> -; CHECK-NEXT: %tmp3 = load i32, i32* @p
>> +; CHECK-NEXT: %dec = add i32 %tmp3, -1
>> for.body: ; preds = %for.cond
>> %tmp3 = load i32, i32* @p ; <i32> [#uses=1]
>> %dec = add i32 %tmp3, -1 ; <i32> [#uses=2]
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161020/c72d6489/attachment.html>
More information about the llvm-commits
mailing list