[llvm] r284311 - [GVN/PRE] Hoist global values outside of loops.
Davide Italiano via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 20 18:46:29 PDT 2016
Reverted in r284796.
On Thu, Oct 20, 2016 at 6:35 PM, Davide Italiano <davide at freebsd.org> wrote:
> I can revert this patch if you prefer.
>
>
> On Oct 20, 2016 6:24 PM, "Philip Reames" <listmail at philipreames.com> wrote:
>>
>> Was there a review thread for this? If so, I missed it, but I would have
>> made the following comments.
>>
>> Overall, this patch mixes functional and stylistic changes. The latter
>> should have been kept strictly separate and committed on their own. Trying
>> to review this code now is more difficult than it should be.
>>
>> I'm not convinced this patch is a good idea overall. I get that it's
>> *correct*, but it's a break from our current philosophy around hoisting and
>> sinking that needs to be discussed. If this discussion has happened, please
>> provide a reference so I can catch up on the thread.
>>
>>
>> On 10/15/2016 02:35 PM, Davide Italiano via llvm-commits wrote:
>>>
>>> Author: davide
>>> Date: Sat Oct 15 16:35:23 2016
>>> New Revision: 284311
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=284311&view=rev
>>> Log:
>>> [GVN/PRE] Hoist global values outside of loops.
>>>
>>> In theory this could be generalized to move anything where
>>> we prove the operands are available, but that would require
>>> rewriting PRE. As NewGVN will hopefully come soon, and we're
>>> trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably
>>> not worth spending too much time on it. Fix provided by
>>> Daniel Berlin!
>>>
>>> Added:
>>> llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>>> Modified:
>>> llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>>> llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>>>
>>> Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=284311&r1=284310&r2=284311&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Sat Oct 15 16:35:23 2016
>>> @@ -122,69 +122,84 @@ template <> struct DenseMapInfo<GVN::Exp
>>> /// location of the instruction from which it was formed.
>>> struct llvm::gvn::AvailableValue {
>>> enum ValType {
>>> - SimpleVal, // A simple offsetted value that is accessed.
>>> - LoadVal, // A value produced by a load.
>>> - MemIntrin, // A memory intrinsic which is loaded from.
>>> -    UndefVal      // A UndefValue representing a value from dead block (which
>>> -                  // is not yet physically removed from the CFG).
>>> + SimpleVal, // A simple offsetted value that is accessed.
>>> + LoadVal, // A value produced by a load.
>>> + MemIntrinVal, // A memory intrinsic which is loaded from.
>>> +    UndefVal,     // A UndefValue representing a value from dead block (which
>>> +                  // is not yet physically removed from the CFG).
>>> +    CreateLoadVal // A duplicate load can be created higher up in the CFG that
>>> +                  // will eliminate this one.
>>> };
>>> /// V - The value that is live out of the block.
>>> - PointerIntPair<Value *, 2, ValType> Val;
>>> + std::pair<Value *, ValType> Val;
>>
>> Why not use a PointerIntPair<Value *, 3>? I'm pretty sure our Values are
>> at least 8-byte aligned.
>>
>> Changing this, and the related code churn, should have been done
>> separately.
>>>
>>> /// Offset - The byte offset in Val that is interesting for the load query.
>>> unsigned Offset;
>>> static AvailableValue get(Value *V, unsigned Offset = 0) {
>>> AvailableValue Res;
>>> - Res.Val.setPointer(V);
>>> - Res.Val.setInt(SimpleVal);
>>> + Res.Val.first = V;
>>> + Res.Val.second = SimpleVal;
>>> Res.Offset = Offset;
>>> return Res;
>>> }
>>> static AvailableValue getMI(MemIntrinsic *MI, unsigned Offset = 0) {
>>> AvailableValue Res;
>>> - Res.Val.setPointer(MI);
>>> - Res.Val.setInt(MemIntrin);
>>> + Res.Val.first = MI;
>>> + Res.Val.second = MemIntrinVal;
>>> Res.Offset = Offset;
>>> return Res;
>>> }
>>> + static AvailableValue getCreateLoad(LoadInst *LI) {
>>> + AvailableValue Res;
>>> + Res.Val.first = LI;
>>> + Res.Val.second = CreateLoadVal;
>>> + return Res;
>>> + }
>>> +
>>> static AvailableValue getLoad(LoadInst *LI, unsigned Offset = 0) {
>>> AvailableValue Res;
>>> - Res.Val.setPointer(LI);
>>> - Res.Val.setInt(LoadVal);
>>> + Res.Val.first = LI;
>>> + Res.Val.second = LoadVal;
>>> Res.Offset = Offset;
>>> return Res;
>>> }
>>> static AvailableValue getUndef() {
>>> AvailableValue Res;
>>> - Res.Val.setPointer(nullptr);
>>> - Res.Val.setInt(UndefVal);
>>> + Res.Val.first = nullptr;
>>> + Res.Val.second = UndefVal;
>>> Res.Offset = 0;
>>> return Res;
>>> }
>>> - bool isSimpleValue() const { return Val.getInt() == SimpleVal; }
>>> - bool isCoercedLoadValue() const { return Val.getInt() == LoadVal; }
>>> - bool isMemIntrinValue() const { return Val.getInt() == MemIntrin; }
>>> - bool isUndefValue() const { return Val.getInt() == UndefVal; }
>>> + bool isSimpleValue() const { return Val.second == SimpleVal; }
>>> + bool isCoercedLoadValue() const { return Val.second == LoadVal; }
>>> + bool isMemIntrinValue() const { return Val.second == MemIntrinVal; }
>>> + bool isUndefValue() const { return Val.second == UndefVal; }
>>> + bool isCreateLoadValue() const { return Val.second == CreateLoadVal; }
>>> +
>>> + LoadInst *getCreateLoadValue() const {
>>> + assert(isCreateLoadValue() && "Wrong accessor");
>>> + return cast<LoadInst>(Val.first);
>>> + }
>>> Value *getSimpleValue() const {
>>> assert(isSimpleValue() && "Wrong accessor");
>>> - return Val.getPointer();
>>> + return Val.first;
>>> }
>>> LoadInst *getCoercedLoadValue() const {
>>> assert(isCoercedLoadValue() && "Wrong accessor");
>>> - return cast<LoadInst>(Val.getPointer());
>>> + return cast<LoadInst>(Val.first);
>>> }
>>> MemIntrinsic *getMemIntrinValue() const {
>>> assert(isMemIntrinValue() && "Wrong accessor");
>>> - return cast<MemIntrinsic>(Val.getPointer());
>>> + return cast<MemIntrinsic>(Val.first);
>>> }
>>> /// Emit code at the specified insertion point to adjust the value defined
>>> @@ -1191,7 +1206,11 @@ Value *AvailableValue::MaterializeAdjust
>>> Value *Res;
>>> Type *LoadTy = LI->getType();
>>> const DataLayout &DL = LI->getModule()->getDataLayout();
>>> - if (isSimpleValue()) {
>>> + if (isCreateLoadValue()) {
>>> + Instruction *I = getCreateLoadValue()->clone();
>>> + I->insertBefore(InsertPt);
>>> + Res = I;
>>
>> This case is rare, please leave the simpleValue case first and put this at
>> the bottom of the chain.
>>>
>>> + } else if (isSimpleValue()) {
>>> Res = getSimpleValue();
>>> if (Res->getType() != LoadTy) {
>>> Res = GetStoreValueForLoad(Res, Offset, LoadTy, InsertPt, DL);
>>> @@ -1379,7 +1398,7 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>>> continue;
>>> }
>>> - if (!DepInfo.isDef() && !DepInfo.isClobber()) {
>>> +  if (!DepInfo.isDef() && !DepInfo.isClobber() &&
>>> +      !DepInfo.isNonFuncLocal()) {
>>> UnavailableBlocks.push_back(DepBB);
>>> continue;
>>> }
>>> @@ -1390,12 +1409,25 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>>> Value *Address = Deps[i].getAddress();
>>> AvailableValue AV;
>>> - if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>>> +  // TODO: We can use anything where the operands are available, and we should
>>> +  // learn to recreate operands in other blocks if they are available.
>>> +  // Because we don't have the infrastructure in our PRE, we restrict this to
>>> +  // global values, because we know the operands are always available.
>>> + if (DepInfo.isNonFuncLocal()) {
>>> + if (isSafeToSpeculativelyExecute(LI) &&
>>> + isa<GlobalValue>(LI->getPointerOperand())) {
>>
>> This is overly specific. This also works for function arguments, etc.
>>>
>>> + AV = AvailableValue::getCreateLoad(LI);
>>> +      ValuesPerBlock.push_back(AvailableValueInBlock::get(
>>> +          &LI->getParent()->getParent()->getEntryBlock(), std::move(AV)));
>>
>> The entry block may be a horrible place to put this load. How was this
>> heuristic evaluated?
>>>
>>> + } else
>>> + UnavailableBlocks.push_back(DepBB);
>>> +
>>> + } else if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>>> // subtlety: because we know this was a non-local dependency, we know
>>> // it's safe to materialize anywhere between the instruction within
>>> // DepInfo and the end of it's block.
>>> -    ValuesPerBlock.push_back(AvailableValueInBlock::get(DepBB,
>>> -                                                        std::move(AV)));
>>> +    ValuesPerBlock.push_back(
>>> +        AvailableValueInBlock::get(DepBB, std::move(AV)));
>>> } else {
>>> UnavailableBlocks.push_back(DepBB);
>>> }
>>>
>>> Added: llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll?rev=284311&view=auto
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll (added)
>>> +++ llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll Sat Oct 15 16:35:23 2016
>>> @@ -0,0 +1,53 @@
>>> +; RUN: opt -gvn %s -S -o - | FileCheck %s
>>> +
>>> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>> +target triple = "x86_64-unknown-linux-gnu"
>>> +
>>> +; Check that the loads of @aaa, @bbb and @ccc are hoisted.
>>> +; CHECK-LABEL: define void @foo
>>> +; CHECK-NEXT: %2 = load i32, i32* @ccc, align 4
>>> +; CHECK-NEXT: %3 = load i32, i32* @bbb, align 4
>>> +; CHECK-NEXT: %4 = load i32, i32* @aaa, align 4
>>> +
>>> + at aaa = local_unnamed_addr global i32 10, align 4
>>> + at bbb = local_unnamed_addr global i32 20, align 4
>>> + at ccc = local_unnamed_addr global i32 30, align 4
>>> +
>>> +define void @foo(i32* nocapture readonly) local_unnamed_addr {
>>> + br label %2
>>> +
>>> + %.0 = phi i32* [ %0, %1 ], [ %3, %22 ]
>>> + %3 = getelementptr inbounds i32, i32* %.0, i64 1
>>> + %4 = load i32, i32* %.0, align 4
>>> + %5 = load i32, i32* @ccc, align 4
>>> + %6 = icmp sgt i32 %4, %5
>>> + br i1 %6, label %7, label %10
>>> +
>>> + %8 = load i32, i32* @bbb, align 4
>>> + %9 = add nsw i32 %8, %4
>>> + store i32 %9, i32* @bbb, align 4
>>> + br label %10
>>> +
>>> + %11 = load i32, i32* @bbb, align 4
>>> + %12 = icmp sgt i32 %4, %11
>>> + br i1 %12, label %13, label %16
>>> +
>>> + %14 = load i32, i32* @aaa, align 4
>>> + %15 = add nsw i32 %14, %4
>>> + store i32 %15, i32* @aaa, align 4
>>> + br label %16
>>> +
>>> + %17 = load i32, i32* @aaa, align 4
>>> + %18 = icmp sgt i32 %4, %17
>>> + br i1 %18, label %19, label %22
>>> +
>>> + %20 = load i32, i32* @ccc, align 4
>>> + %21 = add nsw i32 %20, %4
>>> + store i32 %21, i32* @ccc, align 4
>>> + br label %22
>>> +
>>> + %23 = icmp slt i32 %4, 0
>>> + br i1 %23, label %24, label %2
>>> +
>>> + ret void
>>> +}
>>
>> Please add several more test cases which exercise the trivial case, a case
>> where we *can't* hoist, and a case where we can hoist because the load is
>> explicitly marked invariant.
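For the invariant case, a rough sketch of what such a test might look like (untested; names and the CHECK lines a real test would need are hypothetical, but `!invariant.load` is the standard metadata marking a load from memory that never changes after it becomes observable):

```llvm
; Sketch: the load is explicitly marked invariant, so it should be
; hoistable out of the loop without reasoning about intervening stores.
@g = external global i32

define i32 @inv(i32 %n) {
entry:
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %v = load i32, i32* @g, align 4, !invariant.load !0
  %i.next = add i32 %i, 1
  %c = icmp slt i32 %i.next, %n
  br i1 %c, label %loop, label %exit

exit:
  ret i32 %v
}

!0 = !{}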
>>>
>>>
>>> Modified: llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll?rev=284311&r1=284310&r2=284311&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll (original)
>>> +++ llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll Sat Oct 15 16:35:23 2016
>>> @@ -1,16 +1,9 @@
>>> ; RUN: opt < %s -gvn -enable-load-pre -S | FileCheck %s
>>> -; This testcase assumed we'll PRE the load into %for.cond, but we don't actually
>>> -; verify that doing so is safe. If there didn't _happen_ to be a load in
>>> -; %for.end, we would actually be lengthening the execution on some paths, and
>>> -; we were never actually checking that case. Now we actually do perform some
>>> -; conservative checking to make sure we don't make paths longer, but we don't
>>> -; currently get this case, which we got lucky on previously.
>>> -;
>>> -; Now that that faulty assumption is corrected, test that we DON'T incorrectly
>>> -; hoist the load. Doing the right thing for the wrong reasons is still a bug.
>>> @p = external global i32
>>> define i32 @f(i32 %n) nounwind {
>>> +; CHECK: entry:
>>> +; CHECK-NEXT: %0 = load i32, i32* @p
>>> entry:
>>> br label %for.cond
>>> @@ -22,8 +15,9 @@ for.cond: ; preds = %for.inc, %entry
>>> for.cond.for.end_crit_edge: ; preds = %for.cond
>>> br label %for.end
>>> +; The load of @p should be hoisted into the entry block.
>>> ; CHECK: for.body:
>>> -; CHECK-NEXT: %tmp3 = load i32, i32* @p
>>> +; CHECK-NEXT: %dec = add i32 %tmp3, -1
>>> for.body: ; preds = %for.cond
>>> %tmp3 = load i32, i32* @p ; <i32> [#uses=1]
>>> %dec = add i32 %tmp3, -1 ; <i32> [#uses=2]
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>
>
--
Davide
"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare