[llvm] r284311 - [GVN/PRE] Hoist global values outside of loops.

Davide Italiano via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 20 18:35:03 PDT 2016


I can revert this patch if you prefer.

On Oct 20, 2016 6:24 PM, "Philip Reames" <listmail at philipreames.com> wrote:

> Was there a review thread for this?  If so, I missed it, but would have
> made the following comments.
>
> Overall, this patch mixes functional and stylistic changes.  The latter
> should have been kept strictly separate and committed on their own.  Trying to
> review this code now is more difficult than it should be.
>
> I'm not convinced this patch is overall a good idea.  I get that it's
> *correct*, but it's a break from our current philosophy around hoisting and
> sinking that needs to be discussed.  If this discussion has happened, please
> provide a reference so I can catch up on the thread.
>
>
> On 10/15/2016 02:35 PM, Davide Italiano via llvm-commits wrote:
>
>> Author: davide
>> Date: Sat Oct 15 16:35:23 2016
>> New Revision: 284311
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=284311&view=rev
>> Log:
>> [GVN/PRE] Hoist global values outside of loops.
>>
>> In theory this could be generalized to move anything where
>> we prove the operands are available, but that would require
>> rewriting PRE. As NewGVN will hopefully come soon, and we're
>> trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably
>> not worth spending too much time on it. Fix provided by
>> Daniel Berlin!
>>
>> Added:
>>      llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>> Modified:
>>      llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>>      llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=284311&r1=284310&r2=284311&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Sat Oct 15 16:35:23 2016
>> @@ -122,69 +122,84 @@ template <> struct DenseMapInfo<GVN::Exp
>>   /// location of the instruction from which it was formed.
>>   struct llvm::gvn::AvailableValue {
>>     enum ValType {
>> -    SimpleVal, // A simple offsetted value that is accessed.
>> -    LoadVal,   // A value produced by a load.
>> -    MemIntrin, // A memory intrinsic which is loaded from.
>> -    UndefVal   // A UndefValue representing a value from dead block (which
>> -               // is not yet physically removed from the CFG).
>> +    SimpleVal,    // A simple offsetted value that is accessed.
>> +    LoadVal,      // A value produced by a load.
>> +    MemIntrinVal, // A memory intrinsic which is loaded from.
>> +    UndefVal,     // A UndefValue representing a value from dead block (which
>> +                  // is not yet physically removed from the CFG).
>> +    CreateLoadVal // A duplicate load can be created higher up in the CFG that
>> +                  // will eliminate this one.
>>     };
>>
>>   /// V - The value that is live out of the block.
>> -  PointerIntPair<Value *, 2, ValType> Val;
>> +  std::pair<Value *, ValType> Val;
>>
> Why not use a PointerIntPair<Value *, 3>?  I'm pretty sure our Values are
> at least 8-byte aligned?
>
> Changing this, and the related code churn, should have been done
> separately.
>
>>
>>   /// Offset - The byte offset in Val that is interesting for the load query.
>>   unsigned Offset;
>>
>>   static AvailableValue get(Value *V, unsigned Offset = 0) {
>>       AvailableValue Res;
>> -    Res.Val.setPointer(V);
>> -    Res.Val.setInt(SimpleVal);
>> +    Res.Val.first = V;
>> +    Res.Val.second = SimpleVal;
>>       Res.Offset = Offset;
>>       return Res;
>>     }
>>
>>   static AvailableValue getMI(MemIntrinsic *MI, unsigned Offset = 0) {
>>       AvailableValue Res;
>> -    Res.Val.setPointer(MI);
>> -    Res.Val.setInt(MemIntrin);
>> +    Res.Val.first = MI;
>> +    Res.Val.second = MemIntrinVal;
>>       Res.Offset = Offset;
>>       return Res;
>>     }
>>
>> +  static AvailableValue getCreateLoad(LoadInst *LI) {
>> +    AvailableValue Res;
>> +    Res.Val.first = LI;
>> +    Res.Val.second = CreateLoadVal;
>> +    return Res;
>> +  }
>> +
>>     static AvailableValue getLoad(LoadInst *LI, unsigned Offset = 0) {
>>       AvailableValue Res;
>> -    Res.Val.setPointer(LI);
>> -    Res.Val.setInt(LoadVal);
>> +    Res.Val.first = LI;
>> +    Res.Val.second = LoadVal;
>>       Res.Offset = Offset;
>>       return Res;
>>     }
>>
>>   static AvailableValue getUndef() {
>>       AvailableValue Res;
>> -    Res.Val.setPointer(nullptr);
>> -    Res.Val.setInt(UndefVal);
>> +    Res.Val.first = nullptr;
>> +    Res.Val.second = UndefVal;
>>       Res.Offset = 0;
>>       return Res;
>>     }
>>
>> -  bool isSimpleValue() const { return Val.getInt() == SimpleVal; }
>> -  bool isCoercedLoadValue() const { return Val.getInt() == LoadVal; }
>> -  bool isMemIntrinValue() const { return Val.getInt() == MemIntrin; }
>> -  bool isUndefValue() const { return Val.getInt() == UndefVal; }
>> +  bool isSimpleValue() const { return Val.second == SimpleVal; }
>> +  bool isCoercedLoadValue() const { return Val.second == LoadVal; }
>> +  bool isMemIntrinValue() const { return Val.second == MemIntrinVal; }
>> +  bool isUndefValue() const { return Val.second == UndefVal; }
>> +  bool isCreateLoadValue() const { return Val.second == CreateLoadVal; }
>> +
>> +  LoadInst *getCreateLoadValue() const {
>> +    assert(isCreateLoadValue() && "Wrong accessor");
>> +    return cast<LoadInst>(Val.first);
>> +  }
>>
>>   Value *getSimpleValue() const {
>>       assert(isSimpleValue() && "Wrong accessor");
>> -    return Val.getPointer();
>> +    return Val.first;
>>     }
>>
>>   LoadInst *getCoercedLoadValue() const {
>>       assert(isCoercedLoadValue() && "Wrong accessor");
>> -    return cast<LoadInst>(Val.getPointer());
>> +    return cast<LoadInst>(Val.first);
>>     }
>>
>>   MemIntrinsic *getMemIntrinValue() const {
>>       assert(isMemIntrinValue() && "Wrong accessor");
>> -    return cast<MemIntrinsic>(Val.getPointer());
>> +    return cast<MemIntrinsic>(Val.first);
>>     }
>>
>>   /// Emit code at the specified insertion point to adjust the value defined
>> @@ -1191,7 +1206,11 @@ Value *AvailableValue::MaterializeAdjust
>>     Value *Res;
>>     Type *LoadTy = LI->getType();
>>     const DataLayout &DL = LI->getModule()->getDataLayout();
>> -  if (isSimpleValue()) {
>> +  if (isCreateLoadValue()) {
>> +    Instruction *I = getCreateLoadValue()->clone();
>> +    I->insertBefore(InsertPt);
>> +    Res = I;
>>
> This case is rare, please leave the simpleValue case first and put this at
> the bottom of the chain.
>
>> +  } else if (isSimpleValue()) {
>>       Res = getSimpleValue();
>>       if (Res->getType() != LoadTy) {
>>         Res = GetStoreValueForLoad(Res, Offset, LoadTy, InsertPt, DL);
>> @@ -1379,7 +1398,7 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>>         continue;
>>       }
>>
>> -    if (!DepInfo.isDef() && !DepInfo.isClobber()) {
>> +    if (!DepInfo.isDef() && !DepInfo.isClobber() &&
>> +        !DepInfo.isNonFuncLocal()) {
>>         UnavailableBlocks.push_back(DepBB);
>>         continue;
>>       }
>> @@ -1390,12 +1409,25 @@ void GVN::AnalyzeLoadAvailability(LoadIn
>>     Value *Address = Deps[i].getAddress();
>>
>>     AvailableValue AV;
>> -    if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>> +    // TODO: We can use anything where the operands are available, and we should
>> +    // learn to recreate operands in other blocks if they are available.
>> +    // Because we don't have the infrastructure in our PRE, we restrict this to
>> +    // global values, because we know the operands are always available.
>> +    if (DepInfo.isNonFuncLocal()) {
>> +      if (isSafeToSpeculativelyExecute(LI) &&
>> +          isa<GlobalValue>(LI->getPointerOperand())) {
>>
> This is overly specific.  This also works for function arguments, etc..
>
>> +        AV = AvailableValue::getCreateLoad(LI);
>> +        ValuesPerBlock.push_back(AvailableValueInBlock::get(
>> +            &LI->getParent()->getParent()->getEntryBlock(), std::move(AV)));
>>
> The entry block may be a horrible place to put this load.  How was this
> heuristic evaluated?
>
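To make the placement concern concrete, here is a hypothetical IR sketch (invented for illustration; @g, @example, and the block names are not from the patch): a load recreated in the entry block executes on every call, even along paths that never needed the value.

```llvm
; Illustrative only: the cloned load would land in %entry, ahead of the
; branch, so the common path pays for a load only %cold ever used.
@g = global i32 0

define i32 @example(i1 %c) {
entry:
  br i1 %c, label %hot, label %cold

hot:
  ; Common path: never touches @g, but would still execute the hoisted load.
  ret i32 0

cold:
  ; Rare path: the only user of @g.
  %v = load i32, i32* @g, align 4
  ret i32 %v
}
```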
>> +      } else
>> +        UnavailableBlocks.push_back(DepBB);
>> +
>> +    } else if (AnalyzeLoadAvailability(LI, DepInfo, Address, AV)) {
>>       // subtlety: because we know this was a non-local dependency, we know
>>       // it's safe to materialize anywhere between the instruction within
>>       // DepInfo and the end of it's block.
>> -      ValuesPerBlock.push_back(AvailableValueInBlock::get(DepBB,
>> -                                                          std::move(AV)));
>> +      ValuesPerBlock.push_back(
>> +          AvailableValueInBlock::get(DepBB, std::move(AV)));
>>       } else {
>>         UnavailableBlocks.push_back(DepBB);
>>       }
>>
>> Added: llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll?rev=284311&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll (added)
>> +++ llvm/trunk/test/Transforms/GVN/PRE/hoist-loads.ll Sat Oct 15 16:35:23 2016
>> @@ -0,0 +1,53 @@
>> +; RUN: opt -gvn %s -S -o - | FileCheck %s
>> +
>> +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>> +target triple = "x86_64-unknown-linux-gnu"
>> +
>> +; Check that the loads of @aaa, @bbb and @ccc are hoisted.
>> +; CHECK-LABEL: define void @foo
>> +; CHECK-NEXT:  %2 = load i32, i32* @ccc, align 4
>> +; CHECK-NEXT:  %3 = load i32, i32* @bbb, align 4
>> +; CHECK-NEXT:  %4 = load i32, i32* @aaa, align 4
>> +
>> +@aaa = local_unnamed_addr global i32 10, align 4
>> +@bbb = local_unnamed_addr global i32 20, align 4
>> +@ccc = local_unnamed_addr global i32 30, align 4
>> +
>> +define void @foo(i32* nocapture readonly) local_unnamed_addr {
>> +  br label %2
>> +
>> +  %.0 = phi i32* [ %0, %1 ], [ %3, %22 ]
>> +  %3 = getelementptr inbounds i32, i32* %.0, i64 1
>> +  %4 = load i32, i32* %.0, align 4
>> +  %5 = load i32, i32* @ccc, align 4
>> +  %6 = icmp sgt i32 %4, %5
>> +  br i1 %6, label %7, label %10
>> +
>> +  %8 = load i32, i32* @bbb, align 4
>> +  %9 = add nsw i32 %8, %4
>> +  store i32 %9, i32* @bbb, align 4
>> +  br label %10
>> +
>> +  %11 = load i32, i32* @bbb, align 4
>> +  %12 = icmp sgt i32 %4, %11
>> +  br i1 %12, label %13, label %16
>> +
>> +  %14 = load i32, i32* @aaa, align 4
>> +  %15 = add nsw i32 %14, %4
>> +  store i32 %15, i32* @aaa, align 4
>> +  br label %16
>> +
>> +  %17 = load i32, i32* @aaa, align 4
>> +  %18 = icmp sgt i32 %4, %17
>> +  br i1 %18, label %19, label %22
>> +
>> +  %20 = load i32, i32* @ccc, align 4
>> +  %21 = add nsw i32 %20, %4
>> +  store i32 %21, i32* @ccc, align 4
>> +  br label %22
>> +
>> +  %23 = icmp slt i32 %4, 0
>> +  br i1 %23, label %24, label %2
>> +
>> +  ret void
>> +}
>>
> Please add several more test cases which exercise the trivial case, a case
> where we *can't* hoist, and a case where we can hoist because the load is
> explicitly marked invariant.
>
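As one possible shape for the invariant-load case Philip asks for (purely illustrative: the global, function name, and CHECK line below are invented, not from the patch):

```llvm
; RUN: opt -gvn %s -S -o - | FileCheck %s

@g = local_unnamed_addr global i32 0, align 4

declare void @opaque()

; The load is marked !invariant.load, so the opaque call cannot clobber it
; and the load could legally be hoisted out of the loop.
; CHECK-LABEL: define void @hoist_invariant
define void @hoist_invariant(i32 %n) {
entry:
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  call void @opaque()
  %v = load i32, i32* @g, align 4, !invariant.load !0
  %i.next = add nsw i32 %i, 1
  %c = icmp slt i32 %i.next, %n
  br i1 %c, label %loop, label %exit

exit:
  ret void
}

!0 = !{}
```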
>>
>> Modified: llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll?rev=284311&r1=284310&r2=284311&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll (original)
>> +++ llvm/trunk/test/Transforms/GVN/PRE/pre-single-pred.ll Sat Oct 15 16:35:23 2016
>> @@ -1,16 +1,9 @@
>>   ; RUN: opt < %s -gvn -enable-load-pre -S | FileCheck %s
>> -; This testcase assumed we'll PRE the load into %for.cond, but we don't actually
>> -; verify that doing so is safe.  If there didn't _happen_ to be a load in
>> -; %for.end, we would actually be lengthening the execution on some paths, and
>> -; we were never actually checking that case.  Now we actually do perform some
>> -; conservative checking to make sure we don't make paths longer, but we don't
>> -; currently get this case, which we got lucky on previously.
>> -;
>> -; Now that that faulty assumption is corrected, test that we DON'T incorrectly
>> -; hoist the load.  Doing the right thing for the wrong reasons is still a bug.
>>     @p = external global i32
>>   define i32 @f(i32 %n) nounwind {
>> +; CHECK: entry:
>> +; CHECK-NEXT: %0 = load i32, i32* @p
>>   entry:
>>         br label %for.cond
>>
>> @@ -22,8 +15,9 @@ for.cond:           ; preds = %for.inc, %entry
>>   for.cond.for.end_crit_edge:           ; preds = %for.cond
>>         br label %for.end
>>
>> +; The load of @p should be hoisted into the entry block.
>>   ; CHECK: for.body:
>> -; CHECK-NEXT: %tmp3 = load i32, i32* @p
>> +; CHECK-NEXT: %dec = add i32 %tmp3, -1
>>   for.body:             ; preds = %for.cond
>>         %tmp3 = load i32, i32* @p               ; <i32> [#uses=1]
>>         %dec = add i32 %tmp3, -1                ; <i32> [#uses=2]
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>