[llvm] r228024 - Allow PRE to insert no-cost phi nodes

Philip Reames listmail at philipreames.com
Wed Feb 18 10:19:15 PST 2015


Jiangning,

Given Daniel's general schedule, I'm guessing he's busy and isn't going 
to get back to this for a bit.  However, his change is seemingly 
innocuous and beneficial.  I would be very reluctant to see this backed 
out without strong evidence that the approach is flawed.

Have you been able to reduce an example to illustrate what's going 
wrong?  I'm willing to help out here, but I don't have access to the 
hardware in question.

If I had to guess, this change is exposing another problem in the 
backedge - possibly in the register allocator.  My preferred approach 
would be to identify that issue and fix it.  Worth noting is that this 
transform would be easy to undo in the a machine specific pass or code 
gen prep.  That might be a viable strategy as well.

Philip


On 02/16/2015 06:17 PM, Jiangning Liu wrote:
> Ping again...
>
> 2015-02-15 9:26 GMT+08:00 Jiangning Liu <liujiangning1 at gmail.com 
> <mailto:liujiangning1 at gmail.com>>:
>
>     Hi Daniel,
>
>     Ping...
>
>     Thanks,
>     -Jiangning
>
>
>     2015-02-10 14:34 GMT+08:00 Jiangning Liu <liujiangning1 at gmail.com
>     <mailto:liujiangning1 at gmail.com>>:
>
>         Hi Daniel,
>
>         This commit caused ~5% performance regression for
>         spec2006.401.bzip2 on aarch64.
>
>         Don't know the root cause yet, but looking at the binary of
>         decompress.o, it seems the code change is significant.
>
>         Have you noticed the same issue?
>
>         Thanks,
>         -Jiangning
>
>
>
>         2015-02-04 4:37 GMT+08:00 Daniel Berlin <dberlin at dberlin.org
>         <mailto:dberlin at dberlin.org>>:
>
>             Author: dannyb
>             Date: Tue Feb  3 14:37:08 2015
>             New Revision: 228024
>
>             URL: http://llvm.org/viewvc/llvm-project?rev=228024&view=rev
>             Log:
>             Allow PRE to insert no-cost phi nodes
>
>             Added:
>             llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll
>             Modified:
>             llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>
>             Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>             URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=228024&r1=228023&r2=228024&view=diff
>             ==============================================================================
>             --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>             +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Tue Feb  3
>             14:37:08 2015
>             @@ -711,6 +711,8 @@ namespace {
>                  bool iterateOnFunction(Function &F);
>                  bool performPRE(Function &F);
>                  bool performScalarPRE(Instruction *I);
>             +    bool performScalarPREInsertion(Instruction *Instr,
>             BasicBlock *Pred,
>             +  unsigned int ValNo);
>                  Value *findLeader(const BasicBlock *BB, uint32_t num);
>                  void cleanupGlobalSets();
>                  void verifyRemoved(const Instruction *I) const;
>             @@ -2454,6 +2456,43 @@ bool GVN::processBlock(BasicBlock
>             *BB) {
>                return ChangedFunction;
>              }
>
>             +// Instantiate an expression in a predecessor that lacked it.
>             +bool GVN::performScalarPREInsertion(Instruction *Instr,
>             BasicBlock *Pred,
>             + unsigned int ValNo) {
>             +  // Because we are going top-down through the block, all
>             value numbers
>             +  // will be available in the predecessor by the time we
>             need them. Any
>             +  // that weren't originally present will have been
>             instantiated earlier
>             +  // in this loop.
>             +  bool success = true;
>             +  for (unsigned i = 0, e = Instr->getNumOperands(); i !=
>             e; ++i) {
>             +    Value *Op = Instr->getOperand(i);
>             +    if (isa<Argument>(Op) || isa<Constant>(Op) ||
>             isa<GlobalValue>(Op))
>             +      continue;
>             +
>             +    if (Value *V = findLeader(Pred, VN.lookup(Op))) {
>             +      Instr->setOperand(i, V);
>             +    } else {
>             +      success = false;
>             +      break;
>             +    }
>             +  }
>             +
>             +  // Fail out if we encounter an operand that is not
>             available in
>             +  // the PRE predecessor.  This is typically because of
>             loads which
>             +  // are not value numbered precisely.
>             +  if (!success)
>             +    return false;
>             +
>             + Instr->insertBefore(Pred->getTerminator());
>             +  Instr->setName(Instr->getName() + ".pre");
>             + Instr->setDebugLoc(Instr->getDebugLoc());
>             +  VN.add(Instr, ValNo);
>             +
>             +  // Update the availability map to include the new
>             instruction.
>             +  addToLeaderTable(ValNo, Instr, Pred);
>             +  return true;
>             +}
>             +
>              bool GVN::performScalarPRE(Instruction *CurInst) {
>                SmallVector<std::pair<Value*, BasicBlock*>, 8> predMap;
>
>             @@ -2520,60 +2559,43 @@ bool
>             GVN::performScalarPRE(Instruction *
>
>                // Don't do PRE when it might increase code size, i.e. when
>                // we would need to insert instructions in more than
>             one pred.
>             -  if (NumWithout != 1 || NumWith == 0)
>             -    return false;
>             -
>             -  // Don't do PRE across indirect branch.
>             -  if (isa<IndirectBrInst>(PREPred->getTerminator()))
>             +  if (NumWithout > 1 || NumWith == 0)
>                  return false;
>
>             -  // We can't do PRE safely on a critical edge, so
>             instead we schedule
>             -  // the edge to be split and perform the PRE the next
>             time we iterate
>             -  // on the function.
>             -  unsigned SuccNum = GetSuccessorNumber(PREPred,
>             CurrentBlock);
>             -  if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {
>             -
>             toSplit.push_back(std::make_pair(PREPred->getTerminator(),
>             SuccNum));
>             -    return false;
>             -  }
>             +  // We may have a case where all predecessors have the
>             instruction,
>             +  // and we just need to insert a phi node. Otherwise,
>             perform
>             +  // insertion.
>             +  Instruction *PREInstr = nullptr;
>
>             -  // Instantiate the expression in the predecessor that
>             lacked it.
>             -  // Because we are going top-down through the block, all
>             value numbers
>             -  // will be available in the predecessor by the time we
>             need them. Any
>             -  // that weren't originally present will have been
>             instantiated earlier
>             -  // in this loop.
>             -  Instruction *PREInstr = CurInst->clone();
>             -  bool success = true;
>             -  for (unsigned i = 0, e = CurInst->getNumOperands(); i
>             != e; ++i) {
>             -    Value *Op = PREInstr->getOperand(i);
>             -    if (isa<Argument>(Op) || isa<Constant>(Op) ||
>             isa<GlobalValue>(Op))
>             -      continue;
>             +  if (NumWithout != 0) {
>             +    // Don't do PRE across indirect branch.
>             +    if (isa<IndirectBrInst>(PREPred->getTerminator()))
>             +      return false;
>
>             -    if (Value *V = findLeader(PREPred, VN.lookup(Op))) {
>             -      PREInstr->setOperand(i, V);
>             -    } else {
>             -      success = false;
>             -      break;
>             +    // We can't do PRE safely on a critical edge, so
>             instead we schedule
>             +    // the edge to be split and perform the PRE the next
>             time we iterate
>             +    // on the function.
>             +    unsigned SuccNum = GetSuccessorNumber(PREPred,
>             CurrentBlock);
>             +    if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {
>             +
>             toSplit.push_back(std::make_pair(PREPred->getTerminator(),
>             SuccNum));
>             +      return false;
>             +    }
>             +    // We need to insert somewhere, so let's give it a shot
>             +    PREInstr = CurInst->clone();
>             +    if (!performScalarPREInsertion(PREInstr, PREPred,
>             ValNo)) {
>             +      // If we failed insertion, make sure we remove the
>             instruction.
>             +      DEBUG(verifyRemoved(PREInstr));
>             +      delete PREInstr;
>             +      return false;
>                  }
>                }
>
>             -  // Fail out if we encounter an operand that is not
>             available in
>             -  // the PRE predecessor.  This is typically because of
>             loads which
>             -  // are not value numbered precisely.
>             -  if (!success) {
>             -    DEBUG(verifyRemoved(PREInstr));
>             -    delete PREInstr;
>             -    return false;
>             -  }
>             +  // Either we should have filled in the PRE instruction,
>             or we should
>             +  // not have needed insertions.
>             +  assert (PREInstr != nullptr || NumWithout == 0);
>
>             - PREInstr->insertBefore(PREPred->getTerminator());
>             - PREInstr->setName(CurInst->getName() + ".pre");
>             - PREInstr->setDebugLoc(CurInst->getDebugLoc());
>             -  VN.add(PREInstr, ValNo);
>                ++NumGVNPRE;
>
>             -  // Update the availability map to include the new
>             instruction.
>             -  addToLeaderTable(ValNo, PREInstr, PREPred);
>             -
>                // Create a PHI to make the value available in this block.
>                PHINode *Phi =
>              PHINode::Create(CurInst->getType(), predMap.size(),
>             @@ -2609,6 +2631,8 @@ bool GVN::performScalarPRE(Instruction *
>                  MD->removeInstruction(CurInst);
>                DEBUG(verifyRemoved(CurInst));
>                CurInst->eraseFromParent();
>             +  ++NumGVNInstr;
>             +
>                return true;
>              }
>
>
>             Added: llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll
>             URL:
>             http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll?rev=228024&view=auto
>             ==============================================================================
>             --- llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll (added)
>             +++ llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll Tue
>             Feb  3 14:37:08 2015
>             @@ -0,0 +1,31 @@
>             +; RUN: opt < %s -gvn -S | FileCheck %s
>             +; This testcase tests insertion of no-cost phis.  That is,
>             +; when the value is already available in every predecessor,
>             +; and we just need to insert a phi node to merge the
>             available values.
>             +
>             + at c = global i32 0, align 4
>             + at d = global i32 0, align 4
>             +
>             +
>             +define i32 @mai(i32 %foo, i32 %a, i32 %b) {
>             +  %1 = icmp ne i32 %foo, 0
>             +  br i1 %1, label %bb1, label %bb2
>             +
>             +bb1:
>             +  %2 = add nsw i32 %a, %b
>             +  store i32 %2, i32* @c, align 4
>             +  br label %mergeblock
>             +
>             +bb2:
>             +  %3 = add nsw i32 %a, %b
>             +  store i32 %3, i32* @d, align 4
>             +  br label %mergeblock
>             +
>             +mergeblock:
>             +; CHECK: pre-phi = phi i32 [ %3, %bb2 ], [ %2, %bb1 ]
>             +; CHECK-NEXT: ret i32 %.pre-phi
>             +  %4 = add nsw i32 %a, %b
>             +  ret i32 %4
>             +}
>             +
>             +
>
>
>             _______________________________________________
>             llvm-commits mailing list
>             llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>             http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150218/68378681/attachment.html>


More information about the llvm-commits mailing list