<div dir="ltr">yes, it is likely that it exposes bugs in downstream component -- such as increased register pressure etc.<div><br></div><div>David</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Feb 18, 2015 at 10:19 AM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div>Jiangning,<br>
      <br>
      Given Daniel's general schedule, I'm guessing he's busy and isn't
      going to get back to this for a bit.  However, his change is
      seemingly innocuous and beneficial.  I would be very reluctant to
      see this backed out without strong evidence that the approach is
      flawed.  <br>
      <br>
      Have you been able to reduce an example to illustrate what's going
      wrong?  I'm willing to help out here, but I don't have access to
      the hardware in question.  <br>
      <br>
      If I had to guess, this change is exposing another problem in the
      backedge - possibly in the register allocator.  My preferred
      approach would be to identify that issue and fix it.  Worth noting
      is that this transform would be easy to undo in the a machine
      specific pass or code gen prep.  That might be a viable strategy
      as well.  <br><span class="HOEnZb"><font color="#888888">
      <br>
      Philip</font></span><div><div class="h5"><br>
      <br>
      <br>
      On 02/16/2015 06:17 PM, Jiangning Liu wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">Ping again...</div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">2015-02-15 9:26 GMT+08:00 Jiangning Liu
          <span dir="ltr"><<a href="mailto:liujiangning1@gmail.com" target="_blank">liujiangning1@gmail.com</a>></span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">
              <div>Hi Daniel,</div>
              <div><br>
              </div>
              Ping...
              <div><br>
              </div>
              <div>Thanks,</div>
              <div>-Jiangning</div>
              <div><br>
              </div>
            </div>
            <div>
              <div>
                <div class="gmail_extra"><br>
                  <div class="gmail_quote">2015-02-10 14:34 GMT+08:00
                    Jiangning Liu <span dir="ltr"><<a href="mailto:liujiangning1@gmail.com" target="_blank">liujiangning1@gmail.com</a>></span>:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div dir="ltr">Hi Daniel,
                        <div><br>
                        </div>
                        <div>This commit caused ~5% performance
                          regression for spec2006.401.bzip2 on aarch64.</div>
                        <div><br>
                        </div>
                        <div>Don't know the root cause yet, but looking
                          at the binary of decompress.o, it seems the
                          code change is significant.</div>
                        <div><br>
                        </div>
                        <div>Have you noticed the same issue?</div>
                        <div><br>
                        </div>
                        <div>Thanks,</div>
                        <div>-Jiangning</div>
                        <div><br>
                        </div>
                        <div><br>
                        </div>
                      </div>
                      <div>
                        <div>
                          <div class="gmail_extra"><br>
                            <div class="gmail_quote">2015-02-04 4:37
                              GMT+08:00 Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span>:<br>
                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author:
                                dannyb<br>
                                Date: Tue Feb  3 14:37:08 2015<br>
                                New Revision: 228024<br>
                                <br>
                                URL: <a href="http://llvm.org/viewvc/llvm-project?rev=228024&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=228024&view=rev</a><br>
                                Log:<br>
                                Allow PRE to insert no-cost phi nodes<br>
                                <br>
                                Added:<br>
                                   
                                llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll<br>
                                Modified:<br>
                                   
                                llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
                                <br>
                                Modified:
                                llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
                                URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=228024&r1=228023&r2=228024&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=228024&r1=228023&r2=228024&view=diff</a><br>
==============================================================================<br>
                                ---
                                llvm/trunk/lib/Transforms/Scalar/GVN.cpp
                                (original)<br>
                                +++
                                llvm/trunk/lib/Transforms/Scalar/GVN.cpp
                                Tue Feb  3 14:37:08 2015<br>
                                @@ -711,6 +711,8 @@ namespace {<br>
                                     bool iterateOnFunction(Function
                                &F);<br>
                                     bool performPRE(Function &F);<br>
                                     bool performScalarPRE(Instruction
                                *I);<br>
                                +    bool
                                performScalarPREInsertion(Instruction
                                *Instr, BasicBlock *Pred,<br>
                                +                                 
                                 unsigned int ValNo);<br>
                                     Value *findLeader(const BasicBlock
                                *BB, uint32_t num);<br>
                                     void cleanupGlobalSets();<br>
                                     void verifyRemoved(const
                                Instruction *I) const;<br>
                                @@ -2454,6 +2456,43 @@ bool
                                GVN::processBlock(BasicBlock *BB) {<br>
                                   return ChangedFunction;<br>
                                 }<br>
                                <br>
                                +// Instantiate an expression in a
                                predecessor that lacked it.<br>
                                +bool
                                GVN::performScalarPREInsertion(Instruction
                                *Instr, BasicBlock *Pred,<br>
                                +                                   
                                unsigned int ValNo) {<br>
                                +  // Because we are going top-down
                                through the block, all value numbers<br>
                                +  // will be available in the
                                predecessor by the time we need them. 
                                Any<br>
                                +  // that weren't originally present
                                will have been instantiated earlier<br>
                                +  // in this loop.<br>
                                +  bool success = true;<br>
                                +  for (unsigned i = 0, e =
                                Instr->getNumOperands(); i != e; ++i)
                                {<br>
                                +    Value *Op =
                                Instr->getOperand(i);<br>
                                +    if (isa<Argument>(Op) ||
                                isa<Constant>(Op) ||
                                isa<GlobalValue>(Op))<br>
                                +      continue;<br>
                                +<br>
                                +    if (Value *V = findLeader(Pred,
                                VN.lookup(Op))) {<br>
                                +      Instr->setOperand(i, V);<br>
                                +    } else {<br>
                                +      success = false;<br>
                                +      break;<br>
                                +    }<br>
                                +  }<br>
                                +<br>
                                +  // Fail out if we encounter an
                                operand that is not available in<br>
                                +  // the PRE predecessor.  This is
                                typically because of loads which<br>
                                +  // are not value numbered precisely.<br>
                                +  if (!success)<br>
                                +    return false;<br>
                                +<br>
                                + 
                                Instr->insertBefore(Pred->getTerminator());<br>
                                +  Instr->setName(Instr->getName()
                                + ".pre");<br>
                                + 
                                Instr->setDebugLoc(Instr->getDebugLoc());<br>
                                +  VN.add(Instr, ValNo);<br>
                                +<br>
                                +  // Update the availability map to
                                include the new instruction.<br>
                                +  addToLeaderTable(ValNo, Instr, Pred);<br>
                                +  return true;<br>
                                +}<br>
                                +<br>
                                 bool GVN::performScalarPRE(Instruction
                                *CurInst) {<br>
                                   SmallVector<std::pair<Value*,
                                BasicBlock*>, 8> predMap;<br>
                                <br>
                                @@ -2520,60 +2559,43 @@ bool
                                GVN::performScalarPRE(Instruction *<br>
                                <br>
                                   // Don't do PRE when it might
                                increase code size, i.e. when<br>
                                   // we would need to insert
                                instructions in more than one pred.<br>
                                -  if (NumWithout != 1 || NumWith == 0)<br>
                                -    return false;<br>
                                -<br>
                                -  // Don't do PRE across indirect
                                branch.<br>
                                -  if
                                (isa<IndirectBrInst>(PREPred->getTerminator()))<br>
                                +  if (NumWithout > 1 || NumWith ==
                                0)<br>
                                     return false;<br>
                                <br>
                                -  // We can't do PRE safely on a
                                critical edge, so instead we schedule<br>
                                -  // the edge to be split and perform
                                the PRE the next time we iterate<br>
                                -  // on the function.<br>
                                -  unsigned SuccNum =
                                GetSuccessorNumber(PREPred,
                                CurrentBlock);<br>
                                -  if
                                (isCriticalEdge(PREPred->getTerminator(),
                                SuccNum)) {<br>
                                -   
                                toSplit.push_back(std::make_pair(PREPred->getTerminator(),
                                SuccNum));<br>
                                -    return false;<br>
                                -  }<br>
                                +  // We may have a case where all
                                predecessors have the instruction,<br>
                                +  // and we just need to insert a phi
                                node. Otherwise, perform<br>
                                +  // insertion.<br>
                                +  Instruction *PREInstr = nullptr;<br>
                                <br>
                                -  // Instantiate the expression in the
                                predecessor that lacked it.<br>
                                -  // Because we are going top-down
                                through the block, all value numbers<br>
                                -  // will be available in the
                                predecessor by the time we need them. 
                                Any<br>
                                -  // that weren't originally present
                                will have been instantiated earlier<br>
                                -  // in this loop.<br>
                                -  Instruction *PREInstr =
                                CurInst->clone();<br>
                                -  bool success = true;<br>
                                -  for (unsigned i = 0, e =
                                CurInst->getNumOperands(); i != e;
                                ++i) {<br>
                                -    Value *Op =
                                PREInstr->getOperand(i);<br>
                                -    if (isa<Argument>(Op) ||
                                isa<Constant>(Op) ||
                                isa<GlobalValue>(Op))<br>
                                -      continue;<br>
                                +  if (NumWithout != 0) {<br>
                                +    // Don't do PRE across indirect
                                branch.<br>
                                +    if
                                (isa<IndirectBrInst>(PREPred->getTerminator()))<br>
                                +      return false;<br>
                                <br>
                                -    if (Value *V = findLeader(PREPred,
                                VN.lookup(Op))) {<br>
                                -      PREInstr->setOperand(i, V);<br>
                                -    } else {<br>
                                -      success = false;<br>
                                -      break;<br>
                                +    // We can't do PRE safely on a
                                critical edge, so instead we schedule<br>
                                +    // the edge to be split and perform
                                the PRE the next time we iterate<br>
                                +    // on the function.<br>
                                +    unsigned SuccNum =
                                GetSuccessorNumber(PREPred,
                                CurrentBlock);<br>
                                +    if
                                (isCriticalEdge(PREPred->getTerminator(),
                                SuccNum)) {<br>
                                +     
                                toSplit.push_back(std::make_pair(PREPred->getTerminator(),
                                SuccNum));<br>
                                +      return false;<br>
                                +    }<br>
                                +    // We need to insert somewhere, so
                                let's give it a shot<br>
                                +    PREInstr = CurInst->clone();<br>
                                +    if
                                (!performScalarPREInsertion(PREInstr,
                                PREPred, ValNo)) {<br>
                                +      // If we failed insertion, make
                                sure we remove the instruction.<br>
                                +      DEBUG(verifyRemoved(PREInstr));<br>
                                +      delete PREInstr;<br>
                                +      return false;<br>
                                     }<br>
                                   }<br>
                                <br>
                                -  // Fail out if we encounter an
                                operand that is not available in<br>
                                -  // the PRE predecessor.  This is
                                typically because of loads which<br>
                                -  // are not value numbered precisely.<br>
                                -  if (!success) {<br>
                                -    DEBUG(verifyRemoved(PREInstr));<br>
                                -    delete PREInstr;<br>
                                -    return false;<br>
                                -  }<br>
                                +  // Either we should have filled in
                                the PRE instruction, or we should<br>
                                +  // not have needed insertions.<br>
                                +  assert (PREInstr != nullptr ||
                                NumWithout == 0);<br>
                                <br>
                                - 
                                PREInstr->insertBefore(PREPred->getTerminator());<br>
                                - 
                                PREInstr->setName(CurInst->getName()
                                + ".pre");<br>
                                - 
                                PREInstr->setDebugLoc(CurInst->getDebugLoc());<br>
                                -  VN.add(PREInstr, ValNo);<br>
                                   ++NumGVNPRE;<br>
                                <br>
                                -  // Update the availability map to
                                include the new instruction.<br>
                                -  addToLeaderTable(ValNo, PREInstr,
                                PREPred);<br>
                                -<br>
                                   // Create a PHI to make the value
                                available in this block.<br>
                                   PHINode *Phi =<br>
                                     
                                 PHINode::Create(CurInst->getType(),
                                predMap.size(),<br>
                                @@ -2609,6 +2631,8 @@ bool
                                GVN::performScalarPRE(Instruction *<br>
                                     MD->removeInstruction(CurInst);<br>
                                   DEBUG(verifyRemoved(CurInst));<br>
                                   CurInst->eraseFromParent();<br>
                                +  ++NumGVNInstr;<br>
                                +<br>
                                   return true;<br>
                                 }<br>
                                <br>
                                <br>
                                Added:
                                llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll<br>
                                URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll?rev=228024&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll?rev=228024&view=auto</a><br>
==============================================================================<br>
                                ---
                                llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll
                                (added)<br>
                                +++
                                llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll
                                Tue Feb  3 14:37:08 2015<br>
                                @@ -0,0 +1,31 @@<br>
                                +; RUN: opt < %s -gvn -S | FileCheck
                                %s<br>
                                +; This testcase tests insertion of
                                no-cost phis.  That is,<br>
                                +; when the value is already available
                                in every predecessor,<br>
                                +; and we just need to insert a phi node
                                to merge the available values.<br>
                                +<br>
                                +@c = global i32 0, align 4<br>
                                +@d = global i32 0, align 4<br>
                                +<br>
                                +<br>
                                +define i32 @mai(i32 %foo, i32 %a, i32
                                %b) {<br>
                                +  %1 = icmp ne i32 %foo, 0<br>
                                +  br i1 %1, label %bb1, label %bb2<br>
                                +<br>
                                +bb1:<br>
                                +  %2 = add nsw i32 %a, %b<br>
                                +  store i32 %2, i32* @c, align 4<br>
                                +  br label %mergeblock<br>
                                +<br>
                                +bb2:<br>
                                +  %3 = add nsw i32 %a, %b<br>
                                +  store i32 %3, i32* @d, align 4<br>
                                +  br label %mergeblock<br>
                                +<br>
                                +mergeblock:<br>
                                +; CHECK: pre-phi = phi i32 [ %3, %bb2
                                ], [ %2, %bb1 ]<br>
                                +; CHECK-NEXT: ret i32 %.pre-phi<br>
                                +  %4 = add nsw i32 %a, %b<br>
                                +  ret i32 %4<br>
                                +}<br>
                                +<br>
                                +<br>
                                <br>
                                <br>
_______________________________________________<br>
                                llvm-commits mailing list<br>
                                <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
                                <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
                              </blockquote>
                            </div>
                            <br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
      <fieldset></fieldset>
      <br>
      <pre>_______________________________________________
llvm-commits mailing list
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a>
</pre>
    </blockquote>
    <br>
  </div></div></div>

<br>_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
<br></blockquote></div><br></div>