<div dir="ltr">Apologies.<div>Due to bad filtering, this wasn't ending up in my inbox, just in my commits email folder, and i didn't notice it until Chandler mentioned it to me . I've fixed this so it shouldn't happen again.<div><br><div><br></div><div>The short version is: There is no way to fix this in PRE. The change simply eliminates a redundant instruction. It performs no additional instruction insertion, etc. It is literally doing nothing but transforming</div><div><br></div><div>if (c) {</div><div> e = a +b </div><div>} <span style="font-size:13.1999998092651px">else {</span></div><div> f = a + b </div><div>}</div><div>g = a+ b</div><div><br></div><div>into </div><div><div style="font-size:13.1999998092651px">if (c) {</div><div style="font-size:13.1999998092651px"> e = a +b </div><div style="font-size:13.1999998092651px">} <span style="font-size:13.1999998092651px">else {</span></div><div style="font-size:13.1999998092651px"> f = a + b </div><div style="font-size:13.1999998092651px">}</div><div style="font-size:13.1999998092651px"> g = phi(e, f)</div></div><div style="font-size:13.1999998092651px"><br></div><div style="font-size:13.1999998092651px">So it's not even *adding* code (and in fact, it makes the live ranges of a and b shorter)</div><div style="font-size:13.1999998092651px"><br></div><div style="font-size:13.1999998092651px">The results for the output should use less registers, not more.</div><div style="font-size:13.1999998092651px"><br></div><div style="font-size:13.1999998092651px">Any normal PRE pass (including our load PRE) will do this, actually. It was just an oversight that the existing scalar one had a wrong conditional.</div><div style="font-size:13.1999998092651px"><br></div><div style="font-size:13.1999998092651px"><span style="font-size:13.1999998092651px">My wild guess is bad register allocation interaction. But if that's the case, we should fix that there, because any improvements we make to GVN/PRE will exacerbate issues there.</span><br></div><div style="font-size:13.1999998092651px"><br></div><div style="font-size:13.1999998092651px">If there is any way to narrow this down a bit, i'd be happy to take a look what is going on (though i don't have easy access to the hardware)</div><div style="font-size:13.1999998092651px"><br></div><div><div class="gmail_quote">On Mon Feb 16 2015 at 6:17:25 PM Jiangning Liu <<a href="mailto:liujiangning1@gmail.com">liujiangning1@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Ping again...</div><div class="gmail_extra"><br><div class="gmail_quote">2015-02-15 9:26 GMT+08:00 Jiangning Liu <span dir="ltr"><<a href="mailto:liujiangning1@gmail.com" target="_blank">liujiangning1@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi Daniel,</div><div><br></div>Ping...<div><br></div><div>Thanks,</div><div>-Jiangning</div><div><br></div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">2015-02-10 14:34 GMT+08:00 Jiangning Liu <span dir="ltr"><<a href="mailto:liujiangning1@gmail.com" target="_blank">liujiangning1@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Daniel,<div><br></div><div>This commit caused ~5% performance regression for spec2006.401.bzip2 on aarch64.</div><div><br></div><div>Don't know the root cause yet, but looking at the binary of decompress.o, it seems the code change is significant.</div><div><br></div><div>Have you noticed the same issue?</div><div><br></div><div>Thanks,</div><div>-Jiangning</div><div><br></div><div><br></div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">2015-02-04 4:37 GMT+08:00 Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: dannyb<br>
Date: Tue Feb 3 14:37:08 2015<br>
New Revision: 228024<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=228024&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=228024&view=rev</a><br>
Log:<br>
Allow PRE to insert no-cost phi nodes<br>
<br>
Added:<br>
llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll<br>
Modified:<br>
llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
<br>
Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=228024&r1=228023&r2=228024&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=228024&r1=228023&r2=228024&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Tue Feb 3 14:37:08 2015<br>
@@ -711,6 +711,8 @@ namespace {<br>
bool iterateOnFunction(Function &F);<br>
bool performPRE(Function &F);<br>
bool performScalarPRE(Instruction *I);<br>
+ bool performScalarPREInsertion(Instruction *Instr, BasicBlock *Pred,<br>
+ unsigned int ValNo);<br>
Value *findLeader(const BasicBlock *BB, uint32_t num);<br>
void cleanupGlobalSets();<br>
void verifyRemoved(const Instruction *I) const;<br>
@@ -2454,6 +2456,43 @@ bool GVN::processBlock(BasicBlock *BB) {<br>
return ChangedFunction;<br>
}<br>
<br>
+// Instantiate an expression in a predecessor that lacked it.<br>
+bool GVN::performScalarPREInsertion(Instruction *Instr, BasicBlock *Pred,<br>
+ unsigned int ValNo) {<br>
+ // Because we are going top-down through the block, all value numbers<br>
+ // will be available in the predecessor by the time we need them. Any<br>
+ // that weren't originally present will have been instantiated earlier<br>
+ // in this loop.<br>
+ bool success = true;<br>
+ for (unsigned i = 0, e = Instr->getNumOperands(); i != e; ++i) {<br>
+ Value *Op = Instr->getOperand(i);<br>
+ if (isa<Argument>(Op) || isa<Constant>(Op) || isa<GlobalValue>(Op))<br>
+ continue;<br>
+<br>
+ if (Value *V = findLeader(Pred, VN.lookup(Op))) {<br>
+ Instr->setOperand(i, V);<br>
+ } else {<br>
+ success = false;<br>
+ break;<br>
+ }<br>
+ }<br>
+<br>
+ // Fail out if we encounter an operand that is not available in<br>
+ // the PRE predecessor. This is typically because of loads which<br>
+ // are not value numbered precisely.<br>
+ if (!success)<br>
+ return false;<br>
+<br>
+ Instr->insertBefore(Pred->getTerminator());<br>
+ Instr->setName(Instr->getName() + ".pre");<br>
+ Instr->setDebugLoc(Instr->getDebugLoc());<br>
+ VN.add(Instr, ValNo);<br>
+<br>
+ // Update the availability map to include the new instruction.<br>
+ addToLeaderTable(ValNo, Instr, Pred);<br>
+ return true;<br>
+}<br>
+<br>
bool GVN::performScalarPRE(Instruction *CurInst) {<br>
SmallVector<std::pair<Value*, BasicBlock*>, 8> predMap;<br>
<br>
@@ -2520,60 +2559,43 @@ bool GVN::performScalarPRE(Instruction *<br>
<br>
// Don't do PRE when it might increase code size, i.e. when<br>
// we would need to insert instructions in more than one pred.<br>
- if (NumWithout != 1 || NumWith == 0)<br>
- return false;<br>
-<br>
- // Don't do PRE across indirect branch.<br>
- if (isa<IndirectBrInst>(PREPred->getTerminator()))<br>
+ if (NumWithout > 1 || NumWith == 0)<br>
return false;<br>
<br>
- // We can't do PRE safely on a critical edge, so instead we schedule<br>
- // the edge to be split and perform the PRE the next time we iterate<br>
- // on the function.<br>
- unsigned SuccNum = GetSuccessorNumber(PREPred, CurrentBlock);<br>
- if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {<br>
- toSplit.push_back(std::make_pair(PREPred->getTerminator(), SuccNum));<br>
- return false;<br>
- }<br>
+ // We may have a case where all predecessors have the instruction,<br>
+ // and we just need to insert a phi node. Otherwise, perform<br>
+ // insertion.<br>
+ Instruction *PREInstr = nullptr;<br>
<br>
- // Instantiate the expression in the predecessor that lacked it.<br>
- // Because we are going top-down through the block, all value numbers<br>
- // will be available in the predecessor by the time we need them. Any<br>
- // that weren't originally present will have been instantiated earlier<br>
- // in this loop.<br>
- Instruction *PREInstr = CurInst->clone();<br>
- bool success = true;<br>
- for (unsigned i = 0, e = CurInst->getNumOperands(); i != e; ++i) {<br>
- Value *Op = PREInstr->getOperand(i);<br>
- if (isa<Argument>(Op) || isa<Constant>(Op) || isa<GlobalValue>(Op))<br>
- continue;<br>
+ if (NumWithout != 0) {<br>
+ // Don't do PRE across indirect branch.<br>
+ if (isa<IndirectBrInst>(PREPred->getTerminator()))<br>
+ return false;<br>
<br>
- if (Value *V = findLeader(PREPred, VN.lookup(Op))) {<br>
- PREInstr->setOperand(i, V);<br>
- } else {<br>
- success = false;<br>
- break;<br>
+ // We can't do PRE safely on a critical edge, so instead we schedule<br>
+ // the edge to be split and perform the PRE the next time we iterate<br>
+ // on the function.<br>
+ unsigned SuccNum = GetSuccessorNumber(PREPred, CurrentBlock);<br>
+ if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {<br>
+ toSplit.push_back(std::make_pair(PREPred->getTerminator(), SuccNum));<br>
+ return false;<br>
+ }<br>
+ // We need to insert somewhere, so let's give it a shot<br>
+ PREInstr = CurInst->clone();<br>
+ if (!performScalarPREInsertion(PREInstr, PREPred, ValNo)) {<br>
+ // If we failed insertion, make sure we remove the instruction.<br>
+ DEBUG(verifyRemoved(PREInstr));<br>
+ delete PREInstr;<br>
+ return false;<br>
}<br>
}<br>
<br>
- // Fail out if we encounter an operand that is not available in<br>
- // the PRE predecessor. This is typically because of loads which<br>
- // are not value numbered precisely.<br>
- if (!success) {<br>
- DEBUG(verifyRemoved(PREInstr));<br>
- delete PREInstr;<br>
- return false;<br>
- }<br>
+ // Either we should have filled in the PRE instruction, or we should<br>
+ // not have needed insertions.<br>
+ assert (PREInstr != nullptr || NumWithout == 0);<br>
<br>
- PREInstr->insertBefore(PREPred->getTerminator());<br>
- PREInstr->setName(CurInst->getName() + ".pre");<br>
- PREInstr->setDebugLoc(CurInst->getDebugLoc());<br>
- VN.add(PREInstr, ValNo);<br>
++NumGVNPRE;<br>
<br>
- // Update the availability map to include the new instruction.<br>
- addToLeaderTable(ValNo, PREInstr, PREPred);<br>
-<br>
// Create a PHI to make the value available in this block.<br>
PHINode *Phi =<br>
PHINode::Create(CurInst->getType(), predMap.size(),<br>
@@ -2609,6 +2631,8 @@ bool GVN::performScalarPRE(Instruction *<br>
MD->removeInstruction(CurInst);<br>
DEBUG(verifyRemoved(CurInst));<br>
CurInst->eraseFromParent();<br>
+ ++NumGVNInstr;<br>
+<br>
return true;<br>
}<br>
<br>
<br>
Added: llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll?rev=228024&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll?rev=228024&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll (added)<br>
+++ llvm/trunk/test/Transforms/GVN/pre-no-cost-phi.ll Tue Feb 3 14:37:08 2015<br>
@@ -0,0 +1,31 @@<br>
+; RUN: opt < %s -gvn -S | FileCheck %s<br>
+; This testcase tests insertion of no-cost phis. That is,<br>
+; when the value is already available in every predecessor,<br>
+; and we just need to insert a phi node to merge the available values.<br>
+<br>
+@c = global i32 0, align 4<br>
+@d = global i32 0, align 4<br>
+<br>
+<br>
+define i32 @mai(i32 %foo, i32 %a, i32 %b) {<br>
+ %1 = icmp ne i32 %foo, 0<br>
+ br i1 %1, label %bb1, label %bb2<br>
+<br>
+bb1:<br>
+ %2 = add nsw i32 %a, %b<br>
+ store i32 %2, i32* @c, align 4<br>
+ br label %mergeblock<br>
+<br>
+bb2:<br>
+ %3 = add nsw i32 %a, %b<br>
+ store i32 %3, i32* @d, align 4<br>
+ br label %mergeblock<br>
+<br>
+mergeblock:<br>
+; CHECK: pre-phi = phi i32 [ %3, %bb2 ], [ %2, %bb1 ]<br>
+; CHECK-NEXT: ret i32 %.pre-phi<br>
+ %4 = add nsw i32 %a, %b<br>
+ ret i32 %4<br>
+}<br>
+<br>
+<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</blockquote></div></div></div></div></div>