<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 12, 2016 at 6:42 PM, Michael Zolotukhin via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Author: mzolotukhin<br>
Date: Thu May 12 20:42:39 2016<br>
New Revision: 269388<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=269388&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=269388&view=rev</a><br>
Log:<br>
[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...<br>
<br>
Summary:<br>
...loop after the last iteration.<br>
<br>
This is really hard to do correctly. The core problem is that we need to<br>
model liveness through the induction PHIs from iteration to iteration in<br>
order to get the correct results, and we need to correctly de-duplicate<br>
the common subgraphs of instructions feeding some subset of the<br>
induction PHIs. All of this can be driven either from a side effect at<br>
some iteration or from the loop values used after the loop finishes.<br>
<br>
This patch implements this by storing the forward-propagating analysis<br>
of each instruction in a cache to recall whether it was free and whether<br>
it has become live and thus counted toward the total unroll cost. Then,<br>
at each sink for a value in the loop, we recursively walk back through<br>
every value that feeds the sink, including looping back through the<br>
iterations as needed, until we have marked the entire input graph as<br>
live. Because we cache this, we never visit instructions more than twice<br>
-- once when we analyze them and put them into the cache, and once when<br>
we count their cost towards the unrolled loop. Also, because each cache<br>
entry needs only two bits of state and because we are dealing with<br>
relatively small iteration counts, we can store all of this very densely<br>
in memory to keep this from becoming an excessively slow analysis.<br>
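<br>
As a rough illustration of the dense two-bit cache described above, here is<br>
a minimal sketch in standard C++. It is not the patch's code: Inst,<br>
InstState, InstStateSet and markCounted are hypothetical names, and<br>
std::unordered_set stands in for LLVM's DenseSet. The point is only that the<br>
(instruction, iteration) pair is the key while the two flags ride along in<br>
the same word.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for llvm::Instruction; only its address matters here.
struct Inst {};

// The key is (I, Iteration). The two flag bits share storage with the
// iteration number and are deliberately excluded from hashing and equality.
struct InstState {
  const Inst *I;
  int Iteration : 30;
  mutable unsigned IsFree : 1;    // mutable: not part of the key
  mutable unsigned IsCounted : 1; // mutable: not part of the key
};

struct InstStateHash {
  std::size_t operator()(const InstState &S) const {
    int Iter = S.Iteration;
    std::size_t H = std::hash<const Inst *>()(S.I);
    return H ^ (std::hash<int>()(Iter) + 0x9e3779b9u + (H << 6) + (H >> 2));
  }
};

struct InstStateEq {
  bool operator()(const InstState &L, const InstState &R) const {
    return L.I == R.I && L.Iteration == R.Iteration;
  }
};

using InstStateSet = std::unordered_set<InstState, InstStateHash, InstStateEq>;

// Returns true the first time (I, Iteration) is marked as counted, and false
// if it was already counted or was never analyzed.
bool markCounted(InstStateSet &Set, const Inst *I, int Iteration) {
  InstState Key{I, Iteration, 0, 0};
  auto It = Set.find(Key);
  if (It == Set.end() || It->IsCounted)
    return false;
  It->IsCounted = 1;
  return true;
}
```

Because lookups ignore the flag bits, a probe key can carry zeros for both<br>
flags, which is what the find call in the patch relies on as well.<br>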
<br>
The code here is still pretty gross. I would appreciate suggestions<br>
about better ways to factor or split this up. I've stared too long at<br>
the algorithmic side to really have a good sense of what the design<br>
should probably look like.<br>
<br>
Also, it might seem like we should do all of this bottom-up, but I think<br>
that is a red herring. Specifically, the simplification power is *much*<br>
greater working top-down. We can forward propagate very effectively,<br>
even across strange and interesting recurrences around the backedge.<br>
Because we use data to propagate, this doesn't cause a state space<br>
explosion. Doing this level of constant folding, etc, would be very<br>
expensive to do bottom-up because it wouldn't be until the last moment<br>
that you could collapse everything. The current solution is essentially<br>
a top-down simplification with a bottom-up cost accounting which seems<br>
to get the best of both worlds. It makes the simplification incremental<br>
and powerful while leaving everything dead until we *know* it is needed.<br>
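<br>
The bottom-up cost accounting half of this can be sketched on a toy IR. This<br>
is an assumption-laden illustration, not the code from the patch: Node,<br>
State, StateMap and addCostRecursively are hypothetical, nodes are referred<br>
to by index, and a std::map replaces the dense cache. The forward<br>
(top-down) pass is assumed to have already filled in IsFree for every<br>
reachable (node, iteration) pair.<br>

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy IR node.
struct Node {
  int Cost;                  // cost if the node survives unrolling
  bool IsHeaderPhi;          // PHI in the loop header
  std::vector<int> Operands; // indices of in-loop operands
  int BackedgeValue;         // header PHIs only: latch incoming value, or -1
};

struct State {
  bool IsFree;
  bool IsCounted;
};
using StateMap = std::map<std::pair<int, int>, State>; // (node, iteration)

// Marks everything feeding (Root, Iteration) as live, walking backwards
// through earlier iterations via header PHIs. Returns the cost newly added.
int addCostRecursively(const std::vector<Node> &Nodes, StateMap &States,
                       int Root, int Iteration) {
  int Added = 0;
  std::vector<int> Worklist{Root}, PhiUsed;
  for (;; --Iteration) {
    while (!Worklist.empty()) {
      int N = Worklist.back();
      Worklist.pop_back();
      auto It = States.find({N, Iteration});
      // No entry means the node sat on a dead path, so it is free.
      if (It == States.end() || It->second.IsCounted)
        continue;
      It->second.IsCounted = true;
      if (Nodes[N].IsHeaderPhi) {
        // Header PHIs cost nothing; defer their backedge input to the
        // previous iteration.
        if (Iteration > 0 && Nodes[N].BackedgeValue >= 0)
          PhiUsed.push_back(Nodes[N].BackedgeValue);
        continue;
      }
      if (!It->second.IsFree)
        Added += Nodes[N].Cost;
      for (int Op : Nodes[N].Operands)
        Worklist.push_back(Op);
    }
    if (PhiUsed.empty())
      break;
    Worklist.swap(PhiUsed); // PhiUsed is now empty; step back one iteration
  }
  return Added;
}
```

Calling it twice with the same root adds nothing the second time, which is<br>
the de-duplication property the cache is meant to provide.<br>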
<br>
Finally, a core property of this approach is its *monotonicity*. At all<br>
times, the current UnrolledCost is a conservatively low estimate. This<br>
ensures that we will never early-exit from the analysis due to exceeding<br>
a threshold when, had we continued, the cost would have gone back below<br>
it. These kinds of bugs can cause random changes in behavior that are<br>
incredibly hard to track down.<br>
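<br>
To make the monotonicity argument concrete, here is a small sketch under the<br>
assumption that costs are only ever added (never subtracted) as instructions<br>
become live. The function name accumulateLiveCost and the -1 sentinel are<br>
hypothetical. Since the running total never decreases, bailing out the moment<br>
it crosses the threshold cannot reject a loop that would have ended up below<br>
it.<br>

```cpp
#include <cassert>
#include <vector>

// LiveCosts holds the (non-negative) costs of instructions in the order the
// analysis discovers that they are live. Returns the total, or -1 as soon as
// the running total exceeds Threshold.
int accumulateLiveCost(const std::vector<unsigned> &LiveCosts, int Threshold) {
  int Total = 0;
  for (unsigned C : LiveCosts) {
    Total += static_cast<int>(C);
    // Safe early exit: Total can only grow from here.
    if (Total > Threshold)
      return -1;
  }
  return Total;
}
```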
<br>
We could use a similar (but much simpler) technique within the inliner<br>
as well to avoid considering speculated code in the inline cost.<br>
<br>
Reviewers: chandlerc<br>
<br>
Subscribers: sanjoy, mzolotukhin, llvm-commits<br>
<br>
Differential Revision: <a href="http://reviews.llvm.org/D11758" rel="noreferrer" target="_blank">http://reviews.llvm.org/D11758</a><br>
<br>
Added:<br>
llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll<br>
Modified:<br>
llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h<br>
llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp<br>
llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>
llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll<br>
llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll<br>
llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp<br>
<br>
Modified: llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h (original)<br>
+++ llvm/trunk/include/llvm/Analysis/LoopUnrollAnalyzer.h Thu May 12 20:42:39 2016<br>
@@ -89,6 +89,7 @@ private:<br>
bool visitLoad(LoadInst &I);<br>
bool visitCastInst(CastInst &I);<br>
bool visitCmpInst(CmpInst &I);<br>
+ bool visitPHINode(PHINode &PN);<br>
};<br>
}<br>
#endif<br>
<br>
Modified: llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp (original)<br>
+++ llvm/trunk/lib/Analysis/LoopUnrollAnalyzer.cpp Thu May 12 20:42:39 2016<br>
@@ -189,3 +189,13 @@ bool UnrolledInstAnalyzer::visitCmpInst(<br>
<br>
return Base::visitCmpInst(I);<br>
}<br>
+<br>
+bool UnrolledInstAnalyzer::visitPHINode(PHINode &PN) {<br>
+ // Run base visitor first. This way we can gather some useful for later<br>
+ // analysis information.<br>
+ if (Base::visitPHINode(PN))<br>
+ return true;<br>
+<br>
+ // The loop induction PHI nodes are definitionally free.<br>
+ return PN.getParent() == L->getHeader();<br>
+}<br>
<br>
Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Thu May 12 20:42:39 2016<br>
@@ -185,6 +185,40 @@ static TargetTransformInfo::UnrollingPre<br>
}<br>
<br>
namespace {<br>
+/// A struct to densely store the state of an instruction after unrolling at<br>
+/// each iteration.<br>
+///<br>
+/// This is designed to work like a tuple of <Instruction *, int> for the<br>
+/// purposes of hashing and lookup, but to be able to associate two boolean<br>
+/// states with each key.<br>
+struct UnrolledInstState {<br>
+ Instruction *I;<br>
+ int Iteration : 30;<br>
+ unsigned IsFree : 1;<br>
+ unsigned IsCounted : 1;<br>
+};<br>
+<br>
+/// Hashing and equality testing for a set of the instruction states.<br>
+struct UnrolledInstStateKeyInfo {<br>
+ typedef DenseMapInfo<Instruction *> PtrInfo;<br>
+ typedef DenseMapInfo<std::pair<Instruction *, int>> PairInfo;<br>
+ static inline UnrolledInstState getEmptyKey() {<br>
+ return {PtrInfo::getEmptyKey(), 0, 0, 0};<br>
+ }<br>
+ static inline UnrolledInstState getTombstoneKey() {<br>
+ return {PtrInfo::getTombstoneKey(), 0, 0, 0};<br>
+ }<br>
+ static inline unsigned getHashValue(const UnrolledInstState &S) {<br>
+ return PairInfo::getHashValue({S.I, S.Iteration});<br>
+ }<br>
+ static inline bool isEqual(const UnrolledInstState &LHS,<br>
+ const UnrolledInstState &RHS) {<br>
+ return PairInfo::isEqual({LHS.I, LHS.Iteration}, {RHS.I, RHS.Iteration});<br>
+ }<br>
+};<br>
+}<br>
+<br>
+namespace {<br>
struct EstimatedUnrollCost {<br>
/// \brief The estimated cost after unrolling.<br>
int UnrolledCost;<br>
@@ -218,18 +252,25 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
assert(UnrollMaxIterationsCountToAnalyze < (INT_MAX / 2) &&<br>
"The unroll iterations max is too large!");<br>
<br>
+ // Only analyze inner loops. We can't properly estimate cost of nested loops<br>
+ // and we won't visit inner loops again anyway.<br>
+ if (!L->empty())<br>
+ return None;<br>
+<br>
// Don't simulate loops with a big or unknown tripcount<br>
if (!UnrollMaxIterationsCountToAnalyze || !TripCount ||<br>
TripCount > UnrollMaxIterationsCountToAnalyze)<br>
return None;<br>
<br>
SmallSetVector<BasicBlock *, 16> BBWorklist;<br>
+ SmallSetVector<std::pair<BasicBlock *, BasicBlock *>, 4> ExitWorklist;<br>
DenseMap<Value *, Constant *> SimplifiedValues;<br>
SmallVector<std::pair<Value *, Constant *>, 4> SimplifiedInputValues;<br>
<br>
// The estimated cost of the unrolled form of the loop. We try to estimate<br>
// this by simplifying as much as we can while computing the estimate.<br>
int UnrolledCost = 0;<br>
+<br>
// We also track the estimated dynamic (that is, actually executed) cost in<br>
// the rolled form. This helps identify cases when the savings from unrolling<br>
// aren't just exposing dead control flows, but actual reduced dynamic<br>
@@ -237,6 +278,97 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
// unrolling.<br>
int RolledDynamicCost = 0;<br>
<br>
+ // We track the simplification of each instruction in each iteration. We use<br>
+ // this to recursively merge costs into the unrolled cost on-demand so that<br>
+ // we don't count the cost of any dead code. This is essentially a map from<br>
+ // <instruction, int> to <bool, bool>, but stored as a densely packed struct.<br>
+ DenseSet<UnrolledInstState, UnrolledInstStateKeyInfo> InstCostMap;<br>
+<br>
+ // A small worklist used to accumulate cost of instructions from each<br>
+ // observable and reached root in the loop.<br>
+ SmallVector<Instruction *, 16> CostWorklist;<br>
+<br>
+ // PHI-used worklist used between iterations while accumulating cost.<br>
+ SmallVector<Instruction *, 4> PHIUsedList;<br>
+<br>
+ // Helper function to accumulate cost for instructions in the loop.<br>
+ auto AddCostRecursively = [&](Instruction &RootI, int Iteration) {<br>
+ assert(Iteration >= 0 && "Cannot have a negative iteration!");<br>
+ assert(CostWorklist.empty() && "Must start with an empty cost list");<br>
+ assert(PHIUsedList.empty() && "Must start with an empty phi used list");<br>
+ CostWorklist.push_back(&RootI);<br>
+ for (;; --Iteration) {<br>
+ do {<br>
+ Instruction *I = CostWorklist.pop_back_val();<br>
+<br>
+ // InstCostMap only uses I and Iteration as a key, the other two values<br>
+ // don't matter here.<br>
+ auto CostIter = InstCostMap.find({I, Iteration, 0, 0});<br>
+ if (CostIter == InstCostMap.end())<br>
+ // If an input to a PHI node comes from a dead path through the loop<br>
+ // we may have no cost data for it here. What that actually means is<br>
+ // that it is free.<br>
+ continue;<br>
+ auto &Cost = *CostIter;<br>
+ if (Cost.IsCounted)<br>
+ // Already counted this instruction.<br>
+ continue;<br>
+<br>
+ // Mark that we are counting the cost of this instruction now.<br>
+ Cost.IsCounted = true;<br>
+<br>
+ // If this is a PHI node in the loop header, just add it to the PHI set.<br>
+ if (auto *PhiI = dyn_cast<PHINode>(I))<br>
+ if (PhiI->getParent() == L->getHeader()) {<br>
+ assert(Cost.IsFree && "Loop PHIs shouldn't be evaluated as they "<br>
+ "inherently simplify during unrolling.");<br></blockquote><div><br></div><div>This assertion seems to be failing:</div><div><a href="http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/5255/steps/test/logs/stdio">http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/5255/steps/test/logs/stdio</a><br></div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
+ if (Iteration == 0)<br>
+ continue;<br>
+<br>
+ // Push the incoming value from the backedge into the PHI used list<br>
+ // if it is an in-loop instruction. We'll use this to populate the<br>
+ // cost worklist for the next iteration (as we count backwards).<br>
+ if (auto *OpI = dyn_cast<Instruction>(<br>
+ PhiI->getIncomingValueForBlock(L->getLoopLatch())))<br>
+ if (L->contains(OpI))<br>
+ PHIUsedList.push_back(OpI);<br>
+ continue;<br>
+ }<br>
+<br>
+ // First accumulate the cost of this instruction.<br>
+ if (!Cost.IsFree) {<br>
+ UnrolledCost += TTI.getUserCost(I);<br>
+ DEBUG(dbgs() << "Adding cost of instruction (iteration " << Iteration<br>
+ << "): ");<br>
+ DEBUG(I->dump());<br>
+ }<br>
+<br>
+ // We must count the cost of every operand which is not free,<br>
+ // recursively. If we reach a loop PHI node, simply add it to the set<br>
+ // to be considered on the next iteration (backwards!).<br>
+ for (Value *Op : I->operands()) {<br>
+ // Check whether this operand is free due to being a constant or<br>
+ // outside the loop.<br>
+ auto *OpI = dyn_cast<Instruction>(Op);<br>
+ if (!OpI || !L->contains(OpI))<br>
+ continue;<br>
+<br>
+ // Otherwise accumulate its cost.<br>
+ CostWorklist.push_back(OpI);<br>
+ }<br>
+ } while (!CostWorklist.empty());<br>
+<br>
+ if (PHIUsedList.empty())<br>
+ // We've exhausted the search.<br>
+ break;<br>
+<br>
+ assert(Iteration > 0 &&<br>
+ "Cannot track PHI-used values past the first iteration!");<br>
+ CostWorklist.append(PHIUsedList.begin(), PHIUsedList.end());<br>
+ PHIUsedList.clear();<br>
+ }<br>
+ };<br>
+<br>
// Ensure that we don't violate the loop structure invariants relied on by<br>
// this analysis.<br>
assert(L->isLoopSimplifyForm() && "Must put loop into normal form first.");<br>
@@ -291,22 +423,32 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
// it. We don't change the actual IR, just count optimization<br>
// opportunities.<br>
for (Instruction &I : *BB) {<br>
- int InstCost = TTI.getUserCost(&I);<br>
+ // Track this instruction's expected baseline cost when executing the<br>
+ // rolled loop form.<br>
+ RolledDynamicCost += TTI.getUserCost(&I);<br>
<br>
// Visit the instruction to analyze its loop cost after unrolling,<br>
- // and if the visitor returns false, include this instruction in the<br>
- // unrolled cost.<br>
- if (!Analyzer.visit(I))<br>
- UnrolledCost += InstCost;<br>
- else {<br>
- DEBUG(dbgs() << " " << I<br>
- << " would be simplified if loop is unrolled.\n");<br>
- (void)0;<br>
- }<br>
+ // and if the visitor returns true, mark the instruction as free after<br>
+ // unrolling and continue.<br>
+ bool IsFree = Analyzer.visit(I);<br>
+ bool Inserted = InstCostMap.insert({&I, (int)Iteration, IsFree,<br>
+ /*IsCounted*/ false})<br>
+ .second;<br>
+ (void)Inserted;<br>
+ assert(Inserted && "Cannot have a state for an unvisited instruction!");<br>
<br>
- // Also track this instructions expected cost when executing the rolled<br>
- // loop form.<br>
- RolledDynamicCost += InstCost;<br>
+ if (IsFree)<br>
+ continue;<br>
+<br>
+ // If the instruction might have a side-effect recursively account for<br>
+ // the cost of it and all the instructions leading up to it.<br>
+ if (I.mayHaveSideEffects())<br>
+ AddCostRecursively(I, Iteration);<br>
+<br>
+ // Can't properly model a cost of a call.<br>
+ // FIXME: With a proper cost model we should be able to do it.<br>
+ if(isa<CallInst>(&I))<br>
+ return None;<br>
<br>
// If unrolled body turns out to be too big, bail out.<br>
if (UnrolledCost > MaxUnrolledLoopSize) {<br>
@@ -335,6 +477,8 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
cast<ConstantInt>(SimpleCond)->isZero() ? 1 : 0);<br>
if (L->contains(Succ))<br>
BBWorklist.insert(Succ);<br>
+ else<br>
+ ExitWorklist.insert({BB, Succ});<br>
continue;<br>
}<br>
}<br>
@@ -350,6 +494,8 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
.getCaseSuccessor();<br>
if (L->contains(Succ))<br>
BBWorklist.insert(Succ);<br>
+ else<br>
+ ExitWorklist.insert({BB, Succ});<br>
continue;<br>
}<br>
}<br>
@@ -358,6 +504,8 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
for (BasicBlock *Succ : successors(BB))<br>
if (L->contains(Succ))<br>
BBWorklist.insert(Succ);<br>
+ else<br>
+ ExitWorklist.insert({BB, Succ});<br>
}<br>
<br>
// If we found no optimization opportunities on the first iteration, we<br>
@@ -368,6 +516,23 @@ analyzeLoopUnrollCost(const Loop *L, uns<br>
return None;<br>
}<br>
}<br>
+<br>
+ while (!ExitWorklist.empty()) {<br>
+ BasicBlock *ExitingBB, *ExitBB;<br>
+ std::tie(ExitingBB, ExitBB) = ExitWorklist.pop_back_val();<br>
+<br>
+ for (Instruction &I : *ExitBB) {<br>
+ auto *PN = dyn_cast<PHINode>(&I);<br>
+ if (!PN)<br>
+ break;<br>
+<br>
+ Value *Op = PN->getIncomingValueForBlock(ExitingBB);<br>
+ if (auto *OpI = dyn_cast<Instruction>(Op))<br>
+ if (L->contains(OpI))<br>
+ AddCostRecursively(*OpI, TripCount - 1);<br>
+ }<br>
+ }<br>
+<br>
DEBUG(dbgs() << "Analysis finished:\n"<br>
<< "UnrolledCost: " << UnrolledCost << ", "<br>
<< "RolledDynamicCost: " << RolledDynamicCost << "\n");<br>
<br>
Modified: llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll (original)<br>
+++ llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-2.ll Thu May 12 20:42:39 2016<br>
@@ -1,4 +1,4 @@<br>
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=50 -unroll-dynamic-cost-savings-discount=90 | FileCheck %s<br>
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=70 -unroll-dynamic-cost-savings-discount=90 | FileCheck %s<br>
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"<br>
<br>
@unknown_global = internal unnamed_addr global [9 x i32] [i32 0, i32 -1, i32 0, i32 -1, i32 5, i32 -1, i32 0, i32 -1, i32 0], align 16<br>
<br>
Added: llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll?rev=269388&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll?rev=269388&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll (added)<br>
+++ llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-dce.ll Thu May 12 20:42:39 2016<br>
@@ -0,0 +1,38 @@<br>
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=100 -unroll-dynamic-cost-savings-discount=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=60 | FileCheck %s<br>
+target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"<br>
+<br>
+@known_constant = internal unnamed_addr constant [10 x i32] [i32 0, i32 0, i32 0, i32 0, i32 1, i32 0, i32 0, i32 0, i32 0, i32 0], align 16<br>
+<br>
+; If a load becomes a constant after loop unrolling, we sometimes can simplify<br>
+; CFG. This test verifies that we handle such cases.<br>
+; After one operand in an instruction is constant-folded and the<br>
+; instruction is simplified, the other operand might become dead.<br>
+; In this test we have::<br>
+; for i in 1..10:<br>
+; r += A[i] * B[i]<br>
+; A[i] is 0 almost at every iteration, so there is no need in loading B[i] at<br>
+; all.<br>
+<br>
+<br>
+; CHECK-LABEL: @unroll_dce<br>
+; CHECK-NOT: br i1 %exitcond, label %for.end, label %for.body<br>
+define i32 @unroll_dce(i32* noalias nocapture readonly %b) {<br>
+entry:<br>
+ br label %for.body<br>
+<br>
+for.body: ; preds = %for.body, %entry<br>
+ %iv.0 = phi i64 [ 0, %entry ], [ %iv.1, %for.body ]<br>
+ %r.0 = phi i32 [ 0, %entry ], [ %r.1, %for.body ]<br>
+ %arrayidx1 = getelementptr inbounds [10 x i32], [10 x i32]* @known_constant, i64 0, i64 %iv.0<br>
+ %x1 = load i32, i32* %arrayidx1, align 4<br>
+ %arrayidx2 = getelementptr inbounds i32, i32* %b, i64 %iv.0<br>
+ %x2 = load i32, i32* %arrayidx2, align 4<br>
+ %mul = mul i32 %x1, %x2<br>
+ %r.1 = add i32 %mul, %r.0<br>
+ %iv.1 = add nuw nsw i64 %iv.0, 1<br>
+ %exitcond = icmp eq i64 %iv.1, 10<br>
+ br i1 %exitcond, label %for.end, label %for.body<br>
+<br>
+for.end: ; preds = %for.body<br>
+ ret i32 %r.1<br>
+}<br>
<br>
Modified: llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll (original)<br>
+++ llvm/trunk/test/Transforms/LoopUnroll/full-unroll-heuristics-geps.ll Thu May 12 20:42:39 2016<br>
@@ -1,4 +1,4 @@<br>
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=100 -unroll-dynamic-cost-savings-discount=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=40 | FileCheck %s<br>
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=100 -unroll-dynamic-cost-savings-discount=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=60 | FileCheck %s<br>
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"<br>
<br>
; When examining gep-instructions we shouldn't consider them simplified if the<br>
<br>
Modified: llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp?rev=269388&r1=269387&r2=269388&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp?rev=269388&r1=269387&r2=269388&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp (original)<br>
+++ llvm/trunk/unittests/Analysis/UnrollAnalyzer.cpp Thu May 12 20:42:39 2016<br>
@@ -134,6 +134,7 @@ TEST(UnrollAnalyzerTest, OuterLoopSimpli<br>
" br label %outer.loop\n"<br>
"outer.loop:\n"<br>
" %iv.outer = phi i64 [ 0, %entry ], [ %iv.outer.next, %outer.loop.latch ]\n"<br>
+ " %iv.outer.next = add nuw nsw i64 %iv.outer, 1\n"<br>
" br label %inner.loop\n"<br>
"inner.loop:\n"<br>
" %iv.inner = phi i64 [ 0, %outer.loop ], [ %iv.inner.next, %inner.loop ]\n"<br>
@@ -141,7 +142,6 @@ TEST(UnrollAnalyzerTest, OuterLoopSimpli<br>
" %exitcond.inner = icmp eq i64 %iv.inner.next, 1000\n"<br>
" br i1 %exitcond.inner, label %outer.loop.latch, label %inner.loop\n"<br>
"outer.loop.latch:\n"<br>
- " %iv.outer.next = add nuw nsw i64 %iv.outer, 1\n"<br>
" %exitcond.outer = icmp eq i64 %iv.outer.next, 40\n"<br>
" br i1 %exitcond.outer, label %exit, label %outer.loop\n"<br>
"exit:\n"<br>
@@ -163,11 +163,15 @@ TEST(UnrollAnalyzerTest, OuterLoopSimpli<br>
BasicBlock *InnerBody = &*FI++;<br>
<br>
BasicBlock::iterator BBI = Header->begin();<br>
- Instruction *Y1 = &*BBI++;<br>
+ BBI++;<br>
+ Instruction *Y1 = &*BBI;<br>
BBI = InnerBody->begin();<br>
- Instruction *Y2 = &*BBI++;<br>
+ BBI++;<br>
+ Instruction *Y2 = &*BBI;<br>
// Check that we can simplify IV of the outer loop, but can't simplify the IV<br>
// of the inner loop if we only know the iteration number of the outer loop.<br>
+ //<br>
+ // Y1 is %iv.outer.next, Y2 is %iv.inner.next<br>
auto I1 = SimplifiedValuesVector[0].find(Y1);<br>
EXPECT_TRUE(I1 != SimplifiedValuesVector[0].end());<br>
auto I2 = SimplifiedValuesVector[0].find(Y2);<br>
<br>
<br>
</blockquote></div><br></div></div>