<div dir="ltr">Hi Than,<div><br></div><div>I started getting failures running llvm-tblgen yesterday in a ThinLTO bootstrap of clang. I just bisected it to this commit. </div><div><br></div><div>I'll work on packaging up a reproducer (a ThinLTO bootstrap using gold requires D20559 to fix an issue handling archives, which I should be submitting soon since it just got approved). I had earlier today tracked this down to happening when we import a particular function, presumably exposed with the expanded optimization scope after the importing.</div><div><br></div><div>I wanted to give you a heads up though in case any other issues have been reported.</div><div><br></div><div>Thanks</div><div>Teresa</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 24, 2016 at 6:23 AM, Than McIntosh via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: thanm<br>
Date: Tue May 24 08:23:44 2016<br>
New Revision: 270559<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=270559&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=270559&view=rev</a><br>
Log:<br>
Rework/enhance stack coloring data flow analysis.<br>
<br>
Replace bidirectional flow analysis to compute liveness with forward<br>
analysis pass. Treat lifetimes as starting when there is a first<br>
reference to the stack slot, as opposed to starting at the point of the<br>
lifetime.start intrinsic, so as to increase the number of stack<br>
variables we can overlap.<br>
<br>
Reviewers: gbiv, qcolumbet, wmi<br>
Differential Revision: <a href="http://reviews.llvm.org/D18827" rel="noreferrer" target="_blank">http://reviews.llvm.org/D18827</a><br>
<br>
Bug: 25776<br>
<br>
Modified:<br>
llvm/trunk/lib/CodeGen/StackColoring.cpp<br>
llvm/trunk/test/CodeGen/X86/StackColoring.ll<br>
llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll<br>
<br>
Modified: llvm/trunk/lib/CodeGen/StackColoring.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/StackColoring.cpp?rev=270559&r1=270558&r2=270559&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/StackColoring.cpp?rev=270559&r1=270558&r2=270559&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/StackColoring.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/StackColoring.cpp Tue May 24 08:23:44 2016<br>
@@ -64,18 +64,180 @@ DisableColoring("no-stack-coloring",<br>
/// The user may write code that uses allocas outside of the declared lifetime<br>
/// zone. This can happen when the user returns a reference to a local<br>
/// data-structure. We can detect these cases and decide not to optimize the<br>
-/// code. If this flag is enabled, we try to save the user.<br>
+/// code. If this flag is enabled, we try to save the user. This option<br>
+/// is treated as overriding LifetimeStartOnFirstUse below.<br>
static cl::opt<bool><br>
ProtectFromEscapedAllocas("protect-from-escaped-allocas",<br>
cl::init(false), cl::Hidden,<br>
cl::desc("Do not optimize lifetime zones that "<br>
"are broken"));<br>
<br>
+/// Enable enhanced dataflow scheme for lifetime analysis (treat first<br>
+/// use of stack slot as start of slot lifetime, as opposed to looking<br>
+/// for LIFETIME_START marker). See "Implementation notes" below for<br>
+/// more info.<br>
+static cl::opt<bool><br>
+LifetimeStartOnFirstUse("stackcoloring-lifetime-start-on-first-use",<br>
+ cl::init(true), cl::Hidden,<br>
+ cl::desc("Treat stack lifetimes as starting on first use, not on START marker."));<br>
+<br>
+<br>
STATISTIC(NumMarkerSeen, "Number of lifetime markers found.");<br>
STATISTIC(StackSpaceSaved, "Number of bytes saved due to merging slots.");<br>
STATISTIC(StackSlotMerged, "Number of stack slot merged.");<br>
STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime region");<br>
<br>
+//<br>
+// Implementation Notes:<br>
+// ---------------------<br>
+//<br>
+// Consider the following motivating example:<br>
+//<br>
+// int foo() {<br>
+// char b1[1024], b2[1024];<br>
+// if (...) {<br>
+// char b3[1024];<br>
+// <uses of b1, b3>;<br>
+// return x;<br>
+// } else {<br>
+// char b4[1024], b5[1024];<br>
+// <uses of b2, b4, b5>;<br>
+// return y;<br>
+// }<br>
+// }<br>
+//<br>
+// In the code above, "b3" and "b4" are declared in distinct lexical<br>
+// scopes, meaning that it is easy to prove that they can share the<br>
+// same stack slot. Variables "b1" and "b2" are declared in the same<br>
+// scope, meaning that from a lexical point of view, their lifetimes<br>
+// overlap. From a control flow pointer of view, however, the two<br>
+// variables are accessed in disjoint regions of the CFG, thus it<br>
+// should be possible for them to share the same stack slot. An ideal<br>
+// stack allocation for the function above would look like:<br>
+//<br>
+// slot 0: b1, b2<br>
+// slot 1: b3, b4<br>
+// slot 2: b5<br>
+//<br>
+// Achieving this allocation is tricky, however, due to the way<br>
+// lifetime markers are inserted. Here is a simplified view of the<br>
+// control flow graph for the code above:<br>
+//<br>
+// +------ block 0 -------+<br>
+// 0| LIFETIME_START b1, b2 |<br>
+// 1| <test 'if' condition> |<br>
+// +-----------------------+<br>
+// ./ \.<br>
+// +------ block 1 -------+ +------ block 2 -------+<br>
+// 2| LIFETIME_START b3 | 5| LIFETIME_START b4, b5 |<br>
+// 3| <uses of b1, b3> | 6| <uses of b2, b4, b5> |<br>
+// 4| LIFETIME_END b3 | 7| LIFETIME_END b4, b5 |<br>
+// +-----------------------+ +-----------------------+<br>
+// \. /.<br>
+// +------ block 3 -------+<br>
+// 8| <cleanupcode> |<br>
+// 9| LIFETIME_END b1, b2 |<br>
+// 10| return |<br>
+// +-----------------------+<br>
+//<br>
+// If we create live intervals for the variables above strictly based<br>
+// on the lifetime markers, we'll get the set of intervals on the<br>
+// left. If we ignore the lifetime start markers and instead treat a<br>
+// variable's lifetime as beginning with the first reference to the<br>
+// var, then we get the intervals on the right.<br>
+//<br>
+// LIFETIME_START First Use<br>
+// b1: [0,9] [3,4] [8,9]<br>
+// b2: [0,9] [6,9]<br>
+// b3: [2,4] [3,4]<br>
+// b4: [5,7] [6,7]<br>
+// b5: [5,7] [6,7]<br>
+//<br>
+// For the intervals on the left, the best we can do is overlap two<br>
+// variables (b3 and b4, for example); this gives us a stack size of<br>
+// 4*1024 bytes, not ideal. When treating first-use as the start of a<br>
+// lifetime, we can additionally overlap b1 and b5, giving us a 3*1024<br>
+// byte stack (better).<br>
+//<br>
+// Relying entirely on first-use of stack slots is problematic,<br>
+// however, due to the fact that optimizations can sometimes migrate<br>
+// uses of a variable outside of its lifetime start/end region. Here<br>
+// is an example:<br>
+//<br>
+// int bar() {<br>
+// char b1[1024], b2[1024];<br>
+// if (...) {<br>
+// <uses of b2><br>
+// return y;<br>
+// } else {<br>
+// <uses of b1><br>
+// while (...) {<br>
+// char b3[1024];<br>
+// <uses of b3><br>
+// }<br>
+// }<br>
+// }<br>
+//<br>
+// Before optimization, the control flow graph for the code above<br>
+// might look like the following:<br>
+//<br>
+// +------ block 0 -------+<br>
+// 0| LIFETIME_START b1, b2 |<br>
+// 1| <test 'if' condition> |<br>
+// +-----------------------+<br>
+// ./ \.<br>
+// +------ block 1 -------+ +------- block 2 -------+<br>
+// 2| <uses of b2> | 3| <uses of b1> |<br>
+// +-----------------------+ +-----------------------+<br>
+// | |<br>
+// | +------- block 3 -------+ <-\.<br>
+// | 4| <while condition> | |<br>
+// | +-----------------------+ |<br>
+// | / | |<br>
+// | / +------- block 4 -------+<br>
+// \ / 5| LIFETIME_START b3 | |<br>
+// \ / 6| <uses of b3> | |<br>
+// \ / 7| LIFETIME_END b3 | |<br>
+// \ | +------------------------+ |<br>
+// \ | \ /<br>
+// +------ block 5 -----+ \---------------<br>
+// 8| <cleanupcode> |<br>
+// 9| LIFETIME_END b1, b2 |<br>
+// 10| return |<br>
+// +---------------------+<br>
+//<br>
+// During optimization, however, it can happen that an instruction<br>
+// computing an address in "b3" (for example, a loop-invariant GEP) is<br>
+// hoisted up out of the loop from block 4 to block 2. [Note that<br>
+// this is not an actual load from the stack, only an instruction that<br>
+// computes the address to be loaded]. If this happens, there is now a<br>
+// path leading from the first use of b3 to the return instruction<br>
+// that does not encounter the b3 LIFETIME_END, hence b3's lifetime is<br>
+// now larger than if we were computing live intervals strictly based<br>
+// on lifetime markers. In the example above, this lengthened lifetime<br>
+// would mean that it would appear illegal to overlap b3 with b2.<br>
+//<br>
+// To deal with this such cases, the code in ::collectMarkers() below<br>
+// tries to identify "degenerate" slots -- those slots where on a single<br>
+// forward pass through the CFG we encounter a first reference to slot<br>
+// K before we hit the slot K lifetime start marker. For such slots,<br>
+// we fall back on using the lifetime start marker as the beginning of<br>
+// the variable's lifetime. NB: with this implementation, slots can<br>
+// appear degenerate in cases where there is unstructured control flow:<br>
+//<br>
+// if (q) goto mid;<br>
+// if (x > 9) {<br>
+// int b[100];<br>
+// memcpy(&b[0], ...);<br>
+// mid: b[k] = ...;<br>
+// abc(&b);<br>
+// }<br>
+//<br>
+// If in RPO ordering chosen to walk the CFG we happen to visit the b[k]<br>
+// before visiting the memcpy block (which will contain the lifetime start<br>
+// for "b" then it will appear that 'b' has a degenerate lifetime.<br>
+//<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// StackColoring Pass<br>
//===----------------------------------------------------------------------===//<br>
@@ -123,6 +285,17 @@ class StackColoring : public MachineFunc<br>
/// once the coloring is done.<br>
SmallVector<MachineInstr*, 8> Markers;<br>
<br>
+ /// Record the FI slots for which we have seen some sort of<br>
+ /// lifetime marker (either start or end).<br>
+ BitVector InterestingSlots;<br>
+<br>
+ /// Degenerate slots -- first use appears outside of start/end<br>
+ /// lifetime markers.<br>
+ BitVector DegenerateSlots;<br>
+<br>
+ /// Number of iterations taken during data flow analysis.<br>
+ unsigned NumIterations;<br>
+<br>
public:<br>
static char ID;<br>
StackColoring() : MachineFunctionPass(ID) {<br>
@@ -153,6 +326,25 @@ private:<br>
/// in and out blocks.<br>
void calculateLocalLiveness();<br>
<br>
+ /// Returns TRUE if we're using the first-use-begins-lifetime method for<br>
+ /// this slot (if FALSE, then the start marker is treated as start of lifetime).<br>
+ bool applyFirstUse(int Slot) {<br>
+ if (!LifetimeStartOnFirstUse || ProtectFromEscapedAllocas)<br>
+ return false;<br>
+ if (DegenerateSlots.test(Slot))<br>
+ return false;<br>
+ return true;<br>
+ }<br>
+<br>
+ /// Examines the specified instruction and returns TRUE if the instruction<br>
+ /// represents the start or end of an interesting lifetime. The slot or slots<br>
+ /// starting or ending are added to the vector "slots" and "isStart" is set<br>
+ /// accordingly.<br>
+ /// \returns True if inst contains a lifetime start or end<br>
+ bool isLifetimeStartOrEnd(const MachineInstr &MI,<br>
+ SmallVector<int, 4> &slots,<br>
+ bool &isStart);<br>
+<br>
/// Construct the LiveIntervals for the slots.<br>
void calculateLiveIntervals(unsigned NumSlots);<br>
<br>
@@ -170,7 +362,10 @@ private:<br>
<br>
/// Map entries which point to other entries to their destination.<br>
/// A->B->C becomes A->C.<br>
- void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);<br>
+ void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);<br>
+<br>
+ /// Used in collectMarkers<br>
+ typedef DenseMap<const MachineBasicBlock*, BitVector> BlockBitVecMap;<br>
};<br>
} // end anonymous namespace<br>
<br>
@@ -228,9 +423,140 @@ LLVM_DUMP_METHOD void StackColoring::dum<br>
<br>
#endif // not NDEBUG<br>
<br>
-unsigned StackColoring::collectMarkers(unsigned NumSlot) {<br>
+static inline int getStartOrEndSlot(const MachineInstr &MI)<br>
+{<br>
+ assert((MI.getOpcode() == TargetOpcode::LIFETIME_START ||<br>
+ MI.getOpcode() == TargetOpcode::LIFETIME_END) &&<br>
+ "Expected LIFETIME_START or LIFETIME_END op");<br>
+ const MachineOperand &MO = MI.getOperand(0);<br>
+ int Slot = MO.getIndex();<br>
+ if (Slot >= 0)<br>
+ return Slot;<br>
+ return -1;<br>
+}<br>
+<br>
+//<br>
+// At the moment the only way to end a variable lifetime is with<br>
+// a VARIABLE_LIFETIME op (which can't contain a start). If things<br>
+// change and the IR allows for a single inst that both begins<br>
+// and ends lifetime(s), this interface will need to be reworked.<br>
+//<br>
+bool StackColoring::isLifetimeStartOrEnd(const MachineInstr &MI,<br>
+ SmallVector<int, 4> &slots,<br>
+ bool &isStart)<br>
+{<br>
+ if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||<br>
+ MI.getOpcode() == TargetOpcode::LIFETIME_END) {<br>
+ int Slot = getStartOrEndSlot(MI);<br>
+ if (Slot < 0)<br>
+ return false;<br>
+ if (!InterestingSlots.test(Slot))<br>
+ return false;<br>
+ slots.push_back(Slot);<br>
+ if (MI.getOpcode() == TargetOpcode::LIFETIME_END) {<br>
+ isStart = false;<br>
+ return true;<br>
+ }<br>
+ if (! applyFirstUse(Slot)) {<br>
+ isStart = true;<br>
+ return true;<br>
+ }<br>
+ } else if (LifetimeStartOnFirstUse && !ProtectFromEscapedAllocas) {<br>
+ if (! MI.isDebugValue()) {<br>
+ bool found = false;<br>
+ for (const MachineOperand &MO : MI.operands()) {<br>
+ if (!MO.isFI())<br>
+ continue;<br>
+ int Slot = MO.getIndex();<br>
+ if (Slot<0)<br>
+ continue;<br>
+ if (InterestingSlots.test(Slot) && applyFirstUse(Slot)) {<br>
+ slots.push_back(Slot);<br>
+ found = true;<br>
+ }<br>
+ }<br>
+ if (found) {<br>
+ isStart = true;<br>
+ return true;<br>
+ }<br>
+ }<br>
+ }<br>
+ return false;<br>
+}<br>
+<br>
+unsigned StackColoring::collectMarkers(unsigned NumSlot)<br>
+{<br>
unsigned MarkersFound = 0;<br>
- // Scan the function to find all lifetime markers.<br>
+ BlockBitVecMap SeenStartMap;<br>
+ InterestingSlots.clear();<br>
+ InterestingSlots.resize(NumSlot);<br>
+ DegenerateSlots.clear();<br>
+ DegenerateSlots.resize(NumSlot);<br>
+<br>
+ // Step 1: collect markers and populate the "InterestingSlots"<br>
+ // and "DegenerateSlots" sets.<br>
+ for (MachineBasicBlock *MBB : depth_first(MF)) {<br>
+<br>
+ // Compute the set of slots for which we've seen a START marker but have<br>
+ // not yet seen an END marker at this point in the walk (e.g. on entry<br>
+ // to this bb).<br>
+ BitVector BetweenStartEnd;<br>
+ BetweenStartEnd.resize(NumSlot);<br>
+ for (MachineBasicBlock::const_pred_iterator PI = MBB->pred_begin(),<br>
+ PE = MBB->pred_end(); PI != PE; ++PI) {<br>
+ BlockBitVecMap::const_iterator I = SeenStartMap.find(*PI);<br>
+ if (I != SeenStartMap.end()) {<br>
+ BetweenStartEnd |= I->second;<br>
+ }<br>
+ }<br>
+<br>
+ // Walk the instructions in the block to look for start/end ops.<br>
+ for (MachineInstr &MI : *MBB) {<br>
+ if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||<br>
+ MI.getOpcode() == TargetOpcode::LIFETIME_END) {<br>
+ int Slot = getStartOrEndSlot(MI);<br>
+ if (Slot < 0)<br>
+ continue;<br>
+ InterestingSlots.set(Slot);<br>
+ if (MI.getOpcode() == TargetOpcode::LIFETIME_START)<br>
+ BetweenStartEnd.set(Slot);<br>
+ else<br>
+ BetweenStartEnd.reset(Slot);<br>
+ const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);<br>
+ if (Allocation) {<br>
+ DEBUG(dbgs() << "Found a lifetime ");<br>
+ DEBUG(dbgs() << (MI.getOpcode() == TargetOpcode::LIFETIME_START<br>
+ ? "start"<br>
+ : "end"));<br>
+ DEBUG(dbgs() << " marker for slot #" << Slot);<br>
+ DEBUG(dbgs() << " with allocation: " << Allocation->getName()<br>
+ << "\n");<br>
+ }<br>
+ Markers.push_back(&MI);<br>
+ MarkersFound += 1;<br>
+ } else {<br>
+ for (const MachineOperand &MO : MI.operands()) {<br>
+ if (!MO.isFI())<br>
+ continue;<br>
+ int Slot = MO.getIndex();<br>
+ if (Slot < 0)<br>
+ continue;<br>
+ if (! BetweenStartEnd.test(Slot)) {<br>
+ DegenerateSlots.set(Slot);<br>
+ }<br>
+ }<br>
+ }<br>
+ }<br>
+ BitVector &SeenStart = SeenStartMap[MBB];<br>
+ SeenStart |= BetweenStartEnd;<br>
+ }<br>
+ if (!MarkersFound) {<br>
+ return 0;<br>
+ }<br>
+ DEBUG(dumpBV("Degenerate slots", DegenerateSlots));<br>
+<br>
+ // Step 2: compute begin/end sets for each block<br>
+<br>
// NOTE: We use a reverse-post-order iteration to ensure that we obtain a<br>
// deterministic numbering, and because we'll need a post-order iteration<br>
// later for solving the liveness dataflow problem.<br>
@@ -246,37 +572,33 @@ unsigned StackColoring::collectMarkers(u<br>
BlockInfo.Begin.resize(NumSlot);<br>
BlockInfo.End.resize(NumSlot);<br>
<br>
+ SmallVector<int, 4> slots;<br>
for (MachineInstr &MI : *MBB) {<br>
- if (MI.getOpcode() != TargetOpcode::LIFETIME_START &&<br>
- MI.getOpcode() != TargetOpcode::LIFETIME_END)<br>
- continue;<br>
-<br>
- bool IsStart = MI.getOpcode() == TargetOpcode::LIFETIME_START;<br>
- const MachineOperand &MO = MI.getOperand(0);<br>
- int Slot = MO.getIndex();<br>
- if (Slot < 0)<br>
- continue;<br>
-<br>
- Markers.push_back(&MI);<br>
-<br>
- MarkersFound++;<br>
-<br>
- const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);<br>
- if (Allocation) {<br>
- DEBUG(dbgs()<<"Found a lifetime marker for slot #"<<Slot<<<br>
- " with allocation: "<< Allocation->getName()<<"\n");<br>
- }<br>
-<br>
- if (IsStart) {<br>
- BlockInfo.Begin.set(Slot);<br>
- } else {<br>
- if (BlockInfo.Begin.test(Slot)) {<br>
- // Allocas that start and end within a single block are handled<br>
- // specially when computing the LiveIntervals to avoid pessimizing<br>
- // the liveness propagation.<br>
- BlockInfo.Begin.reset(Slot);<br>
- } else {<br>
+ bool isStart = false;<br>
+ slots.clear();<br>
+ if (isLifetimeStartOrEnd(MI, slots, isStart)) {<br>
+ if (!isStart) {<br>
+ assert(slots.size() == 1 && "unexpected: MI ends multiple slots");<br>
+ int Slot = slots[0];<br>
+ if (BlockInfo.Begin.test(Slot)) {<br>
+ BlockInfo.Begin.reset(Slot);<br>
+ }<br>
BlockInfo.End.set(Slot);<br>
+ } else {<br>
+ for (auto Slot : slots) {<br>
+ DEBUG(dbgs() << "Found a use of slot #" << Slot);<br>
+ DEBUG(dbgs() << " at BB#" << MBB->getNumber() << " index ");<br>
+ DEBUG(Indexes->getInstructionIndex(MI).print(dbgs()));<br>
+ const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);<br>
+ if (Allocation) {<br>
+ DEBUG(dbgs() << " with allocation: "<< Allocation->getName());<br>
+ }<br>
+ DEBUG(dbgs() << "\n");<br>
+ if (BlockInfo.End.test(Slot)) {<br>
+ BlockInfo.End.reset(Slot);<br>
+ }<br>
+ BlockInfo.Begin.set(Slot);<br>
+ }<br>
}<br>
}<br>
}<br>
@@ -287,90 +609,56 @@ unsigned StackColoring::collectMarkers(u<br>
return MarkersFound;<br>
}<br>
<br>
-void StackColoring::calculateLocalLiveness() {<br>
- // Perform a standard reverse dataflow computation to solve for<br>
- // global liveness. The BEGIN set here is equivalent to KILL in the standard<br>
- // formulation, and END is equivalent to GEN. The result of this computation<br>
- // is a map from blocks to bitvectors where the bitvectors represent which<br>
- // allocas are live in/out of that block.<br>
- SmallPtrSet<const MachineBasicBlock*, 8> BBSet(BasicBlockNumbering.begin(),<br>
- BasicBlockNumbering.end());<br>
- unsigned NumSSMIters = 0;<br>
+void StackColoring::calculateLocalLiveness()<br>
+{<br>
+ unsigned NumIters = 0;<br>
bool changed = true;<br>
while (changed) {<br>
changed = false;<br>
- ++NumSSMIters;<br>
-<br>
- SmallPtrSet<const MachineBasicBlock*, 8> NextBBSet;<br>
+ ++NumIters;<br>
<br>
for (const MachineBasicBlock *BB : BasicBlockNumbering) {<br>
- if (!BBSet.count(BB)) continue;<br>
<br>
// Use an iterator to avoid repeated lookups.<br>
LivenessMap::iterator BI = BlockLiveness.find(BB);<br>
assert(BI != BlockLiveness.end() && "Block not found");<br>
BlockLifetimeInfo &BlockInfo = BI->second;<br>
<br>
+ // Compute LiveIn by unioning together the LiveOut sets of all preds.<br>
BitVector LocalLiveIn;<br>
- BitVector LocalLiveOut;<br>
-<br>
- // Forward propagation from begins to ends.<br>
for (MachineBasicBlock::const_pred_iterator PI = BB->pred_begin(),<br>
PE = BB->pred_end(); PI != PE; ++PI) {<br>
LivenessMap::const_iterator I = BlockLiveness.find(*PI);<br>
assert(I != BlockLiveness.end() && "Predecessor not found");<br>
LocalLiveIn |= I->second.LiveOut;<br>
}<br>
- LocalLiveIn |= BlockInfo.End;<br>
- LocalLiveIn.reset(BlockInfo.Begin);<br>
-<br>
- // Reverse propagation from ends to begins.<br>
- for (MachineBasicBlock::const_succ_iterator SI = BB->succ_begin(),<br>
- SE = BB->succ_end(); SI != SE; ++SI) {<br>
- LivenessMap::const_iterator I = BlockLiveness.find(*SI);<br>
- assert(I != BlockLiveness.end() && "Successor not found");<br>
- LocalLiveOut |= I->second.LiveIn;<br>
- }<br>
- LocalLiveOut |= BlockInfo.Begin;<br>
- LocalLiveOut.reset(BlockInfo.End);<br>
-<br>
- LocalLiveIn |= LocalLiveOut;<br>
- LocalLiveOut |= LocalLiveIn;<br>
<br>
- // After adopting the live bits, we need to turn-off the bits which<br>
- // are de-activated in this block.<br>
+ // Compute LiveOut by subtracting out lifetimes that end in this<br>
+ // block, then adding in lifetimes that begin in this block. If<br>
+ // we have both BEGIN and END markers in the same basic block<br>
+ // then we know that the BEGIN marker comes after the END,<br>
+ // because we already handle the case where the BEGIN comes<br>
+ // before the END when collecting the markers (and building the<br>
+ // BEGIN/END vectors).<br>
+ BitVector LocalLiveOut = LocalLiveIn;<br>
LocalLiveOut.reset(BlockInfo.End);<br>
- LocalLiveIn.reset(BlockInfo.Begin);<br>
-<br>
- // If we have both BEGIN and END markers in the same basic block then<br>
- // we know that the BEGIN marker comes after the END, because we already<br>
- // handle the case where the BEGIN comes before the END when collecting<br>
- // the markers (and building the BEGIN/END vectore).<br>
- // Want to enable the LIVE_IN and LIVE_OUT of slots that have both<br>
- // BEGIN and END because it means that the value lives before and after<br>
- // this basic block.<br>
- BitVector LocalEndBegin = BlockInfo.End;<br>
- LocalEndBegin &= BlockInfo.Begin;<br>
- LocalLiveIn |= LocalEndBegin;<br>
- LocalLiveOut |= LocalEndBegin;<br>
+ LocalLiveOut |= BlockInfo.Begin;<br>
<br>
+ // Update block LiveIn set, noting whether it has changed.<br>
if (LocalLiveIn.test(BlockInfo.LiveIn)) {<br>
changed = true;<br>
BlockInfo.LiveIn |= LocalLiveIn;<br>
-<br>
- NextBBSet.insert(BB->pred_begin(), BB->pred_end());<br>
}<br>
<br>
+ // Update block LiveOut set, noting whether it has changed.<br>
if (LocalLiveOut.test(BlockInfo.LiveOut)) {<br>
changed = true;<br>
BlockInfo.LiveOut |= LocalLiveOut;<br>
-<br>
- NextBBSet.insert(BB->succ_begin(), BB->succ_end());<br>
}<br>
}<br>
-<br>
- BBSet = std::move(NextBBSet);<br>
}// while changed.<br>
+<br>
+ NumIterations = NumIters;<br>
}<br>
<br>
void StackColoring::calculateLiveIntervals(unsigned NumSlots) {<br>
@@ -385,29 +673,22 @@ void StackColoring::calculateLiveInterva<br>
Finishes.clear();<br>
Finishes.resize(NumSlots);<br>
<br>
- // Create the interval for the basic blocks with lifetime markers in them.<br>
- for (const MachineInstr *MI : Markers) {<br>
- if (MI->getParent() != &MBB)<br>
- continue;<br>
+ // Create the interval for the basic blocks containing lifetime begin/end.<br>
+ for (const MachineInstr &MI : MBB) {<br>
<br>
- assert((MI->getOpcode() == TargetOpcode::LIFETIME_START ||<br>
- MI->getOpcode() == TargetOpcode::LIFETIME_END) &&<br>
- "Invalid Lifetime marker");<br>
-<br>
- bool IsStart = MI->getOpcode() == TargetOpcode::LIFETIME_START;<br>
- const MachineOperand &Mo = MI->getOperand(0);<br>
- int Slot = Mo.getIndex();<br>
- if (Slot < 0)<br>
+ SmallVector<int, 4> slots;<br>
+ bool IsStart = false;<br>
+ if (!isLifetimeStartOrEnd(MI, slots, IsStart))<br>
continue;<br>
-<br>
- SlotIndex ThisIndex = Indexes->getInstructionIndex(*MI);<br>
-<br>
- if (IsStart) {<br>
- if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)<br>
- Starts[Slot] = ThisIndex;<br>
- } else {<br>
- if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)<br>
- Finishes[Slot] = ThisIndex;<br>
+ SlotIndex ThisIndex = Indexes->getInstructionIndex(MI);<br>
+ for (auto Slot : slots) {<br>
+ if (IsStart) {<br>
+ if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)<br>
+ Starts[Slot] = ThisIndex;<br>
+ } else {<br>
+ if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)<br>
+ Finishes[Slot] = ThisIndex;<br>
+ }<br>
}<br>
}<br>
<br>
@@ -423,7 +704,29 @@ void StackColoring::calculateLiveInterva<br>
}<br>
<br>
for (unsigned i = 0; i < NumSlots; ++i) {<br>
- assert(Starts[i].isValid() == Finishes[i].isValid() && "Unmatched range");<br>
+ //<br>
+ // When LifetimeStartOnFirstUse is turned on, data flow analysis<br>
+ // is forward (from starts to ends), not bidirectional. A<br>
+ // consequence of this is that we can wind up in situations<br>
+ // where Starts[i] is invalid but Finishes[i] is valid and vice<br>
+ // versa. Example:<br>
+ //<br>
+ // LIFETIME_START x<br>
+ // if (...) {<br>
+ // <use of x><br>
+ // throw ...;<br>
+ // }<br>
+ // LIFETIME_END x<br>
+ // return 2;<br>
+ //<br>
+ //<br>
+ // Here the slot for "x" will not be live into the block<br>
+ // containing the "return 2" (since lifetimes start with first<br>
+ // use, not at the dominating LIFETIME_START marker).<br>
+ //<br>
+ if (Starts[i].isValid() && !Finishes[i].isValid()) {<br>
+ Finishes[i] = Indexes->getMBBEndIdx(&MBB);<br>
+ }<br>
if (!Starts[i].isValid())<br>
continue;<br>
<br>
@@ -684,7 +987,6 @@ bool StackColoring::runOnMachineFunction<br>
return false;<br>
<br>
SmallVector<int, 8> SortedSlots;<br>
-<br>
SortedSlots.reserve(NumSlots);<br>
Intervals.reserve(NumSlots);<br>
<br>
@@ -717,9 +1019,12 @@ bool StackColoring::runOnMachineFunction<br>
<br>
// Calculate the liveness of each block.<br>
calculateLocalLiveness();<br>
+ DEBUG(dbgs() << "Dataflow iterations: " << NumIterations << "\n");<br>
+ DEBUG(dump());<br>
<br>
// Propagate the liveness information.<br>
calculateLiveIntervals(NumSlots);<br>
+ DEBUG(dumpIntervals());<br>
<br>
// Search for allocas which are used outside of the declared lifetime<br>
// markers.<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/StackColoring.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/StackColoring.ll?rev=270559&r1=270558&r2=270559&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/StackColoring.ll?rev=270559&r1=270558&r2=270559&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/StackColoring.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/StackColoring.ll Tue May 24 08:23:44 2016<br>
@@ -87,7 +87,7 @@ bb3:<br>
}<br>
<br>
;CHECK-LABEL: myCall_w4:<br>
-;YESCOLOR: subq $200, %rsp<br>
+;YESCOLOR: subq $120, %rsp<br>
;NOCOLOR: subq $408, %rsp<br>
<br>
define i32 @myCall_w4(i32 %in) {<br>
@@ -217,7 +217,7 @@ bb3:<br>
<br>
<br>
;CHECK-LABEL: myCall2_nostart:<br>
-;YESCOLOR: subq $144, %rsp<br>
+;YESCOLOR: subq $272, %rsp<br>
;NOCOLOR: subq $272, %rsp<br>
define i32 @myCall2_nostart(i32 %in, i1 %d) {<br>
entry:<br>
@@ -425,6 +425,120 @@ define i32 @shady_range(i32 %argc, i8**<br>
ret i32 9<br>
}<br>
<br>
+; In this case 'itar1' and 'itar2' can't be overlapped if we treat<br>
+; lifetime.start as the beginning of the lifetime, but we can<br>
+; overlap if we consider first use of the slot as lifetime<br>
+; start. See llvm bug 25776.<br>
+<br>
+;CHECK-LABEL: ifthen_twoslots:<br>
+;YESCOLOR: subq $536, %rsp<br>
+;NOCOLOR: subq $1048, %rsp<br>
+<br>
+define i32 @ifthen_twoslots(i32 %x) #0 {<br>
+entry:<br>
+ %retval = alloca i32, align 4<br>
+ %x.addr = alloca i32, align 4<br>
+ %itar1 = alloca [128 x i32], align 16<br>
+ %itar2 = alloca [128 x i32], align 16<br>
+ %cleanup.dest.slot = alloca i32<br>
+ store i32 %x, i32* %x.addr, align 4<br>
+ %itar1_start_8 = bitcast [128 x i32]* %itar1 to i8*<br>
+ call void @llvm.lifetime.start(i64 512, i8* %itar1_start_8) #3<br>
+ %itar2_start_8 = bitcast [128 x i32]* %itar2 to i8*<br>
+ call void @llvm.lifetime.start(i64 512, i8* %itar2_start_8) #3<br>
+ %xval = load i32, i32* %x.addr, align 4<br>
+ %and = and i32 %xval, 1<br>
+ %tobool = icmp ne i32 %and, 0<br>
+ br i1 %tobool, label %if.then, label %if.else<br>
+<br>
+if.then: ; preds = %entry<br>
+ %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %itar1, i32 0, i32 0<br>
+ call void @inita(i32* %arraydecay)<br>
+ store i32 1, i32* %retval, align 4<br>
+ store i32 1, i32* %cleanup.dest.slot, align 4<br>
+ %itar2_end_8 = bitcast [128 x i32]* %itar2 to i8*<br>
+ call void @llvm.lifetime.end(i64 512, i8* %itar2_end_8) #3<br>
+ %itar1_end_8 = bitcast [128 x i32]* %itar1 to i8*<br>
+ call void @llvm.lifetime.end(i64 512, i8* %itar1_end_8) #3<br>
+ br label %cleanup<br>
+<br>
+if.else: ; preds = %entry<br>
+ %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %itar2, i32 0, i32 0<br>
+ call void @inita(i32* %arraydecay1)<br>
+ store i32 0, i32* %retval, align 4<br>
+ store i32 1, i32* %cleanup.dest.slot, align 4<br>
+ %itar2_end2_8 = bitcast [128 x i32]* %itar2 to i8*<br>
+ call void @llvm.lifetime.end(i64 512, i8* %itar2_end2_8) #3<br>
+ %itar1_end2_8 = bitcast [128 x i32]* %itar1 to i8*<br>
+ call void @llvm.lifetime.end(i64 512, i8* %itar1_end2_8) #3<br>
+ br label %cleanup<br>
+<br>
+cleanup: ; preds = %if.else, %if.then<br>
+ %final_retval = load i32,<br>
+ i32* %retval, align 4<br>
+ ret i32 %final_retval<br>
+}<br>
+<br>
+; This function is intended to test the case where you<br>
+; have a reference to a stack slot that lies outside of<br>
+; the START/END lifetime markers-- the flow analysis<br>
+; should catch this and build the lifetime based on the<br>
+; markers only.<br>
+<br>
+;CHECK-LABEL: while_loop:<br>
+;YESCOLOR: subq $1032, %rsp<br>
+;NOCOLOR: subq $1544, %rsp<br>
+<br>
+define i32 @while_loop(i32 %x) #0 {<br>
+entry:<br>
+ %b1 = alloca [128 x i32], align 16<br>
+ %b2 = alloca [128 x i32], align 16<br>
+ %b3 = alloca [128 x i32], align 16<br>
+ %tmp = bitcast [128 x i32]* %b1 to i8*<br>
+ call void @llvm.lifetime.start(i64 512, i8* %tmp) #3<br>
+ %tmp1 = bitcast [128 x i32]* %b2 to i8*<br>
+ call void @llvm.lifetime.start(i64 512, i8* %tmp1) #3<br>
+ %and = and i32 %x, 1<br>
+ %tobool = icmp eq i32 %and, 0<br>
+ br i1 %tobool, label %if.else, label %if.then<br>
+<br>
+if.then: ; preds = %entry<br>
+ %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %b2, i64 0, i64 0<br>
+ call void @inita(i32* %arraydecay) #3<br>
+ br label %if.end<br>
+<br>
+if.else: ; preds = %entry<br>
+ %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %b1, i64 0, i64 0<br>
+ call void @inita(i32* %arraydecay1) #3<br>
+ %arraydecay3 = getelementptr inbounds [128 x i32], [128 x i32]* %b3, i64 0, i64 0<br>
+ call void @inita(i32* %arraydecay3) #3<br>
+ %tobool25 = icmp eq i32 %x, 0<br>
+ br i1 %tobool25, label %if.end, label %<a href="http://while.body.lr.ph" rel="noreferrer" target="_blank">while.body.lr.ph</a><br>
+<br>
+<a href="http://while.body.lr.ph" rel="noreferrer" target="_blank">while.body.lr.ph</a>: ; preds = %if.else<br>
+ %tmp2 = bitcast [128 x i32]* %b3 to i8*<br>
+ br label %while.body<br>
+<br>
+while.body: ; preds = %<a href="http://while.body.lr.ph" rel="noreferrer" target="_blank">while.body.lr.ph</a>, %while.body<br>
+ %x.addr.06 = phi i32 [ %x, %<a href="http://while.body.lr.ph" rel="noreferrer" target="_blank">while.body.lr.ph</a> ], [ %dec, %while.body ]<br>
+ %dec = add nsw i32 %x.addr.06, -1<br>
+ call void @llvm.lifetime.start(i64 512, i8* %tmp2) #3<br>
+ call void @inita(i32* %arraydecay3) #3<br>
+ call void @llvm.lifetime.end(i64 512, i8* %tmp2) #3<br>
+ %tobool2 = icmp eq i32 %dec, 0<br>
+ br i1 %tobool2, label %if.end.loopexit, label %while.body<br>
+<br>
+if.end.loopexit: ; preds = %while.body<br>
+ br label %if.end<br>
+<br>
+if.end: ; preds = %if.end.loopexit, %if.else, %if.then<br>
+ call void @llvm.lifetime.end(i64 512, i8* %tmp1) #3<br>
+ call void @llvm.lifetime.end(i64 512, i8* %tmp) #3<br>
+ ret i32 0<br>
+}<br>
+<br>
+declare void @inita(i32*) #2<br>
+<br>
declare void @bar([100 x i32]* , [100 x i32]*) nounwind<br>
<br>
declare void @llvm.lifetime.start(i64, i8* nocapture) nounwind<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll?rev=270559&r1=270558&r2=270559&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll?rev=270559&r1=270558&r2=270559&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll Tue May 24 08:23:44 2016<br>
@@ -155,6 +155,7 @@ entry:<br>
%ref.tmp.i = alloca %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", align 8<br>
%Op.i = alloca %"class.llvm::SDValue.3.603.963.1923.2043.2283.4083", align 8<br>
%0 = bitcast %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i to i8*<br>
+ call void @llvm.lifetime.start(i64 24, i8* %0) #1<br>
%retval.sroa.0.0.idx.i36 = getelementptr inbounds %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i, i64 0, i32 1, i32 0, i32 0<br>
%retval.sroa.0.0.copyload.i37 = load i32, i32* %retval.sroa.0.0.idx.i36, align 8<br>
call void @llvm.lifetime.end(i64 24, i8* %0) #1<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> 408-460-2413</td></tr></tbody></table></span></div>
</div></div>