[llvm] r270559 - Rework/enhance stack coloring data flow analysis.
Teresa Johnson via llvm-commits
llvm-commits at lists.llvm.org
Wed May 25 18:55:00 PDT 2016
On Wed, May 25, 2016 at 6:29 PM, Than McIntosh <thanm at google.com> wrote:
> Hello Teresa,
>
> Thanks for the heads-up. So far I have not seen any other issues.
>
> What is a "ThinLTO bootstrap using gold"?
>
Sorry, I meant a clang/llvm bootstrap built with ThinLTO and the gold linker
(since this was on Linux).
The patch I referenced below just went in, but I will see if I can come up
with something smaller to reproduce.
Teresa
>
> Thanks, Than
>
>
> On Wed, May 25, 2016 at 7:32 PM, Teresa Johnson <tejohnson at google.com>
> wrote:
>
>> Hi Than,
>>
>> I started getting failures running llvm-tblgen yesterday in a ThinLTO
>> bootstrap of clang. I just bisected it to this commit.
>>
>> I'll work on packaging up a reproducer (a ThinLTO bootstrap using gold
>> requires D20559 to fix an issue handling archives, which I should be
>> submitting soon since it just got approved). I had earlier today tracked
>> this down to happening when we import a particular function, presumably
>> exposed with the expanded optimization scope after the importing.
>>
>> I wanted to give you a heads up though in case any other issues have been
>> reported.
>>
>> Thanks
>> Teresa
>>
>> On Tue, May 24, 2016 at 6:23 AM, Than McIntosh via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: thanm
>>> Date: Tue May 24 08:23:44 2016
>>> New Revision: 270559
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=270559&view=rev
>>> Log:
>>> Rework/enhance stack coloring data flow analysis.
>>>
>>> Replace bidirectional flow analysis to compute liveness with forward
>>> analysis pass. Treat lifetimes as starting when there is a first
>>> reference to the stack slot, as opposed to starting at the point of the
>>> lifetime.start intrinsic, so as to increase the number of stack
>>> variables we can overlap.
>>>
>>> Reviewers: gbiv, qcolumbet, wmi
>>> Differential Revision: http://reviews.llvm.org/D18827
>>>
>>> Bug: 25776
>>>
>>> Modified:
>>> llvm/trunk/lib/CodeGen/StackColoring.cpp
>>> llvm/trunk/test/CodeGen/X86/StackColoring.ll
>>> llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
>>>
>>> Modified: llvm/trunk/lib/CodeGen/StackColoring.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/StackColoring.cpp?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/StackColoring.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/StackColoring.cpp Tue May 24 08:23:44 2016
>>> @@ -64,18 +64,180 @@ DisableColoring("no-stack-coloring",
>>> /// The user may write code that uses allocas outside of the declared lifetime
>>> /// zone. This can happen when the user returns a reference to a local
>>> /// data-structure. We can detect these cases and decide not to optimize the
>>> -/// code. If this flag is enabled, we try to save the user.
>>> +/// code. If this flag is enabled, we try to save the user. This option
>>> +/// is treated as overriding LifetimeStartOnFirstUse below.
>>> static cl::opt<bool>
>>> ProtectFromEscapedAllocas("protect-from-escaped-allocas",
>>> cl::init(false), cl::Hidden,
>>> cl::desc("Do not optimize lifetime zones that "
>>> "are broken"));
>>>
>>> +/// Enable enhanced dataflow scheme for lifetime analysis (treat first
>>> +/// use of stack slot as start of slot lifetime, as opposed to looking
>>> +/// for LIFETIME_START marker). See "Implementation notes" below for
>>> +/// more info.
>>> +static cl::opt<bool>
>>> +LifetimeStartOnFirstUse("stackcoloring-lifetime-start-on-first-use",
>>> + cl::init(true), cl::Hidden,
>>> + cl::desc("Treat stack lifetimes as starting on first use, not on START marker."));
>>> +
>>> +
>>> STATISTIC(NumMarkerSeen, "Number of lifetime markers found.");
>>> STATISTIC(StackSpaceSaved, "Number of bytes saved due to merging slots.");
>>> STATISTIC(StackSlotMerged, "Number of stack slot merged.");
>>> STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime region");
>>>
>>> +//
>>> +// Implementation Notes:
>>> +// ---------------------
>>> +//
>>> +// Consider the following motivating example:
>>> +//
>>> +// int foo() {
>>> +// char b1[1024], b2[1024];
>>> +// if (...) {
>>> +// char b3[1024];
>>> +// <uses of b1, b3>;
>>> +// return x;
>>> +// } else {
>>> +// char b4[1024], b5[1024];
>>> +// <uses of b2, b4, b5>;
>>> +// return y;
>>> +// }
>>> +// }
>>> +//
>>> +// In the code above, "b3" and "b4" are declared in distinct lexical
>>> +// scopes, meaning that it is easy to prove that they can share the
>>> +// same stack slot. Variables "b1" and "b2" are declared in the same
>>> +// scope, meaning that from a lexical point of view, their lifetimes
>>> +// overlap. From a control flow point of view, however, the two
>>> +// variables are accessed in disjoint regions of the CFG, thus it
>>> +// should be possible for them to share the same stack slot. An ideal
>>> +// stack allocation for the function above would look like:
>>> +//
>>> +// slot 0: b1, b2
>>> +// slot 1: b3, b4
>>> +// slot 2: b5
>>> +//
>>> +// Achieving this allocation is tricky, however, due to the way
>>> +// lifetime markers are inserted. Here is a simplified view of the
>>> +// control flow graph for the code above:
>>> +//
>>> +// +------ block 0 -------+
>>> +// 0| LIFETIME_START b1, b2 |
>>> +// 1| <test 'if' condition> |
>>> +// +-----------------------+
>>> +// ./ \.
>>> +// +------ block 1 -------+ +------ block 2 -------+
>>> +// 2| LIFETIME_START b3 | 5| LIFETIME_START b4, b5 |
>>> +// 3| <uses of b1, b3> | 6| <uses of b2, b4, b5> |
>>> +// 4| LIFETIME_END b3 | 7| LIFETIME_END b4, b5 |
>>> +// +-----------------------+ +-----------------------+
>>> +// \. /.
>>> +// +------ block 3 -------+
>>> +// 8| <cleanupcode> |
>>> +// 9| LIFETIME_END b1, b2 |
>>> +// 10| return |
>>> +// +-----------------------+
>>> +//
>>> +// If we create live intervals for the variables above strictly based
>>> +// on the lifetime markers, we'll get the set of intervals on the
>>> +// left. If we ignore the lifetime start markers and instead treat a
>>> +// variable's lifetime as beginning with the first reference to the
>>> +// var, then we get the intervals on the right.
>>> +//
>>> +// LIFETIME_START First Use
>>> +// b1: [0,9] [3,4] [8,9]
>>> +// b2: [0,9] [6,9]
>>> +// b3: [2,4] [3,4]
>>> +// b4: [5,7] [6,7]
>>> +// b5: [5,7] [6,7]
>>> +//
>>> +// For the intervals on the left, the best we can do is overlap two
>>> +// variables (b3 and b4, for example); this gives us a stack size of
>>> +// 4*1024 bytes, not ideal. When treating first-use as the start of a
>>> +// lifetime, we can additionally overlap b1 and b5, giving us a 3*1024
>>> +// byte stack (better).
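
The slot counts claimed above can be checked with a small standalone sketch. This is not code from the patch: `Lifetime`, `overlaps`, `countSlots` and the two table helpers are hypothetical names, the ranges are treated as closed intervals over the instruction numbers in the diagram, and the first-fit packing is a simplification of what StackColoring actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A slot's lifetime: a list of closed [start, end] ranges over the
// instruction numbers used in the implementation notes.
using Range = std::pair<int, int>;
using Lifetime = std::vector<Range>;

// True if any range of A shares at least one instruction number with B.
bool overlaps(const Lifetime &A, const Lifetime &B) {
  for (const Range &RA : A)
    for (const Range &RB : B)
      if (RA.first <= RB.second && RB.first <= RA.second)
        return true;
  return false;
}

// Greedy first-fit: place each variable into the first slot whose current
// occupants it does not overlap. Returns the number of slots used.
std::size_t countSlots(const std::vector<Lifetime> &Vars) {
  std::vector<std::vector<std::size_t>> Slots; // variable indices per slot
  for (std::size_t V = 0; V < Vars.size(); ++V) {
    bool Placed = false;
    for (auto &Members : Slots) {
      bool Clash = false;
      for (std::size_t M : Members)
        Clash = Clash || overlaps(Vars[V], Vars[M]);
      if (!Clash) {
        Members.push_back(V);
        Placed = true;
        break;
      }
    }
    if (!Placed)
      Slots.push_back({V});
  }
  return Slots.size();
}

// Intervals for b1..b5 (in that order), copied from the table above.
std::vector<Lifetime> markerIntervals() {
  return {{{0, 9}}, {{0, 9}}, {{2, 4}}, {{5, 7}}, {{5, 7}}};
}
std::vector<Lifetime> firstUseIntervals() {
  return {{{3, 4}, {8, 9}}, {{6, 9}}, {{3, 4}}, {{6, 7}}, {{6, 7}}};
}
```

With these inputs, `countSlots(markerIntervals())` comes out as 4 and `countSlots(firstUseIntervals())` as 3, which matches the 4*1024 versus 3*1024 figures in the notes. The first-use intervals for b1 and b5 no longer overlap, which is where the saving comes from.
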
>>> +//
>>> +// Relying entirely on first-use of stack slots is problematic,
>>> +// however, due to the fact that optimizations can sometimes migrate
>>> +// uses of a variable outside of its lifetime start/end region. Here
>>> +// is an example:
>>> +//
>>> +// int bar() {
>>> +// char b1[1024], b2[1024];
>>> +// if (...) {
>>> +// <uses of b2>
>>> +// return y;
>>> +// } else {
>>> +// <uses of b1>
>>> +// while (...) {
>>> +// char b3[1024];
>>> +// <uses of b3>
>>> +// }
>>> +// }
>>> +// }
>>> +//
>>> +// Before optimization, the control flow graph for the code above
>>> +// might look like the following:
>>> +//
>>> +// +------ block 0 -------+
>>> +// 0| LIFETIME_START b1, b2 |
>>> +// 1| <test 'if' condition> |
>>> +// +-----------------------+
>>> +// ./ \.
>>> +// +------ block 1 -------+ +------- block 2 -------+
>>> +// 2| <uses of b2> | 3| <uses of b1> |
>>> +// +-----------------------+ +-----------------------+
>>> +// | |
>>> +// | +------- block 3 -------+ <-\.
>>> +// | 4| <while condition> | |
>>> +// | +-----------------------+ |
>>> +// | / | |
>>> +// | / +------- block 4 -------+
>>> +// \ / 5| LIFETIME_START b3 | |
>>> +// \ / 6| <uses of b3> | |
>>> +// \ / 7| LIFETIME_END b3 | |
>>> +// \ | +------------------------+ |
>>> +// \ | \ /
>>> +// +------ block 5 -----+ \---------------
>>> +// 8| <cleanupcode> |
>>> +// 9| LIFETIME_END b1, b2 |
>>> +// 10| return |
>>> +// +---------------------+
>>> +//
>>> +// During optimization, however, it can happen that an instruction
>>> +// computing an address in "b3" (for example, a loop-invariant GEP) is
>>> +// hoisted up out of the loop from block 4 to block 2. [Note that
>>> +// this is not an actual load from the stack, only an instruction that
>>> +// computes the address to be loaded]. If this happens, there is now a
>>> +// path leading from the first use of b3 to the return instruction
>>> +// that does not encounter the b3 LIFETIME_END, hence b3's lifetime is
>>> +// now larger than if we were computing live intervals strictly based
>>> +// on lifetime markers. In the example above, this lengthened lifetime
>>> +// would mean that it would appear illegal to overlap b3 with b2.
>>> +//
>>> +// To deal with such cases, the code in ::collectMarkers() below
>>> +// tries to identify "degenerate" slots -- those slots where on a single
>>> +// forward pass through the CFG we encounter a first reference to slot
>>> +// K before we hit the slot K lifetime start marker. For such slots,
>>> +// we fall back on using the lifetime start marker as the beginning of
>>> +// the variable's lifetime. NB: with this implementation, slots can
>>> +// appear degenerate in cases where there is unstructured control flow:
>>> +//
>>> +// if (q) goto mid;
>>> +// if (x > 9) {
>>> +// int b[100];
>>> +// memcpy(&b[0], ...);
>>> +// mid: b[k] = ...;
>>> +// abc(&b);
>>> +// }
>>> +//
>>> +// If in the RPO ordering chosen to walk the CFG we happen to visit the
>>> +// b[k] block before visiting the memcpy block (which will contain the
>>> +// lifetime start for "b"), then it will appear that 'b' has a
>>> +// degenerate lifetime.
>>> +//
>>> +
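
To make the "degenerate slot" idea concrete, here is a rough standalone model of the single forward walk described in the notes. It is only a sketch under my own assumptions: `Inst`, `Block`, `findDegenerate` and `barCFG` are hypothetical names, blocks are visited in vector order rather than via LLVM's CFG iterators, and slots are plain bit positions in a 32-bit mask instead of a BitVector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SlotSet = std::uint32_t; // bit i set means slot i is in the set

enum class Kind { Start, End, Use };
struct Inst { Kind K; unsigned Slot; };
struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  std::vector<Inst> Insts;
};

// Visit blocks in vector order. A slot is flagged as degenerate when a use
// is seen while the slot is not between a START and an END marker on the
// paths walked so far.
SlotSet findDegenerate(const std::vector<Block> &Blocks) {
  std::vector<SlotSet> SeenStart(Blocks.size(), 0);
  SlotSet Degenerate = 0;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    // Slots started but not yet ended on entry, from visited predecessors.
    SlotSet Between = 0;
    for (std::size_t P : Blocks[I].Preds) {
      assert(P < Blocks.size() && "predecessor index out of range");
      Between |= SeenStart[P];
    }
    for (const Inst &MI : Blocks[I].Insts) {
      SlotSet Bit = SlotSet{1} << MI.Slot;
      switch (MI.K) {
      case Kind::Start: Between |= Bit; break;
      case Kind::End:   Between &= ~Bit; break;
      case Kind::Use:
        if (!(Between & Bit))
          Degenerate |= Bit;
        break;
      }
    }
    SeenStart[I] |= Between;
  }
  return Degenerate;
}

// The six-block CFG for bar() from the notes. With Hoisted set, block 2
// also holds the address computation on b3 that was moved out of the loop.
std::vector<Block> barCFG(bool Hoisted) {
  const unsigned b1 = 0, b2 = 1, b3 = 2;
  std::vector<Block> Blocks(6);
  Blocks[0].Insts = {{Kind::Start, b1}, {Kind::Start, b2}};
  Blocks[1].Preds = {0};
  Blocks[1].Insts = {{Kind::Use, b2}};
  Blocks[2].Preds = {0};
  Blocks[2].Insts = {{Kind::Use, b1}};
  if (Hoisted)
    Blocks[2].Insts.push_back({Kind::Use, b3});
  Blocks[3].Preds = {2, 4};
  Blocks[4].Preds = {3};
  Blocks[4].Insts = {{Kind::Start, b3}, {Kind::Use, b3}, {Kind::End, b3}};
  Blocks[5].Preds = {1, 3};
  Blocks[5].Insts = {{Kind::End, b1}, {Kind::End, b2}};
  return Blocks;
}
```

In this model `findDegenerate(barCFG(false))` returns an empty set, while `findDegenerate(barCFG(true))` flags only b3 (bit 2), so b3 would fall back to its LIFETIME_START marker and b1/b2 would still use first-use lifetimes.
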
>>>
>>> //===----------------------------------------------------------------------===//
>>> // StackColoring Pass
>>>
>>> //===----------------------------------------------------------------------===//
>>> @@ -123,6 +285,17 @@ class StackColoring : public MachineFunc
>>> /// once the coloring is done.
>>> SmallVector<MachineInstr*, 8> Markers;
>>>
>>> + /// Record the FI slots for which we have seen some sort of
>>> + /// lifetime marker (either start or end).
>>> + BitVector InterestingSlots;
>>> +
>>> + /// Degenerate slots -- first use appears outside of start/end
>>> + /// lifetime markers.
>>> + BitVector DegenerateSlots;
>>> +
>>> + /// Number of iterations taken during data flow analysis.
>>> + unsigned NumIterations;
>>> +
>>> public:
>>> static char ID;
>>> StackColoring() : MachineFunctionPass(ID) {
>>> @@ -153,6 +326,25 @@ private:
>>> /// in and out blocks.
>>> void calculateLocalLiveness();
>>>
>>> + /// Returns TRUE if we're using the first-use-begins-lifetime method
>>> for
>>> + /// this slot (if FALSE, then the start marker is treated as start of
>>> lifetime).
>>> + bool applyFirstUse(int Slot) {
>>> + if (!LifetimeStartOnFirstUse || ProtectFromEscapedAllocas)
>>> + return false;
>>> + if (DegenerateSlots.test(Slot))
>>> + return false;
>>> + return true;
>>> + }
>>> +
>>> + /// Examines the specified instruction and returns TRUE if the
>>> instruction
>>> + /// represents the start or end of an interesting lifetime. The slot
>>> or slots
>>> + /// starting or ending are added to the vector "slots" and "isStart"
>>> is set
>>> + /// accordingly.
>>> + /// \returns True if inst contains a lifetime start or end
>>> + bool isLifetimeStartOrEnd(const MachineInstr &MI,
>>> + SmallVector<int, 4> &slots,
>>> + bool &isStart);
>>> +
>>> /// Construct the LiveIntervals for the slots.
>>> void calculateLiveIntervals(unsigned NumSlots);
>>>
>>> @@ -170,7 +362,10 @@ private:
>>>
>>> /// Map entries which point to other entries to their destination.
>>> /// A->B->C becomes A->C.
>>> - void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned
>>> NumSlots);
>>> + void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
>>> +
>>> + /// Used in collectMarkers
>>> + typedef DenseMap<const MachineBasicBlock*, BitVector> BlockBitVecMap;
>>> };
>>> } // end anonymous namespace
>>>
>>> @@ -228,9 +423,140 @@ LLVM_DUMP_METHOD void StackColoring::dum
>>>
>>> #endif // not NDEBUG
>>>
>>> -unsigned StackColoring::collectMarkers(unsigned NumSlot) {
>>> +static inline int getStartOrEndSlot(const MachineInstr &MI)
>>> +{
>>> + assert((MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> + MI.getOpcode() == TargetOpcode::LIFETIME_END) &&
>>> + "Expected LIFETIME_START or LIFETIME_END op");
>>> + const MachineOperand &MO = MI.getOperand(0);
>>> + int Slot = MO.getIndex();
>>> + if (Slot >= 0)
>>> + return Slot;
>>> + return -1;
>>> +}
>>> +
>>> +//
>>> +// At the moment the only way to end a variable lifetime is with
>>> +// a VARIABLE_LIFETIME op (which can't contain a start). If things
>>> +// change and the IR allows for a single inst that both begins
>>> +// and ends lifetime(s), this interface will need to be reworked.
>>> +//
>>> +bool StackColoring::isLifetimeStartOrEnd(const MachineInstr &MI,
>>> + SmallVector<int, 4> &slots,
>>> + bool &isStart)
>>> +{
>>> + if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> + MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> + int Slot = getStartOrEndSlot(MI);
>>> + if (Slot < 0)
>>> + return false;
>>> + if (!InterestingSlots.test(Slot))
>>> + return false;
>>> + slots.push_back(Slot);
>>> + if (MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> + isStart = false;
>>> + return true;
>>> + }
>>> + if (! applyFirstUse(Slot)) {
>>> + isStart = true;
>>> + return true;
>>> + }
>>> + } else if (LifetimeStartOnFirstUse && !ProtectFromEscapedAllocas) {
>>> + if (! MI.isDebugValue()) {
>>> + bool found = false;
>>> + for (const MachineOperand &MO : MI.operands()) {
>>> + if (!MO.isFI())
>>> + continue;
>>> + int Slot = MO.getIndex();
>>> + if (Slot<0)
>>> + continue;
>>> + if (InterestingSlots.test(Slot) && applyFirstUse(Slot)) {
>>> + slots.push_back(Slot);
>>> + found = true;
>>> + }
>>> + }
>>> + if (found) {
>>> + isStart = true;
>>> + return true;
>>> + }
>>> + }
>>> + }
>>> + return false;
>>> +}
>>> +
>>> +unsigned StackColoring::collectMarkers(unsigned NumSlot)
>>> +{
>>> unsigned MarkersFound = 0;
>>> - // Scan the function to find all lifetime markers.
>>> + BlockBitVecMap SeenStartMap;
>>> + InterestingSlots.clear();
>>> + InterestingSlots.resize(NumSlot);
>>> + DegenerateSlots.clear();
>>> + DegenerateSlots.resize(NumSlot);
>>> +
>>> + // Step 1: collect markers and populate the "InterestingSlots"
>>> + // and "DegenerateSlots" sets.
>>> + for (MachineBasicBlock *MBB : depth_first(MF)) {
>>> +
>>> + // Compute the set of slots for which we've seen a START marker but
>>> have
>>> + // not yet seen an END marker at this point in the walk (e.g. on
>>> entry
>>> + // to this bb).
>>> + BitVector BetweenStartEnd;
>>> + BetweenStartEnd.resize(NumSlot);
>>> + for (MachineBasicBlock::const_pred_iterator PI = MBB->pred_begin(),
>>> + PE = MBB->pred_end(); PI != PE; ++PI) {
>>> + BlockBitVecMap::const_iterator I = SeenStartMap.find(*PI);
>>> + if (I != SeenStartMap.end()) {
>>> + BetweenStartEnd |= I->second;
>>> + }
>>> + }
>>> +
>>> + // Walk the instructions in the block to look for start/end ops.
>>> + for (MachineInstr &MI : *MBB) {
>>> + if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> + MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> + int Slot = getStartOrEndSlot(MI);
>>> + if (Slot < 0)
>>> + continue;
>>> + InterestingSlots.set(Slot);
>>> + if (MI.getOpcode() == TargetOpcode::LIFETIME_START)
>>> + BetweenStartEnd.set(Slot);
>>> + else
>>> + BetweenStartEnd.reset(Slot);
>>> + const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
>>> + if (Allocation) {
>>> + DEBUG(dbgs() << "Found a lifetime ");
>>> + DEBUG(dbgs() << (MI.getOpcode() ==
>>> TargetOpcode::LIFETIME_START
>>> + ? "start"
>>> + : "end"));
>>> + DEBUG(dbgs() << " marker for slot #" << Slot);
>>> + DEBUG(dbgs() << " with allocation: " << Allocation->getName()
>>> + << "\n");
>>> + }
>>> + Markers.push_back(&MI);
>>> + MarkersFound += 1;
>>> + } else {
>>> + for (const MachineOperand &MO : MI.operands()) {
>>> + if (!MO.isFI())
>>> + continue;
>>> + int Slot = MO.getIndex();
>>> + if (Slot < 0)
>>> + continue;
>>> + if (! BetweenStartEnd.test(Slot)) {
>>> + DegenerateSlots.set(Slot);
>>> + }
>>> + }
>>> + }
>>> + }
>>> + BitVector &SeenStart = SeenStartMap[MBB];
>>> + SeenStart |= BetweenStartEnd;
>>> + }
>>> + if (!MarkersFound) {
>>> + return 0;
>>> + }
>>> + DEBUG(dumpBV("Degenerate slots", DegenerateSlots));
>>> +
>>> + // Step 2: compute begin/end sets for each block
>>> +
>>> // NOTE: We use a reverse-post-order iteration to ensure that we
>>> obtain a
>>> // deterministic numbering, and because we'll need a post-order
>>> iteration
>>> // later for solving the liveness dataflow problem.
>>> @@ -246,37 +572,33 @@ unsigned StackColoring::collectMarkers(u
>>> BlockInfo.Begin.resize(NumSlot);
>>> BlockInfo.End.resize(NumSlot);
>>>
>>> + SmallVector<int, 4> slots;
>>> for (MachineInstr &MI : *MBB) {
>>> - if (MI.getOpcode() != TargetOpcode::LIFETIME_START &&
>>> - MI.getOpcode() != TargetOpcode::LIFETIME_END)
>>> - continue;
>>> -
>>> - bool IsStart = MI.getOpcode() == TargetOpcode::LIFETIME_START;
>>> - const MachineOperand &MO = MI.getOperand(0);
>>> - int Slot = MO.getIndex();
>>> - if (Slot < 0)
>>> - continue;
>>> -
>>> - Markers.push_back(&MI);
>>> -
>>> - MarkersFound++;
>>> -
>>> - const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
>>> - if (Allocation) {
>>> - DEBUG(dbgs()<<"Found a lifetime marker for slot #"<<Slot<<
>>> - " with allocation: "<< Allocation->getName()<<"\n");
>>> - }
>>> -
>>> - if (IsStart) {
>>> - BlockInfo.Begin.set(Slot);
>>> - } else {
>>> - if (BlockInfo.Begin.test(Slot)) {
>>> - // Allocas that start and end within a single block are
>>> handled
>>> - // specially when computing the LiveIntervals to avoid
>>> pessimizing
>>> - // the liveness propagation.
>>> - BlockInfo.Begin.reset(Slot);
>>> - } else {
>>> + bool isStart = false;
>>> + slots.clear();
>>> + if (isLifetimeStartOrEnd(MI, slots, isStart)) {
>>> + if (!isStart) {
>>> + assert(slots.size() == 1 && "unexpected: MI ends multiple
>>> slots");
>>> + int Slot = slots[0];
>>> + if (BlockInfo.Begin.test(Slot)) {
>>> + BlockInfo.Begin.reset(Slot);
>>> + }
>>> BlockInfo.End.set(Slot);
>>> + } else {
>>> + for (auto Slot : slots) {
>>> + DEBUG(dbgs() << "Found a use of slot #" << Slot);
>>> + DEBUG(dbgs() << " at BB#" << MBB->getNumber() << " index ");
>>> + DEBUG(Indexes->getInstructionIndex(MI).print(dbgs()));
>>> + const AllocaInst *Allocation =
>>> MFI->getObjectAllocation(Slot);
>>> + if (Allocation) {
>>> + DEBUG(dbgs() << " with allocation: "<<
>>> Allocation->getName());
>>> + }
>>> + DEBUG(dbgs() << "\n");
>>> + if (BlockInfo.End.test(Slot)) {
>>> + BlockInfo.End.reset(Slot);
>>> + }
>>> + BlockInfo.Begin.set(Slot);
>>> + }
>>> }
>>> }
>>> }
>>> @@ -287,90 +609,56 @@ unsigned StackColoring::collectMarkers(u
>>> return MarkersFound;
>>> }
>>>
>>> -void StackColoring::calculateLocalLiveness() {
>>> - // Perform a standard reverse dataflow computation to solve for
>>> - // global liveness. The BEGIN set here is equivalent to KILL in the
>>> standard
>>> - // formulation, and END is equivalent to GEN. The result of this
>>> computation
>>> - // is a map from blocks to bitvectors where the bitvectors represent
>>> which
>>> - // allocas are live in/out of that block.
>>> - SmallPtrSet<const MachineBasicBlock*, 8>
>>> BBSet(BasicBlockNumbering.begin(),
>>> -
>>> BasicBlockNumbering.end());
>>> - unsigned NumSSMIters = 0;
>>> +void StackColoring::calculateLocalLiveness()
>>> +{
>>> + unsigned NumIters = 0;
>>> bool changed = true;
>>> while (changed) {
>>> changed = false;
>>> - ++NumSSMIters;
>>> -
>>> - SmallPtrSet<const MachineBasicBlock*, 8> NextBBSet;
>>> + ++NumIters;
>>>
>>> for (const MachineBasicBlock *BB : BasicBlockNumbering) {
>>> - if (!BBSet.count(BB)) continue;
>>>
>>> // Use an iterator to avoid repeated lookups.
>>> LivenessMap::iterator BI = BlockLiveness.find(BB);
>>> assert(BI != BlockLiveness.end() && "Block not found");
>>> BlockLifetimeInfo &BlockInfo = BI->second;
>>>
>>> + // Compute LiveIn by unioning together the LiveOut sets of all
>>> preds.
>>> BitVector LocalLiveIn;
>>> - BitVector LocalLiveOut;
>>> -
>>> - // Forward propagation from begins to ends.
>>> for (MachineBasicBlock::const_pred_iterator PI = BB->pred_begin(),
>>> PE = BB->pred_end(); PI != PE; ++PI) {
>>> LivenessMap::const_iterator I = BlockLiveness.find(*PI);
>>> assert(I != BlockLiveness.end() && "Predecessor not found");
>>> LocalLiveIn |= I->second.LiveOut;
>>> }
>>> - LocalLiveIn |= BlockInfo.End;
>>> - LocalLiveIn.reset(BlockInfo.Begin);
>>> -
>>> - // Reverse propagation from ends to begins.
>>> - for (MachineBasicBlock::const_succ_iterator SI = BB->succ_begin(),
>>> - SE = BB->succ_end(); SI != SE; ++SI) {
>>> - LivenessMap::const_iterator I = BlockLiveness.find(*SI);
>>> - assert(I != BlockLiveness.end() && "Successor not found");
>>> - LocalLiveOut |= I->second.LiveIn;
>>> - }
>>> - LocalLiveOut |= BlockInfo.Begin;
>>> - LocalLiveOut.reset(BlockInfo.End);
>>> -
>>> - LocalLiveIn |= LocalLiveOut;
>>> - LocalLiveOut |= LocalLiveIn;
>>>
>>> - // After adopting the live bits, we need to turn-off the bits
>>> which
>>> - // are de-activated in this block.
>>> + // Compute LiveOut by subtracting out lifetimes that end in this
>>> + // block, then adding in lifetimes that begin in this block. If
>>> + // we have both BEGIN and END markers in the same basic block
>>> + // then we know that the BEGIN marker comes after the END,
>>> + // because we already handle the case where the BEGIN comes
>>> + // before the END when collecting the markers (and building the
>>> + // BEGIN/END vectors).
>>> + BitVector LocalLiveOut = LocalLiveIn;
>>> LocalLiveOut.reset(BlockInfo.End);
>>> - LocalLiveIn.reset(BlockInfo.Begin);
>>> -
>>> - // If we have both BEGIN and END markers in the same basic block
>>> then
>>> - // we know that the BEGIN marker comes after the END, because we
>>> already
>>> - // handle the case where the BEGIN comes before the END when
>>> collecting
>>> - // the markers (and building the BEGIN/END vectore).
>>> - // Want to enable the LIVE_IN and LIVE_OUT of slots that have both
>>> - // BEGIN and END because it means that the value lives before and
>>> after
>>> - // this basic block.
>>> - BitVector LocalEndBegin = BlockInfo.End;
>>> - LocalEndBegin &= BlockInfo.Begin;
>>> - LocalLiveIn |= LocalEndBegin;
>>> - LocalLiveOut |= LocalEndBegin;
>>> + LocalLiveOut |= BlockInfo.Begin;
>>>
>>> + // Update block LiveIn set, noting whether it has changed.
>>> if (LocalLiveIn.test(BlockInfo.LiveIn)) {
>>> changed = true;
>>> BlockInfo.LiveIn |= LocalLiveIn;
>>> -
>>> - NextBBSet.insert(BB->pred_begin(), BB->pred_end());
>>> }
>>>
>>> + // Update block LiveOut set, noting whether it has changed.
>>> if (LocalLiveOut.test(BlockInfo.LiveOut)) {
>>> changed = true;
>>> BlockInfo.LiveOut |= LocalLiveOut;
>>> -
>>> - NextBBSet.insert(BB->succ_begin(), BB->succ_end());
>>> }
>>> }
>>> -
>>> - BBSet = std::move(NextBBSet);
>>> }// while changed.
>>> +
>>> + NumIterations = NumIters;
>>> }
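
For anyone following along, the new forward-only dataflow can be tried outside of LLVM with a toy model like the one below. This is a sketch, not the patch's code: `Block`, `solveForward` and `fooCFG` are hypothetical names, slot sets are 32-bit masks rather than BitVectors, and the Begin/End sets for the foo() example are my own reading of how collectMarkers would fill them in under first-use (a slot that begins and ends inside one block appears only in End).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SlotSet = std::uint32_t; // bit i set means slot i is in the set

struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  SlotSet Begin = 0;              // lifetimes that begin in this block
  SlotSet End = 0;                // lifetimes that end in this block
  SlotSet LiveIn = 0;
  SlotSet LiveOut = 0;
};

// Forward-only propagation, iterated to a fixed point:
//   LiveIn  = union of LiveOut over all predecessors
//   LiveOut = (LiveIn - End) + Begin
// Returns the number of passes over the block list.
unsigned solveForward(std::vector<Block> &Blocks) {
  unsigned NumIters = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++NumIters;
    for (Block &B : Blocks) {
      SlotSet In = 0;
      for (std::size_t P : B.Preds) {
        assert(P < Blocks.size() && "predecessor index out of range");
        In |= Blocks[P].LiveOut;
      }
      SlotSet Out = (In & ~B.End) | B.Begin;
      // Only grow the stored sets, and note whether anything new appeared.
      if (In & ~B.LiveIn) {
        Changed = true;
        B.LiveIn |= In;
      }
      if (Out & ~B.LiveOut) {
        Changed = true;
        B.LiveOut |= Out;
      }
    }
  }
  return NumIters;
}

enum : SlotSet { B1 = 1, B2 = 2, B3 = 4, B4 = 8, B5 = 16 };

// The four-block CFG for foo() from the implementation notes.
std::vector<Block> fooCFG() {
  std::vector<Block> Blocks(4);
  Blocks[1].Preds = {0};
  Blocks[1].Begin = B1;      // first use of b1
  Blocks[1].End = B3;        // b3 begins and ends inside this block
  Blocks[2].Preds = {0};
  Blocks[2].Begin = B2;      // first use of b2
  Blocks[2].End = B4 | B5;   // b4, b5 begin and end inside this block
  Blocks[3].Preds = {1, 2};
  Blocks[3].End = B1 | B2;
  return Blocks;
}
```

Running `solveForward` on `fooCFG()` converges in two passes: b1 is live out of block 1 only, b2 is live out of block 2 only, and both are live into block 3, which lines up with the [8,9] tail on b1 and the [6,9] range for b2 in the first-use column of the table.
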
>>>
>>> void StackColoring::calculateLiveIntervals(unsigned NumSlots) {
>>> @@ -385,29 +673,22 @@ void StackColoring::calculateLiveInterva
>>> Finishes.clear();
>>> Finishes.resize(NumSlots);
>>>
>>> - // Create the interval for the basic blocks with lifetime markers
>>> in them.
>>> - for (const MachineInstr *MI : Markers) {
>>> - if (MI->getParent() != &MBB)
>>> - continue;
>>> + // Create the interval for the basic blocks containing lifetime
>>> begin/end.
>>> + for (const MachineInstr &MI : MBB) {
>>>
>>> - assert((MI->getOpcode() == TargetOpcode::LIFETIME_START ||
>>> - MI->getOpcode() == TargetOpcode::LIFETIME_END) &&
>>> - "Invalid Lifetime marker");
>>> -
>>> - bool IsStart = MI->getOpcode() == TargetOpcode::LIFETIME_START;
>>> - const MachineOperand &Mo = MI->getOperand(0);
>>> - int Slot = Mo.getIndex();
>>> - if (Slot < 0)
>>> + SmallVector<int, 4> slots;
>>> + bool IsStart = false;
>>> + if (!isLifetimeStartOrEnd(MI, slots, IsStart))
>>> continue;
>>> -
>>> - SlotIndex ThisIndex = Indexes->getInstructionIndex(*MI);
>>> -
>>> - if (IsStart) {
>>> - if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
>>> - Starts[Slot] = ThisIndex;
>>> - } else {
>>> - if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
>>> - Finishes[Slot] = ThisIndex;
>>> + SlotIndex ThisIndex = Indexes->getInstructionIndex(MI);
>>> + for (auto Slot : slots) {
>>> + if (IsStart) {
>>> + if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
>>> + Starts[Slot] = ThisIndex;
>>> + } else {
>>> + if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
>>> + Finishes[Slot] = ThisIndex;
>>> + }
>>> }
>>> }
>>>
>>> @@ -423,7 +704,29 @@ void StackColoring::calculateLiveInterva
>>> }
>>>
>>> for (unsigned i = 0; i < NumSlots; ++i) {
>>> - assert(Starts[i].isValid() == Finishes[i].isValid() && "Unmatched
>>> range");
>>> + //
>>> + // When LifetimeStartOnFirstUse is turned on, data flow analysis
>>> + // is forward (from starts to ends), not bidirectional. A
>>> + // consequence of this is that we can wind up in situations
>>> + // where Starts[i] is invalid but Finishes[i] is valid and vice
>>> + // versa. Example:
>>> + //
>>> + // LIFETIME_START x
>>> + // if (...) {
>>> + // <use of x>
>>> + // throw ...;
>>> + // }
>>> + // LIFETIME_END x
>>> + // return 2;
>>> + //
>>> + //
>>> + // Here the slot for "x" will not be live into the block
>>> + // containing the "return 2" (since lifetimes start with first
>>> + // use, not at the dominating LIFETIME_START marker).
>>> + //
>>> + if (Starts[i].isValid() && !Finishes[i].isValid()) {
>>> + Finishes[i] = Indexes->getMBBEndIdx(&MBB);
>>> + }
>>> if (!Starts[i].isValid())
>>> continue;
>>>
>>> @@ -684,7 +987,6 @@ bool StackColoring::runOnMachineFunction
>>> return false;
>>>
>>> SmallVector<int, 8> SortedSlots;
>>> -
>>> SortedSlots.reserve(NumSlots);
>>> Intervals.reserve(NumSlots);
>>>
>>> @@ -717,9 +1019,12 @@ bool StackColoring::runOnMachineFunction
>>>
>>> // Calculate the liveness of each block.
>>> calculateLocalLiveness();
>>> + DEBUG(dbgs() << "Dataflow iterations: " << NumIterations << "\n");
>>> + DEBUG(dump());
>>>
>>> // Propagate the liveness information.
>>> calculateLiveIntervals(NumSlots);
>>> + DEBUG(dumpIntervals());
>>>
>>> // Search for allocas which are used outside of the declared lifetime
>>> // markers.
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/StackColoring.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/StackColoring.ll?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/StackColoring.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/StackColoring.ll Tue May 24 08:23:44 2016
>>> @@ -87,7 +87,7 @@ bb3:
>>> }
>>>
>>> ;CHECK-LABEL: myCall_w4:
>>> -;YESCOLOR: subq $200, %rsp
>>> +;YESCOLOR: subq $120, %rsp
>>> ;NOCOLOR: subq $408, %rsp
>>>
>>> define i32 @myCall_w4(i32 %in) {
>>> @@ -217,7 +217,7 @@ bb3:
>>>
>>>
>>> ;CHECK-LABEL: myCall2_nostart:
>>> -;YESCOLOR: subq $144, %rsp
>>> +;YESCOLOR: subq $272, %rsp
>>> ;NOCOLOR: subq $272, %rsp
>>> define i32 @myCall2_nostart(i32 %in, i1 %d) {
>>> entry:
>>> @@ -425,6 +425,120 @@ define i32 @shady_range(i32 %argc, i8**
>>> ret i32 9
>>> }
>>>
>>> +; In this case 'itar1' and 'itar2' can't be overlapped if we treat
>>> +; lifetime.start as the beginning of the lifetime, but we can
>>> +; overlap if we consider first use of the slot as lifetime
>>> +; start. See llvm bug 25776.
>>> +
>>> +;CHECK-LABEL: ifthen_twoslots:
>>> +;YESCOLOR: subq $536, %rsp
>>> +;NOCOLOR: subq $1048, %rsp
>>> +
>>> +define i32 @ifthen_twoslots(i32 %x) #0 {
>>> +entry:
>>> + %retval = alloca i32, align 4
>>> + %x.addr = alloca i32, align 4
>>> + %itar1 = alloca [128 x i32], align 16
>>> + %itar2 = alloca [128 x i32], align 16
>>> + %cleanup.dest.slot = alloca i32
>>> + store i32 %x, i32* %x.addr, align 4
>>> + %itar1_start_8 = bitcast [128 x i32]* %itar1 to i8*
>>> + call void @llvm.lifetime.start(i64 512, i8* %itar1_start_8) #3
>>> + %itar2_start_8 = bitcast [128 x i32]* %itar2 to i8*
>>> + call void @llvm.lifetime.start(i64 512, i8* %itar2_start_8) #3
>>> + %xval = load i32, i32* %x.addr, align 4
>>> + %and = and i32 %xval, 1
>>> + %tobool = icmp ne i32 %and, 0
>>> + br i1 %tobool, label %if.then, label %if.else
>>> +
>>> +if.then: ; preds = %entry
>>> + %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]*
>>> %itar1, i32 0, i32 0
>>> + call void @inita(i32* %arraydecay)
>>> + store i32 1, i32* %retval, align 4
>>> + store i32 1, i32* %cleanup.dest.slot, align 4
>>> + %itar2_end_8 = bitcast [128 x i32]* %itar2 to i8*
>>> + call void @llvm.lifetime.end(i64 512, i8* %itar2_end_8) #3
>>> + %itar1_end_8 = bitcast [128 x i32]* %itar1 to i8*
>>> + call void @llvm.lifetime.end(i64 512, i8* %itar1_end_8) #3
>>> + br label %cleanup
>>> +
>>> +if.else: ; preds = %entry
>>> + %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]*
>>> %itar2, i32 0, i32 0
>>> + call void @inita(i32* %arraydecay1)
>>> + store i32 0, i32* %retval, align 4
>>> + store i32 1, i32* %cleanup.dest.slot, align 4
>>> + %itar2_end2_8 = bitcast [128 x i32]* %itar2 to i8*
>>> + call void @llvm.lifetime.end(i64 512, i8* %itar2_end2_8) #3
>>> + %itar1_end2_8 = bitcast [128 x i32]* %itar1 to i8*
>>> + call void @llvm.lifetime.end(i64 512, i8* %itar1_end2_8) #3
>>> + br label %cleanup
>>> +
>>> +cleanup: ; preds = %if.else,
>>> %if.then
>>> + %final_retval = load i32,
>>> + i32* %retval, align 4
>>> + ret i32 %final_retval
>>> +}
>>> +
>>> +; This function is intended to test the case where you
>>> +; have a reference to a stack slot that lies outside of
>>> +; the START/END lifetime markers-- the flow analysis
>>> +; should catch this and build the lifetime based on the
>>> +; markers only.
>>> +
>>> +;CHECK-LABEL: while_loop:
>>> +;YESCOLOR: subq $1032, %rsp
>>> +;NOCOLOR: subq $1544, %rsp
>>> +
>>> +define i32 @while_loop(i32 %x) #0 {
>>> +entry:
>>> + %b1 = alloca [128 x i32], align 16
>>> + %b2 = alloca [128 x i32], align 16
>>> + %b3 = alloca [128 x i32], align 16
>>> + %tmp = bitcast [128 x i32]* %b1 to i8*
>>> + call void @llvm.lifetime.start(i64 512, i8* %tmp) #3
>>> + %tmp1 = bitcast [128 x i32]* %b2 to i8*
>>> + call void @llvm.lifetime.start(i64 512, i8* %tmp1) #3
>>> + %and = and i32 %x, 1
>>> + %tobool = icmp eq i32 %and, 0
>>> + br i1 %tobool, label %if.else, label %if.then
>>> +
>>> +if.then: ; preds = %entry
>>> + %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %b2,
>>> i64 0, i64 0
>>> + call void @inita(i32* %arraydecay) #3
>>> + br label %if.end
>>> +
>>> +if.else: ; preds = %entry
>>> + %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %b1,
>>> i64 0, i64 0
>>> + call void @inita(i32* %arraydecay1) #3
>>> + %arraydecay3 = getelementptr inbounds [128 x i32], [128 x i32]* %b3,
>>> i64 0, i64 0
>>> + call void @inita(i32* %arraydecay3) #3
>>> + %tobool25 = icmp eq i32 %x, 0
>>> + br i1 %tobool25, label %if.end, label %while.body.lr.ph
>>> +
>>> +while.body.lr.ph: ; preds = %if.else
>>> + %tmp2 = bitcast [128 x i32]* %b3 to i8*
>>> + br label %while.body
>>> +
>>> +while.body: ; preds = %
>>> while.body.lr.ph, %while.body
>>> + %x.addr.06 = phi i32 [ %x, %while.body.lr.ph ], [ %dec, %while.body ]
>>> + %dec = add nsw i32 %x.addr.06, -1
>>> + call void @llvm.lifetime.start(i64 512, i8* %tmp2) #3
>>> + call void @inita(i32* %arraydecay3) #3
>>> + call void @llvm.lifetime.end(i64 512, i8* %tmp2) #3
>>> + %tobool2 = icmp eq i32 %dec, 0
>>> + br i1 %tobool2, label %if.end.loopexit, label %while.body
>>> +
>>> +if.end.loopexit: ; preds = %while.body
>>> + br label %if.end
>>> +
>>> +if.end: ; preds =
>>> %if.end.loopexit, %if.else, %if.then
>>> + call void @llvm.lifetime.end(i64 512, i8* %tmp1) #3
>>> + call void @llvm.lifetime.end(i64 512, i8* %tmp) #3
>>> + ret i32 0
>>> +}
>>> +
>>> +declare void @inita(i32*) #2
>>> +
>>> declare void @bar([100 x i32]* , [100 x i32]*) nounwind
>>>
>>> declare void @llvm.lifetime.start(i64, i8* nocapture) nounwind
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll Tue May 24
>>> 08:23:44 2016
>>> @@ -155,6 +155,7 @@ entry:
>>> %ref.tmp.i = alloca
>>> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", align 8
>>> %Op.i = alloca %"class.llvm::SDValue.3.603.963.1923.2043.2283.4083",
>>> align 8
>>> %0 = bitcast
>>> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i to i8*
>>> + call void @llvm.lifetime.start(i64 24, i8* %0) #1
>>> %retval.sroa.0.0.idx.i36 = getelementptr inbounds
>>> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199",
>>> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i, i64
>>> 0, i32 1, i32 0, i32 0
>>> %retval.sroa.0.0.copyload.i37 = load i32, i32*
>>> %retval.sroa.0.0.idx.i36, align 8
>>> call void @llvm.lifetime.end(i64 24, i8* %0) #1
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>> 408-460-2413
>>
>
>
--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413