[llvm] r270559 - Rework/enhance stack coloring data flow analysis.

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Wed May 25 18:55:00 PDT 2016


On Wed, May 25, 2016 at 6:29 PM, Than McIntosh <thanm at google.com> wrote:

> Hello Teresa,
>
> Thanks for the heads-up.  So far I have not seen any other issues.
>
> What is a "ThinLTO bootstrap using gold"?
>

Sorry, I meant a clang/llvm bootstrap built with ThinLTO and the gold linker
(since this was on Linux).

The patch I referenced below just went in, but I will see if I can come up
with something smaller to reproduce.

Teresa


>
> Thanks, Than
>
>
> On Wed, May 25, 2016 at 7:32 PM, Teresa Johnson <tejohnson at google.com>
> wrote:
>
>> Hi Than,
>>
>> I started getting failures running llvm-tblgen yesterday in a ThinLTO
>> bootstrap of clang. I just bisected it to this commit.
>>
>> I'll work on packaging up a reproducer (a ThinLTO bootstrap using gold
>> requires D20559 to fix an issue handling archives, which I should be
>> submitting soon since it just got approved). I had earlier today tracked
>> this down to happening when we import a particular function, presumably
>> exposed with the expanded optimization scope after the importing.
>>
>> I wanted to give you a heads up though in case any other issues have been
>> reported.
>>
>> Thanks
>> Teresa
>>
>> On Tue, May 24, 2016 at 6:23 AM, Than McIntosh via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: thanm
>>> Date: Tue May 24 08:23:44 2016
>>> New Revision: 270559
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=270559&view=rev
>>> Log:
>>> Rework/enhance stack coloring data flow analysis.
>>>
>>> Replace bidirectional flow analysis to compute liveness with forward
>>> analysis pass. Treat lifetimes as starting when there is a first
>>> reference to the stack slot, as opposed to starting at the point of the
>>> lifetime.start intrinsic, so as to increase the number of stack
>>> variables we can overlap.
>>>
>>> Reviewers: gbiv, qcolumbet, wmi
>>> Differential Revision: http://reviews.llvm.org/D18827
>>>
>>> Bug: 25776
>>>
>>> Modified:
>>>     llvm/trunk/lib/CodeGen/StackColoring.cpp
>>>     llvm/trunk/test/CodeGen/X86/StackColoring.ll
>>>     llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
>>>
>>> Modified: llvm/trunk/lib/CodeGen/StackColoring.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/StackColoring.cpp?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/CodeGen/StackColoring.cpp (original)
>>> +++ llvm/trunk/lib/CodeGen/StackColoring.cpp Tue May 24 08:23:44 2016
>>> @@ -64,18 +64,180 @@ DisableColoring("no-stack-coloring",
>>>  /// The user may write code that uses allocas outside of the declared lifetime
>>>  /// zone. This can happen when the user returns a reference to a local
>>>  /// data-structure. We can detect these cases and decide not to optimize the
>>> -/// code. If this flag is enabled, we try to save the user.
>>> +/// code. If this flag is enabled, we try to save the user. This option
>>> +/// is treated as overriding LifetimeStartOnFirstUse below.
>>>  static cl::opt<bool>
>>>  ProtectFromEscapedAllocas("protect-from-escaped-allocas",
>>>                            cl::init(false), cl::Hidden,
>>>                            cl::desc("Do not optimize lifetime zones that "
>>>                                     "are broken"));
>>>
>>> +/// Enable enhanced dataflow scheme for lifetime analysis (treat first
>>> +/// use of stack slot as start of slot lifetime, as opposed to looking
>>> +/// for LIFETIME_START marker). See "Implementation notes" below for
>>> +/// more info.
>>> +static cl::opt<bool>
>>> +LifetimeStartOnFirstUse("stackcoloring-lifetime-start-on-first-use",
>>> +        cl::init(true), cl::Hidden,
>>> +        cl::desc("Treat stack lifetimes as starting on first use, not on START marker."));
>>> +
>>> +
>>>  STATISTIC(NumMarkerSeen,  "Number of lifetime markers found.");
>>>  STATISTIC(StackSpaceSaved, "Number of bytes saved due to merging slots.");
>>>  STATISTIC(StackSlotMerged, "Number of stack slot merged.");
>>>  STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime region");
>>>
>>> +//
>>> +// Implementation Notes:
>>> +// ---------------------
>>> +//
>>> +// Consider the following motivating example:
>>> +//
>>> +//     int foo() {
>>> +//       char b1[1024], b2[1024];
>>> +//       if (...) {
>>> +//         char b3[1024];
>>> +//         <uses of b1, b3>;
>>> +//         return x;
>>> +//       } else {
>>> +//         char b4[1024], b5[1024];
>>> +//         <uses of b2, b4, b5>;
>>> +//         return y;
>>> +//       }
>>> +//     }
>>> +//
>>> +// In the code above, "b3" and "b4" are declared in distinct lexical
>>> +// scopes, meaning that it is easy to prove that they can share the
>>> +// same stack slot. Variables "b1" and "b2" are declared in the same
>>> +// scope, meaning that from a lexical point of view, their lifetimes
>>> +// overlap. From a control flow point of view, however, the two
>>> +// variables are accessed in disjoint regions of the CFG, thus it
>>> +// should be possible for them to share the same stack slot. An ideal
>>> +// stack allocation for the function above would look like:
>>> +//
>>> +//     slot 0: b1, b2
>>> +//     slot 1: b3, b4
>>> +//     slot 2: b5
>>> +//
>>> +// Achieving this allocation is tricky, however, due to the way
>>> +// lifetime markers are inserted. Here is a simplified view of the
>>> +// control flow graph for the code above:
>>> +//
>>> +//                +------  block 0 -------+
>>> +//               0| LIFETIME_START b1, b2 |
>>> +//               1| <test 'if' condition> |
>>> +//                +-----------------------+
>>> +//                   ./              \.
>>> +//   +------  block 1 -------+   +------  block 2 -------+
>>> +//  2| LIFETIME_START b3     |  5| LIFETIME_START b4, b5 |
>>> +//  3| <uses of b1, b3>      |  6| <uses of b2, b4, b5>  |
>>> +//  4| LIFETIME_END b3       |  7| LIFETIME_END b4, b5   |
>>> +//   +-----------------------+   +-----------------------+
>>> +//                   \.              /.
>>> +//                +------  block 3 -------+
>>> +//               8| <cleanupcode>         |
>>> +//               9| LIFETIME_END b1, b2   |
>>> +//              10| return                |
>>> +//                +-----------------------+
>>> +//
>>> +// If we create live intervals for the variables above strictly based
>>> +// on the lifetime markers, we'll get the set of intervals on the
>>> +// left. If we ignore the lifetime start markers and instead treat a
>>> +// variable's lifetime as beginning with the first reference to the
>>> +// var, then we get the intervals on the right.
>>> +//
>>> +//            LIFETIME_START      First Use
>>> +//     b1:    [0,9]               [3,4] [8,9]
>>> +//     b2:    [0,9]               [6,9]
>>> +//     b3:    [2,4]               [3,4]
>>> +//     b4:    [5,7]               [6,7]
>>> +//     b5:    [5,7]               [6,7]
>>> +//
>>> +// For the intervals on the left, the best we can do is overlap two
>>> +// variables (b3 and b4, for example); this gives us a stack size of
>>> +// 4*1024 bytes, not ideal. When treating first-use as the start of a
>>> +// lifetime, we can additionally overlap b1 and b5, giving us a 3*1024
>>> +// byte stack (better).
>>> +//
>>> +// Relying entirely on first-use of stack slots is problematic,
>>> +// however, due to the fact that optimizations can sometimes migrate
>>> +// uses of a variable outside of its lifetime start/end region. Here
>>> +// is an example:
>>> +//
>>> +//     int bar() {
>>> +//       char b1[1024], b2[1024];
>>> +//       if (...) {
>>> +//         <uses of b2>
>>> +//         return y;
>>> +//       } else {
>>> +//         <uses of b1>
>>> +//         while (...) {
>>> +//           char b3[1024];
>>> +//           <uses of b3>
>>> +//         }
>>> +//       }
>>> +//     }
>>> +//
>>> +// Before optimization, the control flow graph for the code above
>>> +// might look like the following:
>>> +//
>>> +//                +------  block 0 -------+
>>> +//               0| LIFETIME_START b1, b2 |
>>> +//               1| <test 'if' condition> |
>>> +//                +-----------------------+
>>> +//                   ./              \.
>>> +//   +------  block 1 -------+    +------- block 2 -------+
>>> +//  2| <uses of b2>          |   3| <uses of b1>          |
>>> +//   +-----------------------+    +-----------------------+
>>> +//              |                            |
>>> +//              |                 +------- block 3 -------+ <-\.
>>> +//              |                4| <while condition>     |    |
>>> +//              |                 +-----------------------+    |
>>> +//              |               /          |                   |
>>> +//              |              /  +------- block 4 -------+
>>> +//              \             /  5| LIFETIME_START b3     |    |
>>> +//               \           /   6| <uses of b3>          |    |
>>> +//                \         /    7| LIFETIME_END b3       |    |
>>> +//                 \        |    +------------------------+    |
>>> +//                  \       |                 \                /
>>> +//                +------  block 5 -----+      \---------------
>>> +//               8| <cleanupcode>       |
>>> +//               9| LIFETIME_END b1, b2 |
>>> +//              10| return              |
>>> +//                +---------------------+
>>> +//
>>> +// During optimization, however, it can happen that an instruction
>>> +// computing an address in "b3" (for example, a loop-invariant GEP) is
>>> +// hoisted up out of the loop from block 4 to block 2.  [Note that
>>> +// this is not an actual load from the stack, only an instruction that
>>> +// computes the address to be loaded]. If this happens, there is now a
>>> +// path leading from the first use of b3 to the return instruction
>>> +// that does not encounter the b3 LIFETIME_END, hence b3's lifetime is
>>> +// now larger than if we were computing live intervals strictly based
>>> +// on lifetime markers. In the example above, this lengthened lifetime
>>> +// would mean that it would appear illegal to overlap b3 with b2.
>>> +//
>>> +// To deal with such cases, the code in ::collectMarkers() below
>>> +// tries to identify "degenerate" slots -- those slots where on a single
>>> +// forward pass through the CFG we encounter a first reference to slot
>>> +// K before we hit the slot K lifetime start marker. For such slots,
>>> +// we fall back on using the lifetime start marker as the beginning of
>>> +// the variable's lifetime.  NB: with this implementation, slots can
>>> +// appear degenerate in cases where there is unstructured control flow:
>>> +//
>>> +//    if (q) goto mid;
>>> +//    if (x > 9) {
>>> +//         int b[100];
>>> +//         memcpy(&b[0], ...);
>>> +//    mid: b[k] = ...;
>>> +//         abc(&b);
>>> +//    }
>>> +//
>>> +// If in the RPO ordering chosen to walk the CFG we happen to visit the b[k]
>>> +// before visiting the memcpy block (which will contain the lifetime start
>>> +// for "b") then it will appear that 'b' has a degenerate lifetime.
>>> +//
>>> +
>>>
>>>  //===----------------------------------------------------------------------===//
>>>  //                           StackColoring Pass
>>>
>>>  //===----------------------------------------------------------------------===//
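To make the interval table in the implementation notes above concrete, here
is a small standalone sketch (not code from the patch). It copies the two
interval columns for b1..b5 and counts how many slots a simple first-fit
assignment needs. The Range/Lifetime types, the helper names and the
first-fit strategy are assumptions made for illustration; the pass itself
merges slots differently, so only the slot counts are meant to line up with
the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A closed range [Lo, Hi] of instruction numbers, as in the table.
struct Range {
  int Lo;
  int Hi;
};

// One variable's lifetime: possibly several disjoint ranges.
using Lifetime = std::vector<Range>;

// True if any range of A intersects any range of B.
bool overlaps(const Lifetime &A, const Lifetime &B) {
  for (const Range &RA : A)
    for (const Range &RB : B)
      if (RA.Lo <= RB.Hi && RB.Lo <= RA.Hi)
        return true;
  return false;
}

// Hypothetical first-fit assignment: place each variable in the first slot
// whose current members it does not overlap. Returns the slot count.
std::size_t countSlotsFirstFit(const std::vector<Lifetime> &Vars) {
  std::vector<std::vector<std::size_t>> Slots; // slot -> variable indices
  for (std::size_t V = 0; V < Vars.size(); ++V) {
    assert(!Vars[V].empty() && "every variable needs at least one range");
    bool Placed = false;
    for (std::vector<std::size_t> &Members : Slots) {
      bool Conflict = false;
      for (std::size_t M : Members) {
        if (overlaps(Vars[V], Vars[M])) {
          Conflict = true;
          break;
        }
      }
      if (!Conflict) {
        Members.push_back(V);
        Placed = true;
        break;
      }
    }
    if (!Placed)
      Slots.push_back(std::vector<std::size_t>{V});
  }
  return Slots.size();
}

// Intervals from the "LIFETIME_START" column, in order b1..b5.
std::vector<Lifetime> markerIntervals() {
  return {Lifetime{Range{0, 9}}, Lifetime{Range{0, 9}}, Lifetime{Range{2, 4}},
          Lifetime{Range{5, 7}}, Lifetime{Range{5, 7}}};
}

// Intervals from the "First Use" column, in order b1..b5.
std::vector<Lifetime> firstUseIntervals() {
  return {Lifetime{Range{3, 4}, Range{8, 9}}, Lifetime{Range{6, 9}},
          Lifetime{Range{3, 4}}, Lifetime{Range{6, 7}}, Lifetime{Range{6, 7}}};
}
```

With the marker-based intervals this needs four slots, and with the
first-use intervals it needs three, which agrees with the 4*1024 versus
3*1024 figures in the comment. The particular pairing it finds may differ
from the one the comment mentions.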
>>> @@ -123,6 +285,17 @@ class StackColoring : public MachineFunc
>>>    /// once the coloring is done.
>>>    SmallVector<MachineInstr*, 8> Markers;
>>>
>>> +  /// Record the FI slots for which we have seen some sort of
>>> +  /// lifetime marker (either start or end).
>>> +  BitVector InterestingSlots;
>>> +
>>> +  /// Degenerate slots -- first use appears outside of start/end
>>> +  /// lifetime markers.
>>> +  BitVector DegenerateSlots;
>>> +
>>> +  /// Number of iterations taken during data flow analysis.
>>> +  unsigned NumIterations;
>>> +
>>>  public:
>>>    static char ID;
>>>    StackColoring() : MachineFunctionPass(ID) {
>>> @@ -153,6 +326,25 @@ private:
>>>    /// in and out blocks.
>>>    void calculateLocalLiveness();
>>>
>>> +  /// Returns TRUE if we're using the first-use-begins-lifetime method for
>>> +  /// this slot (if FALSE, then the start marker is treated as start of lifetime).
>>> +  bool applyFirstUse(int Slot) {
>>> +    if (!LifetimeStartOnFirstUse || ProtectFromEscapedAllocas)
>>> +      return false;
>>> +    if (DegenerateSlots.test(Slot))
>>> +      return false;
>>> +    return true;
>>> +  }
>>> +
>>> +  /// Examines the specified instruction and returns TRUE if the instruction
>>> +  /// represents the start or end of an interesting lifetime. The slot or slots
>>> +  /// starting or ending are added to the vector "slots" and "isStart" is set
>>> +  /// accordingly.
>>> +  /// \returns True if inst contains a lifetime start or end
>>> +  bool isLifetimeStartOrEnd(const MachineInstr &MI,
>>> +                            SmallVector<int, 4> &slots,
>>> +                            bool &isStart);
>>> +
>>>    /// Construct the LiveIntervals for the slots.
>>>    void calculateLiveIntervals(unsigned NumSlots);
>>>
>>> @@ -170,7 +362,10 @@ private:
>>>
>>>    /// Map entries which point to other entries to their destination.
>>>    ///   A->B->C becomes A->C.
>>> -   void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
>>> +  void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
>>> +
>>> +  /// Used in collectMarkers
>>> +  typedef DenseMap<const MachineBasicBlock*, BitVector> BlockBitVecMap;
>>>  };
>>>  } // end anonymous namespace
>>>
>>> @@ -228,9 +423,140 @@ LLVM_DUMP_METHOD void StackColoring::dum
>>>
>>>  #endif // not NDEBUG
>>>
>>> -unsigned StackColoring::collectMarkers(unsigned NumSlot) {
>>> +static inline int getStartOrEndSlot(const MachineInstr &MI)
>>> +{
>>> +  assert((MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> +          MI.getOpcode() == TargetOpcode::LIFETIME_END) &&
>>> +         "Expected LIFETIME_START or LIFETIME_END op");
>>> +  const MachineOperand &MO = MI.getOperand(0);
>>> +  int Slot = MO.getIndex();
>>> +  if (Slot >= 0)
>>> +    return Slot;
>>> +  return -1;
>>> +}
>>> +
>>> +//
>>> +// At the moment the only way to end a variable lifetime is with
>>> +// a LIFETIME_END op (which can't contain a start). If things
>>> +// change and the IR allows for a single inst that both begins
>>> +// and ends lifetime(s), this interface will need to be reworked.
>>> +//
>>> +bool StackColoring::isLifetimeStartOrEnd(const MachineInstr &MI,
>>> +                                         SmallVector<int, 4> &slots,
>>> +                                         bool &isStart)
>>> +{
>>> +  if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> +      MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> +    int Slot = getStartOrEndSlot(MI);
>>> +    if (Slot < 0)
>>> +      return false;
>>> +    if (!InterestingSlots.test(Slot))
>>> +      return false;
>>> +    slots.push_back(Slot);
>>> +    if (MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> +      isStart = false;
>>> +      return true;
>>> +    }
>>> +    if (! applyFirstUse(Slot)) {
>>> +      isStart = true;
>>> +      return true;
>>> +    }
>>> +  } else if (LifetimeStartOnFirstUse && !ProtectFromEscapedAllocas) {
>>> +    if (! MI.isDebugValue()) {
>>> +      bool found = false;
>>> +      for (const MachineOperand &MO : MI.operands()) {
>>> +        if (!MO.isFI())
>>> +          continue;
>>> +        int Slot = MO.getIndex();
>>> +        if (Slot<0)
>>> +          continue;
>>> +        if (InterestingSlots.test(Slot) && applyFirstUse(Slot)) {
>>> +          slots.push_back(Slot);
>>> +          found = true;
>>> +        }
>>> +      }
>>> +      if (found) {
>>> +        isStart = true;
>>> +        return true;
>>> +      }
>>> +    }
>>> +  }
>>> +  return false;
>>> +}
>>> +
>>> +unsigned StackColoring::collectMarkers(unsigned NumSlot)
>>> +{
>>>    unsigned MarkersFound = 0;
>>> -  // Scan the function to find all lifetime markers.
>>> +  BlockBitVecMap SeenStartMap;
>>> +  InterestingSlots.clear();
>>> +  InterestingSlots.resize(NumSlot);
>>> +  DegenerateSlots.clear();
>>> +  DegenerateSlots.resize(NumSlot);
>>> +
>>> +  // Step 1: collect markers and populate the "InterestingSlots"
>>> +  // and "DegenerateSlots" sets.
>>> +  for (MachineBasicBlock *MBB : depth_first(MF)) {
>>> +
>>> +    // Compute the set of slots for which we've seen a START marker but have
>>> +    // not yet seen an END marker at this point in the walk (e.g. on entry
>>> +    // to this bb).
>>> +    BitVector BetweenStartEnd;
>>> +    BetweenStartEnd.resize(NumSlot);
>>> +    for (MachineBasicBlock::const_pred_iterator PI = MBB->pred_begin(),
>>> +             PE = MBB->pred_end(); PI != PE; ++PI) {
>>> +      BlockBitVecMap::const_iterator I = SeenStartMap.find(*PI);
>>> +      if (I != SeenStartMap.end()) {
>>> +        BetweenStartEnd |= I->second;
>>> +      }
>>> +    }
>>> +
>>> +    // Walk the instructions in the block to look for start/end ops.
>>> +    for (MachineInstr &MI : *MBB) {
>>> +      if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
>>> +          MI.getOpcode() == TargetOpcode::LIFETIME_END) {
>>> +        int Slot = getStartOrEndSlot(MI);
>>> +        if (Slot < 0)
>>> +          continue;
>>> +        InterestingSlots.set(Slot);
>>> +        if (MI.getOpcode() == TargetOpcode::LIFETIME_START)
>>> +          BetweenStartEnd.set(Slot);
>>> +        else
>>> +          BetweenStartEnd.reset(Slot);
>>> +        const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
>>> +        if (Allocation) {
>>> +          DEBUG(dbgs() << "Found a lifetime ");
>>> +          DEBUG(dbgs() << (MI.getOpcode() == TargetOpcode::LIFETIME_START
>>> +                               ? "start"
>>> +                               : "end"));
>>> +          DEBUG(dbgs() << " marker for slot #" << Slot);
>>> +          DEBUG(dbgs() << " with allocation: " << Allocation->getName()
>>> +                       << "\n");
>>> +        }
>>> +        Markers.push_back(&MI);
>>> +        MarkersFound += 1;
>>> +      } else {
>>> +        for (const MachineOperand &MO : MI.operands()) {
>>> +          if (!MO.isFI())
>>> +            continue;
>>> +          int Slot = MO.getIndex();
>>> +          if (Slot < 0)
>>> +            continue;
>>> +          if (! BetweenStartEnd.test(Slot)) {
>>> +            DegenerateSlots.set(Slot);
>>> +          }
>>> +        }
>>> +      }
>>> +    }
>>> +    BitVector &SeenStart = SeenStartMap[MBB];
>>> +    SeenStart |= BetweenStartEnd;
>>> +  }
>>> +  if (!MarkersFound) {
>>> +    return 0;
>>> +  }
>>> +  DEBUG(dumpBV("Degenerate slots", DegenerateSlots));
>>> +
>>> +  // Step 2: compute begin/end sets for each block
>>> +
>>>    // NOTE: We use a reverse-post-order iteration to ensure that we obtain a
>>>    // deterministic numbering, and because we'll need a post-order iteration
>>>    // later for solving the liveness dataflow problem.
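A rough standalone model of the "degenerate slot" check in step 1 above may
help when reading it. This is a sketch only: Op, Inst, Block and
findDegenerateSlots are hypothetical names, blocks are visited in index
order as a stand-in for the pass's CFG walk, and plain std::vector<bool> is
used in place of BitVector. hoistedGepExample() encodes my reading of the
bar() example from the implementation notes, with the address computation
for b3 already hoisted into block 2 (slots 0, 1, 2 stand for b1, b2, b3).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for LIFETIME_START, LIFETIME_END and any other
// instruction that references a stack slot.
enum class Op { Start, End, Use };

struct Inst {
  Op Kind;
  std::size_t Slot;
};

struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  std::vector<Inst> Insts;
};

// Returns one flag per slot: true if some use of the slot was seen while the
// walk was not between a START and an END marker for that slot.
std::vector<bool> findDegenerateSlots(const std::vector<Block> &Blocks,
                                      std::size_t NumSlots) {
  std::vector<bool> Degenerate(NumSlots, false);
  // Slots still between START and END at each block's exit; an entry stays
  // empty until its block has been visited.
  std::vector<std::vector<bool>> SeenStart(Blocks.size());
  for (std::size_t B = 0; B < Blocks.size(); ++B) {
    // Union the exit state of every predecessor visited so far.
    std::vector<bool> Between(NumSlots, false);
    for (std::size_t P : Blocks[B].Preds) {
      assert(P < Blocks.size() && "bad predecessor index");
      if (SeenStart[P].empty())
        continue; // predecessor not visited yet (e.g. a loop back edge)
      for (std::size_t S = 0; S < NumSlots; ++S)
        if (SeenStart[P][S])
          Between[S] = true;
    }
    for (const Inst &I : Blocks[B].Insts) {
      assert(I.Slot < NumSlots && "bad slot index");
      if (I.Kind == Op::Start)
        Between[I.Slot] = true;
      else if (I.Kind == Op::End)
        Between[I.Slot] = false;
      else if (!Between[I.Slot])
        Degenerate[I.Slot] = true; // use outside of any START..END region
    }
    SeenStart[B] = Between;
  }
  return Degenerate;
}

// The bar() CFG after the hoist: block 2 now references b3 (slot 2) before
// block 4's LIFETIME_START for it.
std::vector<Block> hoistedGepExample() {
  std::vector<Block> Blocks(6);
  Blocks[0].Insts = {{Op::Start, 0}, {Op::Start, 1}};
  Blocks[1].Preds = {0};
  Blocks[1].Insts = {{Op::Use, 1}};
  Blocks[2].Preds = {0};
  Blocks[2].Insts = {{Op::Use, 0}, {Op::Use, 2}};
  Blocks[3].Preds = {2, 4};
  Blocks[4].Preds = {3};
  Blocks[4].Insts = {{Op::Start, 2}, {Op::Use, 2}, {Op::End, 2}};
  Blocks[5].Preds = {1, 3};
  Blocks[5].Insts = {{Op::End, 0}, {Op::End, 1}};
  return Blocks;
}
```

In this model only slot 2 (b3) comes out degenerate, so it would fall back
to its LIFETIME_START marker while b1 and b2 could still use first-use
lifetimes.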
>>> @@ -246,37 +572,33 @@ unsigned StackColoring::collectMarkers(u
>>>      BlockInfo.Begin.resize(NumSlot);
>>>      BlockInfo.End.resize(NumSlot);
>>>
>>> +    SmallVector<int, 4> slots;
>>>      for (MachineInstr &MI : *MBB) {
>>> -      if (MI.getOpcode() != TargetOpcode::LIFETIME_START &&
>>> -          MI.getOpcode() != TargetOpcode::LIFETIME_END)
>>> -        continue;
>>> -
>>> -      bool IsStart = MI.getOpcode() == TargetOpcode::LIFETIME_START;
>>> -      const MachineOperand &MO = MI.getOperand(0);
>>> -      int Slot = MO.getIndex();
>>> -      if (Slot < 0)
>>> -        continue;
>>> -
>>> -      Markers.push_back(&MI);
>>> -
>>> -      MarkersFound++;
>>> -
>>> -      const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
>>> -      if (Allocation) {
>>> -        DEBUG(dbgs()<<"Found a lifetime marker for slot #"<<Slot<<
>>> -              " with allocation: "<< Allocation->getName()<<"\n");
>>> -      }
>>> -
>>> -      if (IsStart) {
>>> -        BlockInfo.Begin.set(Slot);
>>> -      } else {
>>> -        if (BlockInfo.Begin.test(Slot)) {
>>> -          // Allocas that start and end within a single block are handled
>>> -          // specially when computing the LiveIntervals to avoid pessimizing
>>> -          // the liveness propagation.
>>> -          BlockInfo.Begin.reset(Slot);
>>> -        } else {
>>> +      bool isStart = false;
>>> +      slots.clear();
>>> +      if (isLifetimeStartOrEnd(MI, slots, isStart)) {
>>> +        if (!isStart) {
>>> +          assert(slots.size() == 1 && "unexpected: MI ends multiple slots");
>>> +          int Slot = slots[0];
>>> +          if (BlockInfo.Begin.test(Slot)) {
>>> +            BlockInfo.Begin.reset(Slot);
>>> +          }
>>>            BlockInfo.End.set(Slot);
>>> +        } else {
>>> +          for (auto Slot : slots) {
>>> +            DEBUG(dbgs() << "Found a use of slot #" << Slot);
>>> +            DEBUG(dbgs() << " at BB#" << MBB->getNumber() << " index ");
>>> +            DEBUG(Indexes->getInstructionIndex(MI).print(dbgs()));
>>> +            const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
>>> +            if (Allocation) {
>>> +              DEBUG(dbgs() << " with allocation: "<< Allocation->getName());
>>> +            }
>>> +            DEBUG(dbgs() << "\n");
>>> +            if (BlockInfo.End.test(Slot)) {
>>> +              BlockInfo.End.reset(Slot);
>>> +            }
>>> +            BlockInfo.Begin.set(Slot);
>>> +          }
>>>          }
>>>        }
>>>      }
>>> @@ -287,90 +609,56 @@ unsigned StackColoring::collectMarkers(u
>>>    return MarkersFound;
>>>  }
>>>
>>> -void StackColoring::calculateLocalLiveness() {
>>> -  // Perform a standard reverse dataflow computation to solve for
>>> -  // global liveness.  The BEGIN set here is equivalent to KILL in the standard
>>> -  // formulation, and END is equivalent to GEN.  The result of this computation
>>> -  // is a map from blocks to bitvectors where the bitvectors represent which
>>> -  // allocas are live in/out of that block.
>>> -  SmallPtrSet<const MachineBasicBlock*, 8> BBSet(BasicBlockNumbering.begin(),
>>> -                                                 BasicBlockNumbering.end());
>>> -  unsigned NumSSMIters = 0;
>>> +void StackColoring::calculateLocalLiveness()
>>> +{
>>> +  unsigned NumIters = 0;
>>>    bool changed = true;
>>>    while (changed) {
>>>      changed = false;
>>> -    ++NumSSMIters;
>>> -
>>> -    SmallPtrSet<const MachineBasicBlock*, 8> NextBBSet;
>>> +    ++NumIters;
>>>
>>>      for (const MachineBasicBlock *BB : BasicBlockNumbering) {
>>> -      if (!BBSet.count(BB)) continue;
>>>
>>>        // Use an iterator to avoid repeated lookups.
>>>        LivenessMap::iterator BI = BlockLiveness.find(BB);
>>>        assert(BI != BlockLiveness.end() && "Block not found");
>>>        BlockLifetimeInfo &BlockInfo = BI->second;
>>>
>>> +      // Compute LiveIn by unioning together the LiveOut sets of all preds.
>>>        BitVector LocalLiveIn;
>>> -      BitVector LocalLiveOut;
>>> -
>>> -      // Forward propagation from begins to ends.
>>>        for (MachineBasicBlock::const_pred_iterator PI = BB->pred_begin(),
>>>             PE = BB->pred_end(); PI != PE; ++PI) {
>>>          LivenessMap::const_iterator I = BlockLiveness.find(*PI);
>>>          assert(I != BlockLiveness.end() && "Predecessor not found");
>>>          LocalLiveIn |= I->second.LiveOut;
>>>        }
>>> -      LocalLiveIn |= BlockInfo.End;
>>> -      LocalLiveIn.reset(BlockInfo.Begin);
>>> -
>>> -      // Reverse propagation from ends to begins.
>>> -      for (MachineBasicBlock::const_succ_iterator SI = BB->succ_begin(),
>>> -           SE = BB->succ_end(); SI != SE; ++SI) {
>>> -        LivenessMap::const_iterator I = BlockLiveness.find(*SI);
>>> -        assert(I != BlockLiveness.end() && "Successor not found");
>>> -        LocalLiveOut |= I->second.LiveIn;
>>> -      }
>>> -      LocalLiveOut |= BlockInfo.Begin;
>>> -      LocalLiveOut.reset(BlockInfo.End);
>>> -
>>> -      LocalLiveIn |= LocalLiveOut;
>>> -      LocalLiveOut |= LocalLiveIn;
>>>
>>> -      // After adopting the live bits, we need to turn-off the bits which
>>> -      // are de-activated in this block.
>>> +      // Compute LiveOut by subtracting out lifetimes that end in this
>>> +      // block, then adding in lifetimes that begin in this block.  If
>>> +      // we have both BEGIN and END markers in the same basic block
>>> +      // then we know that the BEGIN marker comes after the END,
>>> +      // because we already handle the case where the BEGIN comes
>>> +      // before the END when collecting the markers (and building the
>>> +      // BEGIN/END vectors).
>>> +      BitVector LocalLiveOut = LocalLiveIn;
>>>        LocalLiveOut.reset(BlockInfo.End);
>>> -      LocalLiveIn.reset(BlockInfo.Begin);
>>> -
>>> -      // If we have both BEGIN and END markers in the same basic block then
>>> -      // we know that the BEGIN marker comes after the END, because we already
>>> -      // handle the case where the BEGIN comes before the END when collecting
>>> -      // the markers (and building the BEGIN/END vectore).
>>> -      // Want to enable the LIVE_IN and LIVE_OUT of slots that have both
>>> -      // BEGIN and END because it means that the value lives before and after
>>> -      // this basic block.
>>> -      BitVector LocalEndBegin = BlockInfo.End;
>>> -      LocalEndBegin &= BlockInfo.Begin;
>>> -      LocalLiveIn |= LocalEndBegin;
>>> -      LocalLiveOut |= LocalEndBegin;
>>> +      LocalLiveOut |= BlockInfo.Begin;
>>>
>>> +      // Update block LiveIn set, noting whether it has changed.
>>>        if (LocalLiveIn.test(BlockInfo.LiveIn)) {
>>>          changed = true;
>>>          BlockInfo.LiveIn |= LocalLiveIn;
>>> -
>>> -        NextBBSet.insert(BB->pred_begin(), BB->pred_end());
>>>        }
>>>
>>> +      // Update block LiveOut set, noting whether it has changed.
>>>        if (LocalLiveOut.test(BlockInfo.LiveOut)) {
>>>          changed = true;
>>>          BlockInfo.LiveOut |= LocalLiveOut;
>>> -
>>> -        NextBBSet.insert(BB->succ_begin(), BB->succ_end());
>>>        }
>>>      }
>>> -
>>> -    BBSet = std::move(NextBBSet);
>>>    }// while changed.
>>> +
>>> +  NumIterations = NumIters;
>>>  }
>>>
>>>  void StackColoring::calculateLiveIntervals(unsigned NumSlots) {
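For reference, the forward-only dataflow in the new calculateLocalLiveness()
can be modelled outside of LLVM roughly as follows. This is a minimal
sketch, not the pass's code: BlockInfo, solveForwardLiveness and
diamondExample are hypothetical names, and slot sets are 32-bit masks rather
than BitVectors. The Begin/End sets in diamondExample() are my reading of
what collectMarkers would produce for the foo() example with first-use
lifetimes (a slot that both begins and ends in one block keeps only its End
bit).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-block sets of slots, one bit per slot.
struct BlockInfo {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  std::uint32_t Begin = 0;        // lifetimes that begin in this block
  std::uint32_t End = 0;          // lifetimes that end in this block
  std::uint32_t LiveIn = 0;
  std::uint32_t LiveOut = 0;
};

// Iterate to a fixed point:
//   LiveIn  = union of LiveOut over all predecessors
//   LiveOut = (LiveIn - End) | Begin
// Returns the number of passes over the block list.
unsigned solveForwardLiveness(std::vector<BlockInfo> &Blocks) {
  unsigned Iters = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++Iters;
    for (BlockInfo &B : Blocks) {
      std::uint32_t In = 0;
      for (std::size_t P : B.Preds) {
        assert(P < Blocks.size() && "bad predecessor index");
        In |= Blocks[P].LiveOut;
      }
      std::uint32_t Out = (In & ~B.End) | B.Begin;
      // Sets only ever grow, so "has new bits" is the change test.
      if ((In & ~B.LiveIn) != 0) {
        Changed = true;
        B.LiveIn |= In;
      }
      if ((Out & ~B.LiveOut) != 0) {
        Changed = true;
        B.LiveOut |= Out;
      }
    }
  }
  return Iters;
}

// Bit assignments for the five buffers in foo().
constexpr std::uint32_t B1 = 1u << 0, B2 = 1u << 1, B3 = 1u << 2,
                        B4 = 1u << 3, B5 = 1u << 4;

// The diamond CFG from foo(): block 0 branches to blocks 1 and 2, which
// both fall into block 3.
std::vector<BlockInfo> diamondExample() {
  std::vector<BlockInfo> Blocks(4);
  Blocks[1].Preds = {0};
  Blocks[1].Begin = B1; // first use of b1; b3 begins and ends here
  Blocks[1].End = B3;
  Blocks[2].Preds = {0};
  Blocks[2].Begin = B2; // first use of b2; b4 and b5 begin and end here
  Blocks[2].End = B4 | B5;
  Blocks[3].Preds = {1, 2};
  Blocks[3].End = B1 | B2;
  return Blocks;
}
```

In this model b1 is live out of block 1 only and b2 out of block 2 only;
both are live into block 3 and neither is live out of it.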
>>> @@ -385,29 +673,22 @@ void StackColoring::calculateLiveInterva
>>>      Finishes.clear();
>>>      Finishes.resize(NumSlots);
>>>
>>> -    // Create the interval for the basic blocks with lifetime markers in them.
>>> -    for (const MachineInstr *MI : Markers) {
>>> -      if (MI->getParent() != &MBB)
>>> -        continue;
>>> +    // Create the interval for the basic blocks containing lifetime begin/end.
>>> +    for (const MachineInstr &MI : MBB) {
>>>
>>> -      assert((MI->getOpcode() == TargetOpcode::LIFETIME_START ||
>>> -              MI->getOpcode() == TargetOpcode::LIFETIME_END) &&
>>> -             "Invalid Lifetime marker");
>>> -
>>> -      bool IsStart = MI->getOpcode() == TargetOpcode::LIFETIME_START;
>>> -      const MachineOperand &Mo = MI->getOperand(0);
>>> -      int Slot = Mo.getIndex();
>>> -      if (Slot < 0)
>>> +      SmallVector<int, 4> slots;
>>> +      bool IsStart = false;
>>> +      if (!isLifetimeStartOrEnd(MI, slots, IsStart))
>>>          continue;
>>> -
>>> -      SlotIndex ThisIndex = Indexes->getInstructionIndex(*MI);
>>> -
>>> -      if (IsStart) {
>>> -        if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
>>> -          Starts[Slot] = ThisIndex;
>>> -      } else {
>>> -        if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
>>> -          Finishes[Slot] = ThisIndex;
>>> +      SlotIndex ThisIndex = Indexes->getInstructionIndex(MI);
>>> +      for (auto Slot : slots) {
>>> +        if (IsStart) {
>>> +          if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
>>> +            Starts[Slot] = ThisIndex;
>>> +        } else {
>>> +          if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
>>> +            Finishes[Slot] = ThisIndex;
>>> +        }
>>>        }
>>>      }
>>>
>>> @@ -423,7 +704,29 @@ void StackColoring::calculateLiveInterva
>>>      }
>>>
>>>      for (unsigned i = 0; i < NumSlots; ++i) {
>>> -      assert(Starts[i].isValid() == Finishes[i].isValid() && "Unmatched range");
>>> +      //
>>> +      // When LifetimeStartOnFirstUse is turned on, data flow analysis
>>> +      // is forward (from starts to ends), not bidirectional. A
>>> +      // consequence of this is that we can wind up in situations
>>> +      // where Starts[i] is invalid but Finishes[i] is valid and vice
>>> +      // versa. Example:
>>> +      //
>>> +      //     LIFETIME_START x
>>> +      //     if (...) {
>>> +      //       <use of x>
>>> +      //       throw ...;
>>> +      //     }
>>> +      //     LIFETIME_END x
>>> +      //     return 2;
>>> +      //
>>> +      //
>>> +      // Here the slot for "x" will not be live into the block
>>> +      // containing the "return 2" (since lifetimes start with first
>>> +      // use, not at the dominating LIFETIME_START marker).
>>> +      //
>>> +      if (Starts[i].isValid() && !Finishes[i].isValid()) {
>>> +        Finishes[i] = Indexes->getMBBEndIdx(&MBB);
>>> +      }
>>>        if (!Starts[i].isValid())
>>>          continue;
>>>
>>> @@ -684,7 +987,6 @@ bool StackColoring::runOnMachineFunction
>>>      return false;
>>>
>>>    SmallVector<int, 8> SortedSlots;
>>> -
>>>    SortedSlots.reserve(NumSlots);
>>>    Intervals.reserve(NumSlots);
>>>
>>> @@ -717,9 +1019,12 @@ bool StackColoring::runOnMachineFunction
>>>
>>>    // Calculate the liveness of each block.
>>>    calculateLocalLiveness();
>>> +  DEBUG(dbgs() << "Dataflow iterations: " << NumIterations << "\n");
>>> +  DEBUG(dump());
>>>
>>>    // Propagate the liveness information.
>>>    calculateLiveIntervals(NumSlots);
>>> +  DEBUG(dumpIntervals());
>>>
>>>    // Search for allocas which are used outside of the declared lifetime
>>>    // markers.
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/StackColoring.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/StackColoring.ll?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/StackColoring.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/StackColoring.ll Tue May 24 08:23:44 2016
>>> @@ -87,7 +87,7 @@ bb3:
>>>  }
>>>
>>>  ;CHECK-LABEL: myCall_w4:
>>> -;YESCOLOR: subq  $200, %rsp
>>> +;YESCOLOR: subq  $120, %rsp
>>>  ;NOCOLOR: subq  $408, %rsp
>>>
>>>  define i32 @myCall_w4(i32 %in) {
>>> @@ -217,7 +217,7 @@ bb3:
>>>
>>>
>>>  ;CHECK-LABEL: myCall2_nostart:
>>> -;YESCOLOR: subq  $144, %rsp
>>> +;YESCOLOR: subq  $272, %rsp
>>>  ;NOCOLOR: subq  $272, %rsp
>>>  define i32 @myCall2_nostart(i32 %in, i1 %d) {
>>>  entry:
>>> @@ -425,6 +425,120 @@ define i32 @shady_range(i32 %argc, i8**
>>>    ret i32 9
>>>  }
>>>
>>> +; In this case 'itar1' and 'itar2' can't be overlapped if we treat
>>> +; lifetime.start as the beginning of the lifetime, but we can
>>> +; overlap if we consider first use of the slot as lifetime
>>> +; start. See llvm bug 25776.
>>> +
>>> +;CHECK-LABEL: ifthen_twoslots:
>>> +;YESCOLOR: subq  $536, %rsp
>>> +;NOCOLOR: subq  $1048, %rsp
>>> +
>>> +define i32 @ifthen_twoslots(i32 %x) #0 {
>>> +entry:
>>> +  %retval = alloca i32, align 4
>>> +  %x.addr = alloca i32, align 4
>>> +  %itar1 = alloca [128 x i32], align 16
>>> +  %itar2 = alloca [128 x i32], align 16
>>> +  %cleanup.dest.slot = alloca i32
>>> +  store i32 %x, i32* %x.addr, align 4
>>> +  %itar1_start_8 = bitcast [128 x i32]* %itar1 to i8*
>>> +  call void @llvm.lifetime.start(i64 512, i8* %itar1_start_8) #3
>>> +  %itar2_start_8 = bitcast [128 x i32]* %itar2 to i8*
>>> +  call void @llvm.lifetime.start(i64 512, i8* %itar2_start_8) #3
>>> +  %xval = load i32, i32* %x.addr, align 4
>>> +  %and = and i32 %xval, 1
>>> +  %tobool = icmp ne i32 %and, 0
>>> +  br i1 %tobool, label %if.then, label %if.else
>>> +
>>> +if.then:                                          ; preds = %entry
>>> +  %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %itar1, i32 0, i32 0
>>> +  call void @inita(i32* %arraydecay)
>>> +  store i32 1, i32* %retval, align 4
>>> +  store i32 1, i32* %cleanup.dest.slot, align 4
>>> +  %itar2_end_8 = bitcast [128 x i32]* %itar2 to i8*
>>> +  call void @llvm.lifetime.end(i64 512, i8* %itar2_end_8) #3
>>> +  %itar1_end_8 = bitcast [128 x i32]* %itar1 to i8*
>>> +  call void @llvm.lifetime.end(i64 512, i8* %itar1_end_8) #3
>>> +  br label %cleanup
>>> +
>>> +if.else:                                          ; preds = %entry
>>> +  %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %itar2, i32 0, i32 0
>>> +  call void @inita(i32* %arraydecay1)
>>> +  store i32 0, i32* %retval, align 4
>>> +  store i32 1, i32* %cleanup.dest.slot, align 4
>>> +  %itar2_end2_8 = bitcast [128 x i32]* %itar2 to i8*
>>> +  call void @llvm.lifetime.end(i64 512, i8* %itar2_end2_8) #3
>>> +  %itar1_end2_8 = bitcast [128 x i32]* %itar1 to i8*
>>> +  call void @llvm.lifetime.end(i64 512, i8* %itar1_end2_8) #3
>>> +  br label %cleanup
>>> +
>>> +cleanup:                                          ; preds = %if.else, %if.then
>>> +  %final_retval = load i32, i32* %retval, align 4
>>> +  ret i32 %final_retval
>>> +}
>>> +
>>> +; This function is intended to test the case where you
>>> +; have a reference to a stack slot that lies outside of
>>> +; the START/END lifetime markers-- the flow analysis
>>> +; should catch this and build the lifetime based on the
>>> +; markers only.
>>> +
>>> +;CHECK-LABEL: while_loop:
>>> +;YESCOLOR: subq  $1032, %rsp
>>> +;NOCOLOR: subq  $1544, %rsp
>>> +
>>> +define i32 @while_loop(i32 %x) #0 {
>>> +entry:
>>> +  %b1 = alloca [128 x i32], align 16
>>> +  %b2 = alloca [128 x i32], align 16
>>> +  %b3 = alloca [128 x i32], align 16
>>> +  %tmp = bitcast [128 x i32]* %b1 to i8*
>>> +  call void @llvm.lifetime.start(i64 512, i8* %tmp) #3
>>> +  %tmp1 = bitcast [128 x i32]* %b2 to i8*
>>> +  call void @llvm.lifetime.start(i64 512, i8* %tmp1) #3
>>> +  %and = and i32 %x, 1
>>> +  %tobool = icmp eq i32 %and, 0
>>> +  br i1 %tobool, label %if.else, label %if.then
>>> +
>>> +if.then:                                          ; preds = %entry
>>> +  %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %b2, i64 0, i64 0
>>> +  call void @inita(i32* %arraydecay) #3
>>> +  br label %if.end
>>> +
>>> +if.else:                                          ; preds = %entry
>>> +  %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %b1, i64 0, i64 0
>>> +  call void @inita(i32* %arraydecay1) #3
>>> +  %arraydecay3 = getelementptr inbounds [128 x i32], [128 x i32]* %b3, i64 0, i64 0
>>> +  call void @inita(i32* %arraydecay3) #3
>>> +  %tobool25 = icmp eq i32 %x, 0
>>> +  br i1 %tobool25, label %if.end, label %while.body.lr.ph
>>> +
>>> +while.body.lr.ph:                                 ; preds = %if.else
>>> +  %tmp2 = bitcast [128 x i32]* %b3 to i8*
>>> +  br label %while.body
>>> +
>>> +while.body:                                       ; preds = %while.body.lr.ph, %while.body
>>> +  %x.addr.06 = phi i32 [ %x, %while.body.lr.ph ], [ %dec, %while.body ]
>>> +  %dec = add nsw i32 %x.addr.06, -1
>>> +  call void @llvm.lifetime.start(i64 512, i8* %tmp2) #3
>>> +  call void @inita(i32* %arraydecay3) #3
>>> +  call void @llvm.lifetime.end(i64 512, i8* %tmp2) #3
>>> +  %tobool2 = icmp eq i32 %dec, 0
>>> +  br i1 %tobool2, label %if.end.loopexit, label %while.body
>>> +
>>> +if.end.loopexit:                                  ; preds = %while.body
>>> +  br label %if.end
>>> +
>>> +if.end:                                           ; preds = %if.end.loopexit, %if.else, %if.then
>>> +  call void @llvm.lifetime.end(i64 512, i8* %tmp1) #3
>>> +  call void @llvm.lifetime.end(i64 512, i8* %tmp) #3
>>> +  ret i32 0
>>> +}
>>> +
>>> +declare void @inita(i32*) #2
>>> +
>>>  declare void @bar([100 x i32]* , [100 x i32]*) nounwind
>>>
>>>  declare void @llvm.lifetime.start(i64, i8* nocapture) nounwind
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll?rev=270559&r1=270558&r2=270559&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll Tue May 24 08:23:44 2016
>>> @@ -155,6 +155,7 @@ entry:
>>>    %ref.tmp.i = alloca %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", align 8
>>>    %Op.i = alloca %"class.llvm::SDValue.3.603.963.1923.2043.2283.4083", align 8
>>>    %0 = bitcast %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i to i8*
>>> +  call void @llvm.lifetime.start(i64 24, i8* %0) #1
>>>    %retval.sroa.0.0.idx.i36 = getelementptr inbounds %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i, i64 0, i32 1, i32 0, i32 0
>>>    %retval.sroa.0.0.copyload.i37 = load i32, i32* %retval.sroa.0.0.idx.i36, align 8
>>>    call void @llvm.lifetime.end(i64 24, i8* %0) #1
>>>
>>>
>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413
>>
>
>
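For readers following the quoted ifthen_twoslots test: the comment there says
'itar1' and 'itar2' can only be overlapped if the lifetime is taken to begin at
the first use of the slot rather than at lifetime.start. The check lines agree
with that: 1048 - 536 = 512 bytes, which is exactly one [128 x i32] array. Below
is a rough toy model of that idea in Python. It is not the StackColoring
implementation; the CFG table, the event names and both helper functions are
made up for illustration, and it only handles an acyclic CFG.

```python
# Toy model of the ifthen_twoslots test. Hypothetical names throughout;
# this is a sketch of the idea, not LLVM's StackColoring pass.
CFG = {
    "entry":   {"preds": [],
                "events": [("start", "itar1"), ("start", "itar2")]},
    "if.then": {"preds": ["entry"],
                "events": [("use", "itar1"), ("end", "itar2"), ("end", "itar1")]},
    "if.else": {"preds": ["entry"],
                "events": [("use", "itar2"), ("end", "itar2"), ("end", "itar1")]},
    "cleanup": {"preds": ["if.then", "if.else"], "events": []},
}
ORDER = ["entry", "if.then", "if.else", "cleanup"]  # topological order


def live_points(begin_kind):
    """Map each slot to the set of (block, index) points where it is live.

    begin_kind is "start" if the lifetime.start marker begins the lifetime,
    or "use" if the first reference to the slot begins it.
    """
    live_out = {}
    points = {}
    for block in ORDER:
        # Forward propagation: live-in is the union of predecessors' live-out.
        live = set()
        for pred in CFG[block]["preds"]:
            live |= live_out[pred]
        for i, (kind, slot) in enumerate(CFG[block]["events"]):
            if kind == begin_kind:
                live.add(slot)
            for s in live:
                points.setdefault(s, set()).add((block, i))
            if kind == "end":
                live.discard(slot)
        live_out[block] = set(live)
    return points


def can_overlap(a, b, begin_kind):
    """Two slots may share storage if they are never live at the same point."""
    pts = live_points(begin_kind)
    return not (pts.get(a, set()) & pts.get(b, set()))


print(can_overlap("itar1", "itar2", "start"))
print(can_overlap("itar1", "itar2", "use"))
```

With "start" both slots become live in the entry block and interfere; with
"use" each slot is only live on its own side of the branch, so they can share.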


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413