[llvm] r270559 - Rework/enhance stack coloring data flow analysis.

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Wed May 25 16:32:01 PDT 2016


Hi Than,

I started getting failures running llvm-tblgen yesterday in a ThinLTO
bootstrap of clang. I just bisected it to this commit.

I'll work on packaging up a reproducer (a ThinLTO bootstrap using gold
requires D20559 to fix an issue handling archives, which I should be
submitting soon since it just got approved). I had earlier today tracked
this down to happening when we import a particular function, presumably
exposed with the expanded optimization scope after the importing.

I wanted to give you a heads up though in case any other issues have been
reported.

Thanks
Teresa

On Tue, May 24, 2016 at 6:23 AM, Than McIntosh via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Author: thanm
> Date: Tue May 24 08:23:44 2016
> New Revision: 270559
>
> URL: http://llvm.org/viewvc/llvm-project?rev=270559&view=rev
> Log:
> Rework/enhance stack coloring data flow analysis.
>
> Replace bidirectional flow analysis to compute liveness with forward
> analysis pass. Treat lifetimes as starting when there is a first
> reference to the stack slot, as opposed to starting at the point of the
> lifetime.start intrinsic, so as to increase the number of stack
> variables we can overlap.
>
> Reviewers: gbiv, qcolumbet, wmi
> Differential Revision: http://reviews.llvm.org/D18827
>
> Bug: 25776
>
> Modified:
>     llvm/trunk/lib/CodeGen/StackColoring.cpp
>     llvm/trunk/test/CodeGen/X86/StackColoring.ll
>     llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
>
> Modified: llvm/trunk/lib/CodeGen/StackColoring.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/StackColoring.cpp?rev=270559&r1=270558&r2=270559&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/StackColoring.cpp (original)
> +++ llvm/trunk/lib/CodeGen/StackColoring.cpp Tue May 24 08:23:44 2016
> @@ -64,18 +64,180 @@ DisableColoring("no-stack-coloring",
>  /// The user may write code that uses allocas outside of the declared
> lifetime
>  /// zone. This can happen when the user returns a reference to a local
>  /// data-structure. We can detect these cases and decide not to optimize
> the
> -/// code. If this flag is enabled, we try to save the user.
> +/// code. If this flag is enabled, we try to save the user. This option
> +/// is treated as overriding LifetimeStartOnFirstUse below.
>  static cl::opt<bool>
>  ProtectFromEscapedAllocas("protect-from-escaped-allocas",
>                            cl::init(false), cl::Hidden,
>                            cl::desc("Do not optimize lifetime zones that "
>                                     "are broken"));
>
> +/// Enable enhanced dataflow scheme for lifetime analysis (treat first
> +/// use of stack slot as start of slot lifetime, as opposed to looking
> +/// for LIFETIME_START marker). See "Implementation notes" below for
> +/// more info.
> +static cl::opt<bool>
> +LifetimeStartOnFirstUse("stackcoloring-lifetime-start-on-first-use",
> +        cl::init(true), cl::Hidden,
> +        cl::desc("Treat stack lifetimes as starting on first use, not on
> START marker."));
> +
> +
>  STATISTIC(NumMarkerSeen,  "Number of lifetime markers found.");
>  STATISTIC(StackSpaceSaved, "Number of bytes saved due to merging slots.");
>  STATISTIC(StackSlotMerged, "Number of stack slot merged.");
>  STATISTIC(EscapedAllocas, "Number of allocas that escaped the lifetime
> region");
>
> +//
> +// Implementation Notes:
> +// ---------------------
> +//
> +// Consider the following motivating example:
> +//
> +//     int foo() {
> +//       char b1[1024], b2[1024];
> +//       if (...) {
> +//         char b3[1024];
> +//         <uses of b1, b3>;
> +//         return x;
> +//       } else {
> +//         char b4[1024], b5[1024];
> +//         <uses of b2, b4, b5>;
> +//         return y;
> +//       }
> +//     }
> +//
> +// In the code above, "b3" and "b4" are declared in distinct lexical
> +// scopes, meaning that it is easy to prove that they can share the
> +// same stack slot. Variables "b1" and "b2" are declared in the same
> +// scope, meaning that from a lexical point of view, their lifetimes
> +// overlap. From a control flow pointer of view, however, the two
> +// variables are accessed in disjoint regions of the CFG, thus it
> +// should be possible for them to share the same stack slot. An ideal
> +// stack allocation for the function above would look like:
> +//
> +//     slot 0: b1, b2
> +//     slot 1: b3, b4
> +//     slot 2: b5
> +//
> +// Achieving this allocation is tricky, however, due to the way
> +// lifetime markers are inserted. Here is a simplified view of the
> +// control flow graph for the code above:
> +//
> +//                +------  block 0 -------+
> +//               0| LIFETIME_START b1, b2 |
> +//               1| <test 'if' condition> |
> +//                +-----------------------+
> +//                   ./              \.
> +//   +------  block 1 -------+   +------  block 2 -------+
> +//  2| LIFETIME_START b3     |  5| LIFETIME_START b4, b5 |
> +//  3| <uses of b1, b3>      |  6| <uses of b2, b4, b5>  |
> +//  4| LIFETIME_END b3       |  7| LIFETIME_END b4, b5   |
> +//   +-----------------------+   +-----------------------+
> +//                   \.              /.
> +//                +------  block 3 -------+
> +//               8| <cleanupcode>         |
> +//               9| LIFETIME_END b1, b2   |
> +//              10| return                |
> +//                +-----------------------+
> +//
> +// If we create live intervals for the variables above strictly based
> +// on the lifetime markers, we'll get the set of intervals on the
> +// left. If we ignore the lifetime start markers and instead treat a
> +// variable's lifetime as beginning with the first reference to the
> +// var, then we get the intervals on the right.
> +//
> +//            LIFETIME_START      First Use
> +//     b1:    [0,9]               [3,4] [8,9]
> +//     b2:    [0,9]               [6,9]
> +//     b3:    [2,4]               [3,4]
> +//     b4:    [5,7]               [6,7]
> +//     b5:    [5,7]               [6,7]
> +//
> +// For the intervals on the left, the best we can do is overlap two
> +// variables (b3 and b4, for example); this gives us a stack size of
> +// 4*1024 bytes, not ideal. When treating first-use as the start of a
> +// lifetime, we can additionally overlap b1 and b5, giving us a 3*1024
> +// byte stack (better).
> +//
> +// Relying entirely on first-use of stack slots is problematic,
> +// however, due to the fact that optimizations can sometimes migrate
> +// uses of a variable outside of its lifetime start/end region. Here
> +// is an example:
> +//
> +//     int bar() {
> +//       char b1[1024], b2[1024];
> +//       if (...) {
> +//         <uses of b2>
> +//         return y;
> +//       } else {
> +//         <uses of b1>
> +//         while (...) {
> +//           char b3[1024];
> +//           <uses of b3>
> +//         }
> +//       }
> +//     }
> +//
> +// Before optimization, the control flow graph for the code above
> +// might look like the following:
> +//
> +//                +------  block 0 -------+
> +//               0| LIFETIME_START b1, b2 |
> +//               1| <test 'if' condition> |
> +//                +-----------------------+
> +//                   ./              \.
> +//   +------  block 1 -------+    +------- block 2 -------+
> +//  2| <uses of b2>          |   3| <uses of b1>          |
> +//   +-----------------------+    +-----------------------+
> +//              |                            |
> +//              |                 +------- block 3 -------+ <-\.
> +//              |                4| <while condition>     |    |
> +//              |                 +-----------------------+    |
> +//              |               /          |                   |
> +//              |              /  +------- block 4 -------+
> +//              \             /  5| LIFETIME_START b3     |    |
> +//               \           /   6| <uses of b3>          |    |
> +//                \         /    7| LIFETIME_END b3       |    |
> +//                 \        |    +------------------------+    |
> +//                  \       |                 \                /
> +//                +------  block 5 -----+      \---------------
> +//               8| <cleanupcode>       |
> +//               9| LIFETIME_END b1, b2 |
> +//              10| return              |
> +//                +---------------------+
> +//
> +// During optimization, however, it can happen that an instruction
> +// computing an address in "b3" (for example, a loop-invariant GEP) is
> +// hoisted up out of the loop from block 4 to block 2.  [Note that
> +// this is not an actual load from the stack, only an instruction that
> +// computes the address to be loaded]. If this happens, there is now a
> +// path leading from the first use of b3 to the return instruction
> +// that does not encounter the b3 LIFETIME_END, hence b3's lifetime is
> +// now larger than if we were computing live intervals strictly based
> +// on lifetime markers. In the example above, this lengthened lifetime
> +// would mean that it would appear illegal to overlap b3 with b2.
> +//
> +// To deal with this such cases, the code in ::collectMarkers() below
> +// tries to identify "degenerate" slots -- those slots where on a single
> +// forward pass through the CFG we encounter a first reference to slot
> +// K before we hit the slot K lifetime start marker. For such slots,
> +// we fall back on using the lifetime start marker as the beginning of
> +// the variable's lifetime.  NB: with this implementation, slots can
> +// appear degenerate in cases where there is unstructured control flow:
> +//
> +//    if (q) goto mid;
> +//    if (x > 9) {
> +//         int b[100];
> +//         memcpy(&b[0], ...);
> +//    mid: b[k] = ...;
> +//         abc(&b);
> +//    }
> +//
> +// If in RPO ordering chosen to walk the CFG  we happen to visit the b[k]
> +// before visiting the memcpy block (which will contain the lifetime start
> +// for "b" then it will appear that 'b' has a degenerate lifetime.
> +//
> +
>
>  //===----------------------------------------------------------------------===//
>  //                           StackColoring Pass
>
>  //===----------------------------------------------------------------------===//
> @@ -123,6 +285,17 @@ class StackColoring : public MachineFunc
>    /// once the coloring is done.
>    SmallVector<MachineInstr*, 8> Markers;
>
> +  /// Record the FI slots for which we have seen some sort of
> +  /// lifetime marker (either start or end).
> +  BitVector InterestingSlots;
> +
> +  /// Degenerate slots -- first use appears outside of start/end
> +  /// lifetime markers.
> +  BitVector DegenerateSlots;
> +
> +  /// Number of iterations taken during data flow analysis.
> +  unsigned NumIterations;
> +
>  public:
>    static char ID;
>    StackColoring() : MachineFunctionPass(ID) {
> @@ -153,6 +326,25 @@ private:
>    /// in and out blocks.
>    void calculateLocalLiveness();
>
> +  /// Returns TRUE if we're using the first-use-begins-lifetime method for
> +  /// this slot (if FALSE, then the start marker is treated as start of
> lifetime).
> +  bool applyFirstUse(int Slot) {
> +    if (!LifetimeStartOnFirstUse || ProtectFromEscapedAllocas)
> +      return false;
> +    if (DegenerateSlots.test(Slot))
> +      return false;
> +    return true;
> +  }
> +
> +  /// Examines the specified instruction and returns TRUE if the
> instruction
> +  /// represents the start or end of an interesting lifetime. The slot or
> slots
> +  /// starting or ending are added to the vector "slots" and "isStart" is
> set
> +  /// accordingly.
> +  /// \returns True if inst contains a lifetime start or end
> +  bool isLifetimeStartOrEnd(const MachineInstr &MI,
> +                            SmallVector<int, 4> &slots,
> +                            bool &isStart);
> +
>    /// Construct the LiveIntervals for the slots.
>    void calculateLiveIntervals(unsigned NumSlots);
>
> @@ -170,7 +362,10 @@ private:
>
>    /// Map entries which point to other entries to their destination.
>    ///   A->B->C becomes A->C.
> -   void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
> +  void expungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
> +
> +  /// Used in collectMarkers
> +  typedef DenseMap<const MachineBasicBlock*, BitVector> BlockBitVecMap;
>  };
>  } // end anonymous namespace
>
> @@ -228,9 +423,140 @@ LLVM_DUMP_METHOD void StackColoring::dum
>
>  #endif // not NDEBUG
>
> -unsigned StackColoring::collectMarkers(unsigned NumSlot) {
> +static inline int getStartOrEndSlot(const MachineInstr &MI)
> +{
> +  assert((MI.getOpcode() == TargetOpcode::LIFETIME_START ||
> +          MI.getOpcode() == TargetOpcode::LIFETIME_END) &&
> +         "Expected LIFETIME_START or LIFETIME_END op");
> +  const MachineOperand &MO = MI.getOperand(0);
> +  int Slot = MO.getIndex();
> +  if (Slot >= 0)
> +    return Slot;
> +  return -1;
> +}
> +
> +//
> +// At the moment the only way to end a variable lifetime is with
> +// a VARIABLE_LIFETIME op (which can't contain a start). If things
> +// change and the IR allows for a single inst that both begins
> +// and ends lifetime(s), this interface will need to be reworked.
> +//
> +bool StackColoring::isLifetimeStartOrEnd(const MachineInstr &MI,
> +                                         SmallVector<int, 4> &slots,
> +                                         bool &isStart)
> +{
> +  if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
> +      MI.getOpcode() == TargetOpcode::LIFETIME_END) {
> +    int Slot = getStartOrEndSlot(MI);
> +    if (Slot < 0)
> +      return false;
> +    if (!InterestingSlots.test(Slot))
> +      return false;
> +    slots.push_back(Slot);
> +    if (MI.getOpcode() == TargetOpcode::LIFETIME_END) {
> +      isStart = false;
> +      return true;
> +    }
> +    if (! applyFirstUse(Slot)) {
> +      isStart = true;
> +      return true;
> +    }
> +  } else if (LifetimeStartOnFirstUse && !ProtectFromEscapedAllocas) {
> +    if (! MI.isDebugValue()) {
> +      bool found = false;
> +      for (const MachineOperand &MO : MI.operands()) {
> +        if (!MO.isFI())
> +          continue;
> +        int Slot = MO.getIndex();
> +        if (Slot<0)
> +          continue;
> +        if (InterestingSlots.test(Slot) && applyFirstUse(Slot)) {
> +          slots.push_back(Slot);
> +          found = true;
> +        }
> +      }
> +      if (found) {
> +        isStart = true;
> +        return true;
> +      }
> +    }
> +  }
> +  return false;
> +}
> +
> +unsigned StackColoring::collectMarkers(unsigned NumSlot)
> +{
>    unsigned MarkersFound = 0;
> -  // Scan the function to find all lifetime markers.
> +  BlockBitVecMap SeenStartMap;
> +  InterestingSlots.clear();
> +  InterestingSlots.resize(NumSlot);
> +  DegenerateSlots.clear();
> +  DegenerateSlots.resize(NumSlot);
> +
> +  // Step 1: collect markers and populate the "InterestingSlots"
> +  // and "DegenerateSlots" sets.
> +  for (MachineBasicBlock *MBB : depth_first(MF)) {
> +
> +    // Compute the set of slots for which we've seen a START marker but
> have
> +    // not yet seen an END marker at this point in the walk (e.g. on entry
> +    // to this bb).
> +    BitVector BetweenStartEnd;
> +    BetweenStartEnd.resize(NumSlot);
> +    for (MachineBasicBlock::const_pred_iterator PI = MBB->pred_begin(),
> +             PE = MBB->pred_end(); PI != PE; ++PI) {
> +      BlockBitVecMap::const_iterator I = SeenStartMap.find(*PI);
> +      if (I != SeenStartMap.end()) {
> +        BetweenStartEnd |= I->second;
> +      }
> +    }
> +
> +    // Walk the instructions in the block to look for start/end ops.
> +    for (MachineInstr &MI : *MBB) {
> +      if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
> +          MI.getOpcode() == TargetOpcode::LIFETIME_END) {
> +        int Slot = getStartOrEndSlot(MI);
> +        if (Slot < 0)
> +          continue;
> +        InterestingSlots.set(Slot);
> +        if (MI.getOpcode() == TargetOpcode::LIFETIME_START)
> +          BetweenStartEnd.set(Slot);
> +        else
> +          BetweenStartEnd.reset(Slot);
> +        const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
> +        if (Allocation) {
> +          DEBUG(dbgs() << "Found a lifetime ");
> +          DEBUG(dbgs() << (MI.getOpcode() == TargetOpcode::LIFETIME_START
> +                               ? "start"
> +                               : "end"));
> +          DEBUG(dbgs() << " marker for slot #" << Slot);
> +          DEBUG(dbgs() << " with allocation: " << Allocation->getName()
> +                       << "\n");
> +        }
> +        Markers.push_back(&MI);
> +        MarkersFound += 1;
> +      } else {
> +        for (const MachineOperand &MO : MI.operands()) {
> +          if (!MO.isFI())
> +            continue;
> +          int Slot = MO.getIndex();
> +          if (Slot < 0)
> +            continue;
> +          if (! BetweenStartEnd.test(Slot)) {
> +            DegenerateSlots.set(Slot);
> +          }
> +        }
> +      }
> +    }
> +    BitVector &SeenStart = SeenStartMap[MBB];
> +    SeenStart |= BetweenStartEnd;
> +  }
> +  if (!MarkersFound) {
> +    return 0;
> +  }
> +  DEBUG(dumpBV("Degenerate slots", DegenerateSlots));
> +
> +  // Step 2: compute begin/end sets for each block
> +
>    // NOTE: We use a reverse-post-order iteration to ensure that we obtain
> a
>    // deterministic numbering, and because we'll need a post-order
> iteration
>    // later for solving the liveness dataflow problem.
> @@ -246,37 +572,33 @@ unsigned StackColoring::collectMarkers(u
>      BlockInfo.Begin.resize(NumSlot);
>      BlockInfo.End.resize(NumSlot);
>
> +    SmallVector<int, 4> slots;
>      for (MachineInstr &MI : *MBB) {
> -      if (MI.getOpcode() != TargetOpcode::LIFETIME_START &&
> -          MI.getOpcode() != TargetOpcode::LIFETIME_END)
> -        continue;
> -
> -      bool IsStart = MI.getOpcode() == TargetOpcode::LIFETIME_START;
> -      const MachineOperand &MO = MI.getOperand(0);
> -      int Slot = MO.getIndex();
> -      if (Slot < 0)
> -        continue;
> -
> -      Markers.push_back(&MI);
> -
> -      MarkersFound++;
> -
> -      const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
> -      if (Allocation) {
> -        DEBUG(dbgs()<<"Found a lifetime marker for slot #"<<Slot<<
> -              " with allocation: "<< Allocation->getName()<<"\n");
> -      }
> -
> -      if (IsStart) {
> -        BlockInfo.Begin.set(Slot);
> -      } else {
> -        if (BlockInfo.Begin.test(Slot)) {
> -          // Allocas that start and end within a single block are handled
> -          // specially when computing the LiveIntervals to avoid
> pessimizing
> -          // the liveness propagation.
> -          BlockInfo.Begin.reset(Slot);
> -        } else {
> +      bool isStart = false;
> +      slots.clear();
> +      if (isLifetimeStartOrEnd(MI, slots, isStart)) {
> +        if (!isStart) {
> +          assert(slots.size() == 1 && "unexpected: MI ends multiple
> slots");
> +          int Slot = slots[0];
> +          if (BlockInfo.Begin.test(Slot)) {
> +            BlockInfo.Begin.reset(Slot);
> +          }
>            BlockInfo.End.set(Slot);
> +        } else {
> +          for (auto Slot : slots) {
> +            DEBUG(dbgs() << "Found a use of slot #" << Slot);
> +            DEBUG(dbgs() << " at BB#" << MBB->getNumber() << " index ");
> +            DEBUG(Indexes->getInstructionIndex(MI).print(dbgs()));
> +            const AllocaInst *Allocation = MFI->getObjectAllocation(Slot);
> +            if (Allocation) {
> +              DEBUG(dbgs() << " with allocation: "<<
> Allocation->getName());
> +            }
> +            DEBUG(dbgs() << "\n");
> +            if (BlockInfo.End.test(Slot)) {
> +              BlockInfo.End.reset(Slot);
> +            }
> +            BlockInfo.Begin.set(Slot);
> +          }
>          }
>        }
>      }
> @@ -287,90 +609,56 @@ unsigned StackColoring::collectMarkers(u
>    return MarkersFound;
>  }
>
> -void StackColoring::calculateLocalLiveness() {
> -  // Perform a standard reverse dataflow computation to solve for
> -  // global liveness.  The BEGIN set here is equivalent to KILL in the
> standard
> -  // formulation, and END is equivalent to GEN.  The result of this
> computation
> -  // is a map from blocks to bitvectors where the bitvectors represent
> which
> -  // allocas are live in/out of that block.
> -  SmallPtrSet<const MachineBasicBlock*, 8>
> BBSet(BasicBlockNumbering.begin(),
> -
>  BasicBlockNumbering.end());
> -  unsigned NumSSMIters = 0;
> +void StackColoring::calculateLocalLiveness()
> +{
> +  unsigned NumIters = 0;
>    bool changed = true;
>    while (changed) {
>      changed = false;
> -    ++NumSSMIters;
> -
> -    SmallPtrSet<const MachineBasicBlock*, 8> NextBBSet;
> +    ++NumIters;
>
>      for (const MachineBasicBlock *BB : BasicBlockNumbering) {
> -      if (!BBSet.count(BB)) continue;
>
>        // Use an iterator to avoid repeated lookups.
>        LivenessMap::iterator BI = BlockLiveness.find(BB);
>        assert(BI != BlockLiveness.end() && "Block not found");
>        BlockLifetimeInfo &BlockInfo = BI->second;
>
> +      // Compute LiveIn by unioning together the LiveOut sets of all
> preds.
>        BitVector LocalLiveIn;
> -      BitVector LocalLiveOut;
> -
> -      // Forward propagation from begins to ends.
>        for (MachineBasicBlock::const_pred_iterator PI = BB->pred_begin(),
>             PE = BB->pred_end(); PI != PE; ++PI) {
>          LivenessMap::const_iterator I = BlockLiveness.find(*PI);
>          assert(I != BlockLiveness.end() && "Predecessor not found");
>          LocalLiveIn |= I->second.LiveOut;
>        }
> -      LocalLiveIn |= BlockInfo.End;
> -      LocalLiveIn.reset(BlockInfo.Begin);
> -
> -      // Reverse propagation from ends to begins.
> -      for (MachineBasicBlock::const_succ_iterator SI = BB->succ_begin(),
> -           SE = BB->succ_end(); SI != SE; ++SI) {
> -        LivenessMap::const_iterator I = BlockLiveness.find(*SI);
> -        assert(I != BlockLiveness.end() && "Successor not found");
> -        LocalLiveOut |= I->second.LiveIn;
> -      }
> -      LocalLiveOut |= BlockInfo.Begin;
> -      LocalLiveOut.reset(BlockInfo.End);
> -
> -      LocalLiveIn |= LocalLiveOut;
> -      LocalLiveOut |= LocalLiveIn;
>
> -      // After adopting the live bits, we need to turn-off the bits which
> -      // are de-activated in this block.
> +      // Compute LiveOut by subtracting out lifetimes that end in this
> +      // block, then adding in lifetimes that begin in this block.  If
> +      // we have both BEGIN and END markers in the same basic block
> +      // then we know that the BEGIN marker comes after the END,
> +      // because we already handle the case where the BEGIN comes
> +      // before the END when collecting the markers (and building the
> +      // BEGIN/END vectors).
> +      BitVector LocalLiveOut = LocalLiveIn;
>        LocalLiveOut.reset(BlockInfo.End);
> -      LocalLiveIn.reset(BlockInfo.Begin);
> -
> -      // If we have both BEGIN and END markers in the same basic block
> then
> -      // we know that the BEGIN marker comes after the END, because we
> already
> -      // handle the case where the BEGIN comes before the END when
> collecting
> -      // the markers (and building the BEGIN/END vectore).
> -      // Want to enable the LIVE_IN and LIVE_OUT of slots that have both
> -      // BEGIN and END because it means that the value lives before and
> after
> -      // this basic block.
> -      BitVector LocalEndBegin = BlockInfo.End;
> -      LocalEndBegin &= BlockInfo.Begin;
> -      LocalLiveIn |= LocalEndBegin;
> -      LocalLiveOut |= LocalEndBegin;
> +      LocalLiveOut |= BlockInfo.Begin;
>
> +      // Update block LiveIn set, noting whether it has changed.
>        if (LocalLiveIn.test(BlockInfo.LiveIn)) {
>          changed = true;
>          BlockInfo.LiveIn |= LocalLiveIn;
> -
> -        NextBBSet.insert(BB->pred_begin(), BB->pred_end());
>        }
>
> +      // Update block LiveOut set, noting whether it has changed.
>        if (LocalLiveOut.test(BlockInfo.LiveOut)) {
>          changed = true;
>          BlockInfo.LiveOut |= LocalLiveOut;
> -
> -        NextBBSet.insert(BB->succ_begin(), BB->succ_end());
>        }
>      }
> -
> -    BBSet = std::move(NextBBSet);
>    }// while changed.
> +
> +  NumIterations = NumIters;
>  }
>
>  void StackColoring::calculateLiveIntervals(unsigned NumSlots) {
> @@ -385,29 +673,22 @@ void StackColoring::calculateLiveInterva
>      Finishes.clear();
>      Finishes.resize(NumSlots);
>
> -    // Create the interval for the basic blocks with lifetime markers in
> them.
> -    for (const MachineInstr *MI : Markers) {
> -      if (MI->getParent() != &MBB)
> -        continue;
> +    // Create the interval for the basic blocks containing lifetime
> begin/end.
> +    for (const MachineInstr &MI : MBB) {
>
> -      assert((MI->getOpcode() == TargetOpcode::LIFETIME_START ||
> -              MI->getOpcode() == TargetOpcode::LIFETIME_END) &&
> -             "Invalid Lifetime marker");
> -
> -      bool IsStart = MI->getOpcode() == TargetOpcode::LIFETIME_START;
> -      const MachineOperand &Mo = MI->getOperand(0);
> -      int Slot = Mo.getIndex();
> -      if (Slot < 0)
> +      SmallVector<int, 4> slots;
> +      bool IsStart = false;
> +      if (!isLifetimeStartOrEnd(MI, slots, IsStart))
>          continue;
> -
> -      SlotIndex ThisIndex = Indexes->getInstructionIndex(*MI);
> -
> -      if (IsStart) {
> -        if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
> -          Starts[Slot] = ThisIndex;
> -      } else {
> -        if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
> -          Finishes[Slot] = ThisIndex;
> +      SlotIndex ThisIndex = Indexes->getInstructionIndex(MI);
> +      for (auto Slot : slots) {
> +        if (IsStart) {
> +          if (!Starts[Slot].isValid() || Starts[Slot] > ThisIndex)
> +            Starts[Slot] = ThisIndex;
> +        } else {
> +          if (!Finishes[Slot].isValid() || Finishes[Slot] < ThisIndex)
> +            Finishes[Slot] = ThisIndex;
> +        }
>        }
>      }
>
> @@ -423,7 +704,29 @@ void StackColoring::calculateLiveInterva
>      }
>
>      for (unsigned i = 0; i < NumSlots; ++i) {
> -      assert(Starts[i].isValid() == Finishes[i].isValid() && "Unmatched
> range");
> +      //
> +      // When LifetimeStartOnFirstUse is turned on, data flow analysis
> +      // is forward (from starts to ends), not bidirectional. A
> +      // consequence of this is that we can wind up in situations
> +      // where Starts[i] is invalid but Finishes[i] is valid and vice
> +      // versa. Example:
> +      //
> +      //     LIFETIME_START x
> +      //     if (...) {
> +      //       <use of x>
> +      //       throw ...;
> +      //     }
> +      //     LIFETIME_END x
> +      //     return 2;
> +      //
> +      //
> +      // Here the slot for "x" will not be live into the block
> +      // containing the "return 2" (since lifetimes start with first
> +      // use, not at the dominating LIFETIME_START marker).
> +      //
> +      if (Starts[i].isValid() && !Finishes[i].isValid()) {
> +        Finishes[i] = Indexes->getMBBEndIdx(&MBB);
> +      }
>        if (!Starts[i].isValid())
>          continue;
>
> @@ -684,7 +987,6 @@ bool StackColoring::runOnMachineFunction
>      return false;
>
>    SmallVector<int, 8> SortedSlots;
> -
>    SortedSlots.reserve(NumSlots);
>    Intervals.reserve(NumSlots);
>
> @@ -717,9 +1019,12 @@ bool StackColoring::runOnMachineFunction
>
>    // Calculate the liveness of each block.
>    calculateLocalLiveness();
> +  DEBUG(dbgs() << "Dataflow iterations: " << NumIterations << "\n");
> +  DEBUG(dump());
>
>    // Propagate the liveness information.
>    calculateLiveIntervals(NumSlots);
> +  DEBUG(dumpIntervals());
>
>    // Search for allocas which are used outside of the declared lifetime
>    // markers.
>
> Modified: llvm/trunk/test/CodeGen/X86/StackColoring.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/StackColoring.ll?rev=270559&r1=270558&r2=270559&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/StackColoring.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/StackColoring.ll Tue May 24 08:23:44 2016
> @@ -87,7 +87,7 @@ bb3:
>  }
>
>  ;CHECK-LABEL: myCall_w4:
> -;YESCOLOR: subq  $200, %rsp
> +;YESCOLOR: subq  $120, %rsp
>  ;NOCOLOR: subq  $408, %rsp
>
>  define i32 @myCall_w4(i32 %in) {
> @@ -217,7 +217,7 @@ bb3:
>
>
>  ;CHECK-LABEL: myCall2_nostart:
> -;YESCOLOR: subq  $144, %rsp
> +;YESCOLOR: subq  $272, %rsp
>  ;NOCOLOR: subq  $272, %rsp
>  define i32 @myCall2_nostart(i32 %in, i1 %d) {
>  entry:
> @@ -425,6 +425,120 @@ define i32 @shady_range(i32 %argc, i8**
>    ret i32 9
>  }
>
> +; In this case 'itar1' and 'itar2' can't be overlapped if we treat
> +; lifetime.start as the beginning of the lifetime, but we can
> +; overlap if we consider first use of the slot as lifetime
> +; start. See llvm bug 25776.
> +
> +;CHECK-LABEL: ifthen_twoslots:
> +;YESCOLOR: subq  $536, %rsp
> +;NOCOLOR: subq  $1048, %rsp
> +
> +define i32 @ifthen_twoslots(i32 %x) #0 {
> +entry:
> +  %retval = alloca i32, align 4
> +  %x.addr = alloca i32, align 4
> +  %itar1 = alloca [128 x i32], align 16
> +  %itar2 = alloca [128 x i32], align 16
> +  %cleanup.dest.slot = alloca i32
> +  store i32 %x, i32* %x.addr, align 4
> +  %itar1_start_8 = bitcast [128 x i32]* %itar1 to i8*
> +  call void @llvm.lifetime.start(i64 512, i8* %itar1_start_8) #3
> +  %itar2_start_8 = bitcast [128 x i32]* %itar2 to i8*
> +  call void @llvm.lifetime.start(i64 512, i8* %itar2_start_8) #3
> +  %xval = load i32, i32* %x.addr, align 4
> +  %and = and i32 %xval, 1
> +  %tobool = icmp ne i32 %and, 0
> +  br i1 %tobool, label %if.then, label %if.else
> +
> +if.then:                                          ; preds = %entry
> +  %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %itar1,
> i32 0, i32 0
> +  call void @inita(i32* %arraydecay)
> +  store i32 1, i32* %retval, align 4
> +  store i32 1, i32* %cleanup.dest.slot, align 4
> +  %itar2_end_8 = bitcast [128 x i32]* %itar2 to i8*
> +  call void @llvm.lifetime.end(i64 512, i8* %itar2_end_8) #3
> +  %itar1_end_8 = bitcast [128 x i32]* %itar1 to i8*
> +  call void @llvm.lifetime.end(i64 512, i8* %itar1_end_8) #3
> +  br label %cleanup
> +
> +if.else:                                          ; preds = %entry
> +  %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %itar2,
> i32 0, i32 0
> +  call void @inita(i32* %arraydecay1)
> +  store i32 0, i32* %retval, align 4
> +  store i32 1, i32* %cleanup.dest.slot, align 4
> +  %itar2_end2_8 = bitcast [128 x i32]* %itar2 to i8*
> +  call void @llvm.lifetime.end(i64 512, i8* %itar2_end2_8) #3
> +  %itar1_end2_8 = bitcast [128 x i32]* %itar1 to i8*
> +  call void @llvm.lifetime.end(i64 512, i8* %itar1_end2_8) #3
> +  br label %cleanup
> +
> +cleanup:                                          ; preds = %if.else,
> %if.then
> +  %final_retval = load i32,
> + i32* %retval, align 4
> +  ret i32 %final_retval
> +}
> +
> +; This function is intended to test the case where you
> +; have a reference to a stack slot that lies outside of
> +; the START/END lifetime markers-- the flow analysis
> +; should catch this and build the lifetime based on the
> +; markers only.
> +
> +;CHECK-LABEL: while_loop:
> +;YESCOLOR: subq  $1032, %rsp
> +;NOCOLOR: subq  $1544, %rsp
> +
> +define i32 @while_loop(i32 %x) #0 {
> +entry:
> +  %b1 = alloca [128 x i32], align 16
> +  %b2 = alloca [128 x i32], align 16
> +  %b3 = alloca [128 x i32], align 16
> +  %tmp = bitcast [128 x i32]* %b1 to i8*
> +  call void @llvm.lifetime.start(i64 512, i8* %tmp) #3
> +  %tmp1 = bitcast [128 x i32]* %b2 to i8*
> +  call void @llvm.lifetime.start(i64 512, i8* %tmp1) #3
> +  %and = and i32 %x, 1
> +  %tobool = icmp eq i32 %and, 0
> +  br i1 %tobool, label %if.else, label %if.then
> +
> +if.then:                                          ; preds = %entry
> +  %arraydecay = getelementptr inbounds [128 x i32], [128 x i32]* %b2, i64
> 0, i64 0
> +  call void @inita(i32* %arraydecay) #3
> +  br label %if.end
> +
> +if.else:                                          ; preds = %entry
> +  %arraydecay1 = getelementptr inbounds [128 x i32], [128 x i32]* %b1,
> i64 0, i64 0
> +  call void @inita(i32* %arraydecay1) #3
> +  %arraydecay3 = getelementptr inbounds [128 x i32], [128 x i32]* %b3,
> i64 0, i64 0
> +  call void @inita(i32* %arraydecay3) #3
> +  %tobool25 = icmp eq i32 %x, 0
> +  br i1 %tobool25, label %if.end, label %while.body.lr.ph
> +
> +while.body.lr.ph:                                 ; preds = %if.else
> +  %tmp2 = bitcast [128 x i32]* %b3 to i8*
> +  br label %while.body
> +
> +while.body:                                       ; preds = %
> while.body.lr.ph, %while.body
> +  %x.addr.06 = phi i32 [ %x, %while.body.lr.ph ], [ %dec, %while.body ]
> +  %dec = add nsw i32 %x.addr.06, -1
> +  call void @llvm.lifetime.start(i64 512, i8* %tmp2) #3
> +  call void @inita(i32* %arraydecay3) #3
> +  call void @llvm.lifetime.end(i64 512, i8* %tmp2) #3
> +  %tobool2 = icmp eq i32 %dec, 0
> +  br i1 %tobool2, label %if.end.loopexit, label %while.body
> +
> +if.end.loopexit:                                  ; preds = %while.body
> +  br label %if.end
> +
> +if.end:                                           ; preds =
> %if.end.loopexit, %if.else, %if.then
> +  call void @llvm.lifetime.end(i64 512, i8* %tmp1) #3
> +  call void @llvm.lifetime.end(i64 512, i8* %tmp) #3
> +  ret i32 0
> +}
> +
> +declare void @inita(i32*) #2
> +
>  declare void @bar([100 x i32]* , [100 x i32]*) nounwind
>
>  declare void @llvm.lifetime.start(i64, i8* nocapture) nounwind
>
> Modified: llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll?rev=270559&r1=270558&r2=270559&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/misched-aa-colored.ll Tue May 24 08:23:44
> 2016
> @@ -155,6 +155,7 @@ entry:
>    %ref.tmp.i = alloca
> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199", align 8
>    %Op.i = alloca %"class.llvm::SDValue.3.603.963.1923.2043.2283.4083",
> align 8
>    %0 = bitcast %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"*
> %ref.tmp.i to i8*
> +  call void @llvm.lifetime.start(i64 24, i8* %0) #1
>    %retval.sroa.0.0.idx.i36 = getelementptr inbounds
> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199",
> %"struct.std::pair.112.119.719.1079.2039.2159.2399.4199"* %ref.tmp.i, i64
> 0, i32 1, i32 0, i32 0
>    %retval.sroa.0.0.copyload.i37 = load i32, i32*
> %retval.sroa.0.0.idx.i36, align 8
>    call void @llvm.lifetime.end(i64 24, i8* %0) #1
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>



-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160525/48416454/attachment-0001.html>


More information about the llvm-commits mailing list