[llvm-commits] [PATCH] Stack Coloring optimization

Chandler Carruth chandlerc at google.com
Tue Sep 4 13:41:23 PDT 2012


Hey Nadav, super excited about this.

Recently, we tried to roll out some significant improvements to GCC's stack
slot re-use, and discovered a really terrifying number of bugs in user code
stemming from this. I would like to make sure that when using tools like
ASan, this logic is disabled, and ASan can correctly detect code using an
alloca after its lifetime has ended. Without this, it will be very hard to
track down mis-compiles introduced here. Adding Kostya as I'm not familiar
with the exact current state of ASan and lifetime of alloca instrumentation.


Also, some code comments inline:

Index: test/CodeGen/X86/StackColoring.ll
===================================================================
--- test/CodeGen/X86/StackColoring.ll (revision 0)
+++ test/CodeGen/X86/StackColoring.ll (revision 0)
@@ -0,0 +1,132 @@
+; RUN: llc < %s                    | FileCheck %s --check-prefix=COLOR
+; RUN: llc < %s -no-stack-coloring | FileCheck %s --check-prefix=NOCOLOR
+
+target datalayout =
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-apple-macosx10.8.0"
+
+;COLOR: subq  $136, %rsp
+;COLOR: subq  $144, %rsp
+;COLOR: subq  $208, %rsp
+;COLOR: subq  $112, %rsp
+
+;NOCOLOR: subq  $264, %rsp
+;NOCOLOR: subq  $272, %rsp
+;NOCOLOR: subq  $400, %rsp
+;NOCOLOR: subq  $400, %rsp

Can you put the filecheck assertions inside the functions in question, and
add anchoring assertions for the function name? That makes the test easier
for me to read, debug, and update.

Index: include/llvm/CodeGen/ISDOpcodes.h
===================================================================
--- include/llvm/CodeGen/ISDOpcodes.h (revision 162761)
+++ include/llvm/CodeGen/ISDOpcodes.h (working copy)
@@ -637,6 +637,10 @@
     ATOMIC_LOAD_UMIN,
     ATOMIC_LOAD_UMAX,

+    /// This corresponds to the LIFETIME intrinsics. The first operand

I would write this ".. to the @llvm.lifetime intrinsics. ..." or otherwise
differ from the style of the ISD node enumerators...

+    /// is the chain and the second operand is the alloca pointer.
+    LIFETIME_START, LIFETIME_END,
+


Index: lib/CodeGen/StackColoring.cpp
===================================================================
--- lib/CodeGen/StackColoring.cpp (revision 0)
+++ lib/CodeGen/StackColoring.cpp (revision 0)
@@ -0,0 +1,593 @@
+//===-- StackColoring.cpp --------------------------------------------
----===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass implements the stack-coloring optimization that looks for
+// lifetime markers machine instructions (LIFESTART_BEGIN and
LIFESTART_END),
+// which represent the possible lifetime of stack slots. It attempts to
+// merge disjoint stack slots and reduce the used stack space.
+// NOTE: This pass is not StackSlotColoring, which optimizes spill slots.

Can you add some notes here about the future work? Specifically as it
relates to the other pass.

Also, would it be possible to rename one or the other pass to make the
differences between them more obvious to the reader, and not require this
note? (clearly as a follow-up or preparatory commit)

+namespace {
+class StackColoring : public MachineFunctionPass {

Doxygen comments here and throughout please.

+  MachineFrameInfo *MFI;
+  MachineFunction *MF;
+
+  typedef DenseMap<MachineBasicBlock*, BitVector> SlotsPerBlock;
+
+  // Holds a list of active slots per basic block.
+  SlotsPerBlock BEGIN;
+  SlotsPerBlock END;
+  SlotsPerBlock LIVE_IN;
+  SlotsPerBlock LIVE_OUT;

These should definitely have very explanatory comments. Also, as this is
new code, please follow the coding convention for naming and make these
CamelCase.

I'm surprised at having four separate dense maps here. What's the rationale
behind that? My (admittedly naive, so your explanation may be all that I
need) base line assumption would be that this would either call for a
single map from the block pointer to a struct of four BitVectors, or that a
better solution would be to put both a flag (indicating begin, end,
live_in, and live_out) and the block pointer in the key and have a single
map. The former seems most appropriate if the expectation is that
essentially every block will have all four of these, and that they are all
four accessed equally frequently. The latter would seem best if often not
all four of these are present for any particular block. What you have only
really makes sense to me if we always have all four, but one of these is
accessed *much* more often than the others, and thus we want greater
locality for that densemap? Anyways, a comment about this would help a lot
I suspect, regardless of the final representation.

Also, how large are these bitvectors? I worry a bit about how happily
DenseMap will copy the values.

+
+  // Mapping between basic blocks and a serial number.
+  DenseMap<MachineBasicBlock*, int> BasicBlocks;
+  SmallVector<MachineBasicBlock*, 8> BasicBlockNumbering;
+
+  // The intervals to describe the active ranges for each slot.
+  SmallVector<LiveInterval*, 16> Intervals;
+  VNInfo::Allocator VNInfoAllocator;
+  SlotIndexes* Indexes;
+
+  // The list of lifetime markers found.
+  SmallVector<MachineInstr*, 8> Markers;
+
+  struct SlotSizeSorter {

It's not at all clear to me how this sorter is used... doxygen comments
will likely help here.

+    MachineFrameInfo *MFI;
+    SlotSizeSorter(MachineFrameInfo *M) : MFI(M) { }

Prevailing style uses the exact same name as the member for the
initializing parameter.

+    bool operator()(int LHS, int RHS) {
+      if (LHS == -1) return 0;
+      if (RHS == -1) return 1;

0 and 1? How about false and true?

Is this still a SWO? Should these just be asserts?

+      size_t LSize = MFI->getObjectSize(LHS);
+      size_t RSize = MFI->getObjectSize(RHS);
+      return LSize > RSize;

Why the variables?

+  }
+};

The indent for these two curlies is wrong.

+private:
+  /// Debug.
+  void dump();
+
+  /// Removes all of the lifetime marker instructions from the function.

How do you feel about putting all of the doxygen comments at the
implementation site since this isn't part of the interface, and just a
private helper function?

+  bool RemoveAllMarkers();

Please use the new convention (camelCase) for method names throughout.

+
+  /// Scan the machine function and find all of the lifetime markers.
+  /// Record the findings in the BEGIN and END vectors.

We don't have BEGIN and END vectors, we have dense maps of bitvectors?
Maybe this just needs more clarification in the datastructure represented
by those vectors.

+  /// returns the number of markers found.

Use doxygen \returns?

+  unsigned CollectMarkers(unsigned NumSlot);
+
+  /// Perform the dataflow calculation and calculate the lifetime for each
of
+  /// the slots, based on the BEGIN/END vectors. Set the LIVE_IN and
LIVE_OUT
+  /// vectors that represent which stack slots are live coming in and out
+  // blocks.
+  void CalculateLocalLiveness();
+
+  /// Construct the LiveIntervals for the slots.
+  void CalculateLiveIntervals(unsigned NumSlots);
+
+  /// Go over the machine function and change instructions which use stack
+  /// slots to use the joint slots.
+  void RemapInstructions(DenseMap<int, int> &SlotRemap);
+
+  /// Map entries which point to other entries to their destination.
+  ///   A->B->C becomes A->C.
+   void ExpungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
+};
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120904/7631537e/attachment.html>


More information about the llvm-commits mailing list