[llvm-commits] [PATCH] Stack Coloring optimization

Chandler Carruth chandlerc at google.com
Tue Sep 4 14:12:47 PDT 2012


On Tue, Sep 4, 2012 at 4:41 PM, Chandler Carruth <chandlerc at google.com>wrote:

> Hey Nadav, super excited about this.
>
> Recently, we tried to roll out some significant improvements to GCC's
> stack slot re-use, and discovered a really terrifying number of bugs in
> user code stemming from this. I would like to make sure that when using
> tools like ASan, this logic is disabled, and ASan can correctly detect code
> using an alloca after its lifetime has ended. Without this, it will be very
> hard to track down mis-compiles introduced here. Adding Kostya as I'm not
> familiar with the exact current state of ASan and lifetime of alloca
> instrumentation.
>

Having read the discussion more carefully when we hit lots of user bugs
surrounding this, I have more information.

The problem stems from re-use of locals after their lifetime ends. These
won't show up often until we teach Clang to emit the lifetime intrinsics
for those locals. Once we do, your pass will cause a very large amount of
code to blow up. It's a good thing too, as that code is *broken*.

We need to teach ASan to use the lifetime markers to catch
use-after-lifetime-ends bugs first IMO, and *then* enable this pass and the
lifetime markers coming from Clang. That way we will already have the tools
to find these bugs quickly and accurately.

We probably will also need to disable this particular optimization for
functions that have the ASan attribute set.


>
>
> Also, some code comments inline:
>
> Index: test/CodeGen/X86/StackColoring.ll
> ===================================================================
> --- test/CodeGen/X86/StackColoring.ll (revision 0)
> +++ test/CodeGen/X86/StackColoring.ll (revision 0)
> @@ -0,0 +1,132 @@
> +; RUN: llc < %s                    | FileCheck %s --check-prefix=COLOR
> +; RUN: llc < %s -no-stack-coloring | FileCheck %s --check-prefix=NOCOLOR
> +
> +target datalayout =
> "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
> +target triple = "x86_64-apple-macosx10.8.0"
> +
> +;COLOR: subq  $136, %rsp
> +;COLOR: subq  $144, %rsp
> +;COLOR: subq  $208, %rsp
> +;COLOR: subq  $112, %rsp
> +
> +;NOCOLOR: subq  $264, %rsp
> +;NOCOLOR: subq  $272, %rsp
> +;NOCOLOR: subq  $400, %rsp
> +;NOCOLOR: subq  $400, %rsp
>
> Can you put the filecheck assertions inside the functions in question, and
> add anchoring assertions for the function name? That makes the test easier
> for me to read, debug, and update.
>
> Index: include/llvm/CodeGen/ISDOpcodes.h
> ===================================================================
> --- include/llvm/CodeGen/ISDOpcodes.h (revision 162761)
> +++ include/llvm/CodeGen/ISDOpcodes.h (working copy)
> @@ -637,6 +637,10 @@
>      ATOMIC_LOAD_UMIN,
>      ATOMIC_LOAD_UMAX,
>
> +    /// This corresponds to the LIFETIME intrinsics. The first operand
>
> I would write this ".. to the @llvm.lifetime intrinsics. ..." or otherwise
> differ from the style of the ISD node enumerators...
>
> +    /// is the chain and the second operand is the alloca pointer.
> +    LIFETIME_START, LIFETIME_END,
> +
>
>
> Index: lib/CodeGen/StackColoring.cpp
> ===================================================================
> --- lib/CodeGen/StackColoring.cpp (revision 0)
> +++ lib/CodeGen/StackColoring.cpp (revision 0)
> @@ -0,0 +1,593 @@
> +//===-- StackColoring.cpp --------------------------------------------
> ----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This pass implements the stack-coloring optimization that looks for
> +// lifetime markers machine instructions (LIFESTART_BEGIN and
> LIFESTART_END),
> +// which represent the possible lifetime of stack slots. It attempts to
> +// merge disjoint stack slots and reduce the used stack space.
> +// NOTE: This pass is not StackSlotColoring, which optimizes spill slots.
>
> Can you add some notes here about the future work? Specifically as it
> relates to the other pass.
>
> Also, would it be possible to rename one or the other pass to make the
> differences between them more obvious to the reader, and not require this
> note? (clearly as a follow-up or preparatory commit)
>
> +namespace {
> +class StackColoring : public MachineFunctionPass {
>
> Doxygen comments here and throughout please.
>
> +  MachineFrameInfo *MFI;
> +  MachineFunction *MF;
> +
> +  typedef DenseMap<MachineBasicBlock*, BitVector> SlotsPerBlock;
> +
> +  // Holds a list of active slots per basic block.
> +  SlotsPerBlock BEGIN;
> +  SlotsPerBlock END;
> +  SlotsPerBlock LIVE_IN;
> +  SlotsPerBlock LIVE_OUT;
>
> These should definitely have very explanatory comments. Also, as this is
> new code, please follow the coding convention for naming and make these
> CamelCase.
>
> I'm surprised at having four separate dense maps here. What's the
> rationale behind that? My (admittedly naive, so your explanation may be all
> that I need) base line assumption would be that this would either call for
> a single map from the block pointer to a struct of four BitVectors, or that
> a better solution would be to put both a flag (indicating begin, end,
> live_in, and live_out) and the block pointer in the key and have a single
> map. The former seems most appropriate if the expectation is that
> essentially every block will have all four of these, and that they are all
> four accessed equally frequently. The latter would seem best if often not
> all four of these are present for any particular block. What you have only
> really makes sense to me if we always have all four, but one of these is
> accessed *much* more often than the others, and thus we want greater
> locality for that densemap? Anyways, a comment about this would help a lot
> I suspect, regardless of the final representation.
>
> Also, how large are these bitvectors? I worry a bit about how happily
> DenseMap will copy the values.
>
> +
> +  // Mapping between basic blocks and a serial number.
> +  DenseMap<MachineBasicBlock*, int> BasicBlocks;
> +  SmallVector<MachineBasicBlock*, 8> BasicBlockNumbering;
> +
> +  // The intervals to describe the active ranges for each slot.
> +  SmallVector<LiveInterval*, 16> Intervals;
> +  VNInfo::Allocator VNInfoAllocator;
> +  SlotIndexes* Indexes;
> +
> +  // The list of lifetime markers found.
> +  SmallVector<MachineInstr*, 8> Markers;
> +
> +  struct SlotSizeSorter {
>
> It's not at all clear to me how this sorter is used... doxygen comments
> will likely help here.
>
> +    MachineFrameInfo *MFI;
> +    SlotSizeSorter(MachineFrameInfo *M) : MFI(M) { }
>
> Prevailing style uses the exact same name as the member for the
> initializing parameter.
>
> +    bool operator()(int LHS, int RHS) {
> +      if (LHS == -1) return 0;
> +      if (RHS == -1) return 1;
>
> 0 and 1? How about false and true?
>
> Is this still a SWO? Should these just be asserts?
>
> +      size_t LSize = MFI->getObjectSize(LHS);
> +      size_t RSize = MFI->getObjectSize(RHS);
> +      return LSize > RSize;
>
> Why the variables?
>
> +  }
> +};
>
> The indent for these two curlies is wrong.
>
> +private:
> +  /// Debug.
> +  void dump();
> +
> +  /// Removes all of the lifetime marker instructions from the function.
>
> How do you feel about putting all of the doxygen comments at the
> implementation site since this isn't part of the interface, and just a
> private helper function?
>
> +  bool RemoveAllMarkers();
>
> Please use the new convention (camelCase) for method names throughout.
>
> +
> +  /// Scan the machine function and find all of the lifetime markers.
> +  /// Record the findings in the BEGIN and END vectors.
>
> We don't have BEGIN and END vectors, we have dense maps of bitvectors?
> Maybe this just needs more clarification in the datastructure represented
> by those vectors.
>
> +  /// returns the number of markers found.
>
> Use doxygen \returns?
>
> +  unsigned CollectMarkers(unsigned NumSlot);
> +
> +  /// Perform the dataflow calculation and calculate the lifetime for
> each of
> +  /// the slots, based on the BEGIN/END vectors. Set the LIVE_IN and
> LIVE_OUT
> +  /// vectors that represent which stack slots are live coming in and out
> +  // blocks.
> +  void CalculateLocalLiveness();
> +
> +  /// Construct the LiveIntervals for the slots.
> +  void CalculateLiveIntervals(unsigned NumSlots);
> +
> +  /// Go over the machine function and change instructions which use stack
> +  /// slots to use the joint slots.
> +  void RemapInstructions(DenseMap<int, int> &SlotRemap);
> +
> +  /// Map entries which point to other entries to their destination.
> +  ///   A->B->C becomes A->C.
> +   void ExpungeSlotMap(DenseMap<int, int> &SlotRemap, unsigned NumSlots);
> +};
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120904/07c889a5/attachment.html>


More information about the llvm-commits mailing list