[PATCH] D31583: StackColoring: smarter check for slot overlap

Mon Apr 17 06:40:58 PDT 2017

dotdash added a comment.

Since I had a hand in changing the logic for the live interval calculation, I
feel like I should take some time to explain the approach that was taken here.

Originally, the stack coloring pass collected the lifetime markers and assumed
that there are at most two such markers per MBB: either in the order start ->
end, creating a single segment in the live interval, or in the order end ->
start, creating two segments in the live interval, with the slot being dead in
the middle of the MBB.

Later the code was adjusted to handle more than two markers per block. Because
the collected markers were in a random order, instead of forming multiple
segments only a single segment was created and extended as needed.
Unfortunately this actually broke the logic for handling two markers in the
order end -> start. This is because the LiveIn and LiveOut bits would then also
be set, and the segment gets extended to cover the whole MBB.

By now, we're iterating over the instructions in the MBB in order anyway, so we
can actually handle multiple start/end markers properly and create multiple
segments in the live interval even within a single block.

The semantics are as follows:

A slot for which there are no lifetime markers is always live.

For slots that have lifetime markers:

In the entry MBB, the slot starts out as dead. In other MBBs, the slot starts
out as live if the dataflow analysis determined it to be live coming into this
block, otherwise it starts as dead.

Iterating over the instructions in the MBB in sequential order:

A dead slot becomes live when it encounters a START (which could be a use), and
stays live until it encounters an END. Any START encountered while a slot is
live has no effect. This is necessary because the START could be a plain use
rather than a lifetime_start, and thus must not shorten the live interval.

A live slot becomes dead when it encounters an END. At this point, the slot's
live interval gets a new segment that starts at the index where the slot became
live, and ends at the index where the END was encountered. Any END encountered
while a slot is dead has no effect.
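
To make the rules above concrete, here is a rough sketch in Python of the
per-block walk for a single slot. It is an illustration only, not the code from
the pass: the function name, the (index, kind) tuples and the block_start /
block_end parameters are made up for this example, and closing a still-live
segment at the end of the block is an assumption on my part.

```python
from typing import List, Optional, Tuple

START, END = "START", "END"


def block_segments(
    markers: List[Tuple[int, str]],
    live_in: bool,
    block_start: int,
    block_end: int,
) -> Tuple[List[Tuple[int, int]], bool]:
    """Walk one slot's markers in instruction order within one block.

    Returns the (begin, end) segments created in this block and whether
    the slot is still live when the block ends.
    """
    segments: List[Tuple[int, int]] = []
    # A slot that is live coming into the block is live from the block start.
    live_since: Optional[int] = block_start if live_in else None

    for index, kind in markers:
        if kind == START:
            # A START on an already-live slot has no effect.
            if live_since is None:
                live_since = index
        elif kind == END:
            # An END on a dead slot has no effect.
            if live_since is not None:
                segments.append((live_since, index))
                live_since = None

    # Assumption: a slot that is still live extends to the end of the block.
    if live_since is not None:
        segments.append((live_since, block_end))

    return segments, live_since is not None
```

With this, an end -> start pair in a block where the slot is live-in yields two
segments with a dead gap in the middle, which is the case the older code lost.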

Building upon that, the improvement this patch was initially motivated by is
described below.

Because a lifetime end on a dead slot has no effect, a frontend may choose to
combine certain code paths to produce fewer BBs. Consider this pseudocode:

  A = alloca TYPE
  B = alloca TYPE
  C = alloca TYPE_WITH_DTOR

  main:
    LT_START(C)

    if COND {
      LT_START(A)
      INVOKE func UNWIND cleanup_A
      LT_END(A)
    } else {
      LT_START(B)
      INVOKE func UNWIND cleanup_B
      LT_END(B)
    }

    DTOR(C)
    LT_END(C)

    RETURN

  cleanup_A:
    LP
    LT_END(A)
    br cleanup;

  cleanup_B:
    LP
    LT_END(B)
    br cleanup;

  cleanup:
    DTOR(C)
    LT_END(C)
    RESUME

Here we need two distinct cleanup paths and landing pads just to ensure that
the old stack coloring code can merge slots A and B. But assuming that a
lifetime end on a dead slot is a no-op, we could also write:

  A = alloca TYPE
  B = alloca TYPE
  C = alloca TYPE_WITH_DTOR

  main:
    LT_START(C)

    if COND {
      LT_START(A)
      INVOKE func UNWIND cleanup
      LT_END(A)
    } else {
      LT_START(B)
      INVOKE func UNWIND cleanup
      LT_END(B)
    }

    DTOR(C)
    LT_END(C)

    RETURN

  cleanup:
    LP
    LT_END(A)
    LT_END(B)
    DTOR(C)
    LT_END(C)
    RESUME

Now the problem is that both A and B are only "possibly live" coming into
"cleanup", meaning that a slot is live on one but not necessarily all incoming
edges. Using a plain overlap check on the live intervals created using this
information causes false positives, stopping the slots from being merged.

To solve this, we can use an alternative approach to check whether two slots
are live at the same time. For two slots to be live at the same time, one of
them needs to become live when the other is live as well. We can check this by
keeping track of the points at which a slot becomes live. As these points tell
us that a slot is "definitely" live, we get more accurate results.
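
As a hedged sketch of the idea (again in Python, and not the patch's actual
data structures): each slot is represented by its list of segments plus the
list of points at which it becomes live. The half-open (begin, end) segments,
the helper names and the concrete indices are assumptions made up for this
illustration, loosely modelled on the merged-cleanup example above.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open [begin, end), an assumption of this sketch


def intervals_overlap(segs_a: List[Segment], segs_b: List[Segment]) -> bool:
    """Plain overlap check: do any two segments share an index?"""
    return any(a0 < b1 and b0 < a1 for a0, a1 in segs_a for b0, b1 in segs_b)


def is_live_at(segs: List[Segment], point: int) -> bool:
    """True if the point falls inside one of the segments."""
    return any(begin <= point < end for begin, end in segs)


def slots_conflict(
    segs_a: List[Segment], starts_a: List[int],
    segs_b: List[Segment], starts_b: List[int],
) -> bool:
    """Two slots conflict only if one becomes live while the other is live."""
    return (any(is_live_at(segs_b, p) for p in starts_a)
            or any(is_live_at(segs_a, p) for p in starts_b))


# Hypothetical indices: A is live in the "then" branch, B in the "else"
# branch, and both are "possibly live" at the top of the shared cleanup block.
a_segs, a_starts = [(10, 14), (40, 41)], [10]
b_segs, b_starts = [(20, 24), (40, 42)], [20]

plain_result = intervals_overlap(a_segs, b_segs)
start_based_result = slots_conflict(a_segs, a_starts, b_segs, b_starts)
```

Here the plain check reports an overlap because of the segments in the cleanup
block, while the start-based check does not, since neither slot becomes live at
a point where the other one is live.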

I hope this explains the approach we've taken well enough.

https://reviews.llvm.org/D31583