[PATCH] D37163: [LICM] sink through non-trivially replicable PHI

Fri Sep 1 09:02:13 PDT 2017

dberlin added a comment.

In https://reviews.llvm.org/D37163#858781, @junbuml wrote:

> Of course, it would be good to avoid unnecessary splitting in the first place if it's not very costly. Let me try to find a reasonable approach for this.

If you need help, let me know.
If you can't find one, awesome, let's go with splitting :)

> 
> 
>> The case where you can't place it somewhere safe should be the case where the edge is critical.
>> Otherwise, there should always be somewhere to place the computation safely and correctly.
> 
> I'm not perfectly clear about your above comment. Did you mean that we cannot safely sink a sinkable instruction through a critical edge? So, in such case we should split the critical edge to sink?

Yes.
Instructions are really placed on edges in most cases.
That is, you are hoisting or sinking it out of a block, and want it to appear first/last in another block.
This is really edge placement.
But most compilers don't allow code on edges, so you actually have to put it in the block.
In the normal, non-critical edge case, this is safe to do. That is, you can place it in the block and still have the instruction execute under exactly the same conditions it did before.

In the critical edge case, you may need to split the edge in order to have a place to put it.

Let me give you a hoisting example, just because people seem to find it easier when I explain it in that direction:

  a      b
  | \  /
  c  d

We have the same computation in b and d.
So you want to do PRE on it.
To eliminate the redundant computation in d, you need to place a computation on the edge between a and d. That is the only way to ensure that the computation is available on all paths to d.

If you could place code on that edge, it would be easy, no muss, no fuss!
You'd be guaranteed it only executes on the a->d path.
LLVM (and most other compilers) can't place code on edges, only in the blocks.

But if you place the computation in block a, it is also computed on the path to c, where d may never be executed.
That's no good: now your computation may execute when it didn't before (think of a load, for example, or any trapping instruction).

This happens because a->d is a critical edge. It's an edge from a block with multiple successors to a block with multiple predecessors.
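
To make that definition concrete, here is a minimal sketch of the check on a toy CFG. The `CFG` alias and the helper names below are made up for illustration and are not LLVM's data structures (LLVM has its own `isCriticalEdge` utility); the example graph is the a/b/c/d diagram above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy CFG for illustration only: block name -> list of successor names.
using CFG = std::map<std::string, std::vector<std::string>>;

// Number of outgoing edges of block B (0 if B is unknown).
std::size_t numSuccessors(const CFG &G, const std::string &B) {
  auto It = G.find(B);
  return It == G.end() ? 0 : It->second.size();
}

// Number of incoming edges of block B, found by scanning every edge.
std::size_t numPredecessors(const CFG &G, const std::string &B) {
  std::size_t N = 0;
  for (const auto &Entry : G)
    for (const auto &Succ : Entry.second)
      if (Succ == B)
        ++N;
  return N;
}

// An edge is critical when its source has multiple successors and its
// destination has multiple predecessors.
bool isCriticalEdge(const CFG &G, const std::string &From,
                    const std::string &To) {
  return numSuccessors(G, From) > 1 && numPredecessors(G, To) > 1;
}

// The diagram from the example: a -> c, a -> d, b -> d.
CFG exampleCFG() {
  return {{"a", {"c", "d"}}, {"b", {"d"}}, {"c", {}}, {"d", {}}};
}
```

On this graph only a->d is critical: a->c ends in a block with a single predecessor, and b->d starts in a block with a single successor, so code for those edges can go at the start of c or the end of b respectively.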

The only safe way to place the computation on this edge is to split this edge so it looks like:

  a
  | \         b
  c   t     /
        \  /
          d

and then place the computation in t.
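
As a rough sketch of what the split does to the graph, again on a toy CFG rather than LLVM's real IR (in LLVM you would use the existing `SplitCriticalEdge` utility, which also has to update PHIs and analyses that this sketch ignores). The `splitEdge` name and the `CFG` alias are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy CFG for illustration only: block name -> list of successor names.
using CFG = std::map<std::string, std::vector<std::string>>;

// Reroute the edge From -> To through a fresh block Mid, giving
// From -> Mid -> To. Mid has exactly one predecessor and one successor,
// so code placed in Mid runs only when control flows From -> To.
void splitEdge(CFG &G, const std::string &From, const std::string &To,
               const std::string &Mid) {
  assert(G.count(Mid) == 0 && "Mid must be a fresh block name");
  auto &Succs = G[From];
  auto It = std::find(Succs.begin(), Succs.end(), To);
  assert(It != Succs.end() && "edge From -> To must exist");
  *It = Mid;     // From now branches to Mid instead of To.
  G[Mid] = {To}; // Mid falls through to the original destination.
}
```

Calling `splitEdge(G, "a", "d", "t")` on the example graph leaves a with successors c and t, gives t the single successor d, and leaves b->d untouched, which matches the picture above.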

Sinking has the same problem. Imagine you want to sink something from a to d: if you place it at the beginning of d, it now also executes on the b->d path, where it may never have executed before. The only way to make it execute just on the a->d path is to split the edge as we did above.
The edges to loop exits are often critical edges, so you may (though not always) have to split them to place computations safely.
You can actually compute which edges are "blocking" your transformation and split them.
(You could also pre-split all critical edges, at some cost. There is a lot of literature on this, and the TL;DR is that you can prove that a number of dataflow problems do not get optimal answers in the presence of critical edges. LLVM has not chosen this path so far, however, because the cost has outweighed the practical benefit.)

https://reviews.llvm.org/D37163