[PATCH] D49353: [RegAlloc] Skip global splitting if the live range is huge and its spill is trivially rematerializable

Wei Mi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 16 15:27:18 PDT 2018


wmi added a comment.

In https://reviews.llvm.org/D49353#1164088, @MatzeB wrote:

> In https://reviews.llvm.org/D49353#1163951, @wmi wrote:
>
> > In https://reviews.llvm.org/D49353#1163840, @MatzeB wrote:
> >
> > > For the record: The check is based on LiveInterval::size(), which gives you the number of segments. So I assume what is "huge" here is the number of basic blocks?
> >
> >
> > One segment can span multiple basic blocks. I am not sure whether one basic block can theoretically contain multiple segments, but it is uncommon. So empirically, a large number of segments means a large number of basic blocks, which in turn means a large number of edge-bundle nodes and a high cost for the Hopfield neural network algorithm.
>
>
> Yes, that's what I was getting at (just trying to understand the context): the description speaks about a live range being "huge", but having a huge function or a huge basic block does not necessarily trigger this condition. I think that right now, with the connected-component rule in place, we can have no more than two segments of the same virtual register inside a basic block (a live-in value, and a newly created value that may live out); any other situation must have been split into multiple vregs previously. My experience is also that the bad cases for register allocation compile time (or for copy coalescing, for that matter) are automatically generated lexer or parser code with a large number of basic blocks. That usually leads to new value numbers being created at join points, increasing the number of live-range segments...
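As a side note for anyone following along, here is a minimal standalone sketch of what that size() check measures and of the kind of guard the patch title describes. The types, names, and threshold below are toy placeholders for illustration, not the LLVM classes or the actual patch code.

  #include <cstdio>
  #include <vector>

  // Toy stand-ins for illustration only; these are not the LLVM classes.
  struct Segment { unsigned Start, End; };           // half-open [Start, End)

  struct ToyLiveRange {
    std::vector<Segment> Segments;                   // kept sorted and disjoint
    size_t size() const { return Segments.size(); }  // "huge" == many segments
  };

  // Rough paraphrase of the guard being discussed (the name and threshold are
  // placeholders): skip the expensive region split if the interval has very
  // many segments and its spill is trivially rematerializable anyway.
  bool shouldSkipGlobalSplit(const ToyLiveRange &LR, bool TriviallyRemat,
                             size_t HugeSizeThreshold = 1024) {
    return TriviallyRemat && LR.size() > HugeSizeThreshold;
  }

  int main() {
    // A value live across a CFG with many blocks/join points tends to get one
    // segment per liveness gap, so the segment count tracks the block count
    // rather than the instruction count.
    ToyLiveRange LR;
    for (unsigned Block = 0; Block < 5000; ++Block)
      LR.Segments.push_back({Block * 10, Block * 10 + 4});
    std::printf("segments=%zu skip=%d\n", LR.size(),
                shouldSkipGlobalSplit(LR, /*TriviallyRemat=*/true));
  }

The point is only that size() counts segments, so a threshold on it effectively tracks how many liveness gaps (and hence blocks and edge bundles) the value crosses, not the raw size of the function.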


Exactly as you are saying, copy coalescing is another problem. The patch here only solves the problem partially; we are still facing the compile-time problem in copy coalescing. A trivially rematerializable def instruction with thousands of uses is hoisted out of a loop by MachineLICM. Copy coalescing then rematerializes that instruction thousands of times, once per use, and the live-interval update performed for each rematerialization is very costly because the live interval is so large.
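To make that cost pattern concrete, below is a toy sketch of why one rematerialization per use, where each update has to touch an interval with many segments, ends up costing roughly uses-times-segments work. All names and numbers are invented for illustration; this is not the coalescer's actual update code.

  #include <algorithm>
  #include <cstdio>
  #include <vector>

  struct Segment { unsigned Start, End; };  // half-open [Start, End)

  // Stand-in for the per-rematerialization live-interval update: insert a tiny
  // segment for the new def right before its use, keeping the list sorted.
  // With a sorted-vector representation this is linear in the segment count.
  void addRematSegment(std::vector<Segment> &Segs, unsigned UsePoint) {
    Segment S{UsePoint - 1, UsePoint};
    auto It = std::lower_bound(Segs.begin(), Segs.end(), S,
                               [](const Segment &A, const Segment &B) {
                                 return A.Start < B.Start;
                               });
    Segs.insert(It, S);  // shifts all later segments: O(#segments) per use
  }

  int main() {
    std::vector<Segment> LiveInterval;
    const unsigned NumUses = 5000;            // "thousands of uses"
    for (unsigned I = NumUses; I-- > 0;)      // one rematerialization per use;
      addRematSegment(LiveInterval, 10 * I + 5); // each insert shifts the rest
    // Total work grows roughly with NumUses * #segments, which is what makes
    // the coalescing-time rematerialization described above so expensive.
    std::printf("segments after remat: %zu\n", LiveInterval.size());
  }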

I wonder how much extra benefit we actually get from MachineLICM hoisting so many trivially rematerializable instructions out of a big loop. I believe most loop-invariant loads, stores, and computations should already have been hoisted during the LICM phase, so hoisting these rematerializable instructions shouldn't enable many additional loop-invariant loads, stores, or computations. But the extra compile-time cost may be significant.

> Anyway so far this change looks fine to me, but I'm still waiting for our internal systems to come back with numbers on this change (will probably be 1 or 2 more days until everything has cycled through).

Repository:
  rL LLVM

https://reviews.llvm.org/D49353
