[LLVMdev] Regalloc Refactoring

Mon Apr 16 15:43:00 PDT 2007

On Mon, 16 Apr 2007, David Greene wrote:
>> Yep, this is the one I was thinking of.  It is available online here:
>> http://www.cs.rice.edu/~keith/LACSI/pldi02.pdf
>
> I was just looking at this today.  One thing that strikes me about
> all these papers I've read on the topic is that no one seems to
> consider the interaction of coalescing with spilling.  By definition
> coalescing increases lifetimes and can cause more interferences.
> Yet the papers all try to coalesce as many copies as possible.

Funny you should mention it, we've been (okay, Evan has been ;-) fighting 
this issue lately.

> On, say, the x86 machines, the cost of a spill is really not much
> different from the cost of a copy so it's probably close to a wash in
> the end.  But there are many machines where the cost of each operation
> is vastly different.  Aggressive coalescing is not always good.  Often
> you'd rather copy than spill.

Even on X86, load are more expensive than copies (At least on sane 
implementations).

> Is anyone aware of publications addressing the interplay among
> coalescing, live range splitting, register allocation and spilling?
>
> This is one of the reasons I want to separate all of these concerns.
> We shouldn't force developers to always coalesce or always use some
> generic measure of spill cost.

In my mind (a dangerous place to go), it's not about coallescing vs 
spilling, it's about splitting vs spilling.

In particular, coallescing does increase live ranges of values, but you 
can have the same increase from code written without copies.  Basically, 
not doing aggressive copy coallescing means that you are dependent on the 
user telling you where live ranges should be split.  In an aggressive 
compiler, the user doesn't actually have control, as most of the explicit 
copies are gone by the time regalloc happens.  This means that you have an 
arbitrary set of copies that come from things like 2-addr elim, phi elim, 
extensions that don't generate code, etc.

Instead of doing this.  I think the right design is to:

1. Coallesce vregs together as aggressively as possible.  Build huge live 
ranges, delete as many copies as possible.
2. Start priority based register allocation.
3. For each live range without a register available, consider:
    1. live range splitting
    2. remat
    3. spilling

In particular, by doing live range splitting, the *register allocator* can 
control where the copies go, rather than relying on luck that previous 
passes introduce copies in convenient places.

In practice, right now we have no splitting capability.  In addition, we 
coallesce physregs with vregs, which can end up pinning a physreg across 
an entire function even if it has few uses.  This is all very bad, Evan is 
working on making the coallescer more careful about coallescing vregs with 
pregs now.

In the next 6 months or so, I'd like to start investigating a bigger 
refactoring of linear scan, which will hopefully allow us to build in some 
of the smarter techniques like aggressive remat, splitting, etc.  The 
first step to getting that is to make sure the datastructures (e.g. live 
intervals) are sane.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/