[llvm-commits] Tuning LLVM Greedy Register Allocator to optimize for code size when targeting ARM Thumb 2 instruction set

Thu Feb 2 16:50:29 PST 2012

On Jan 31, 2012, at 14:39, Zino Benaissa <zinob at codeaurora.org> wrote:
> 
> The policy you described was designed for register coalescing.

That's only part of it. Spill weights play an important role. In fact, this is the only place spill weights are used. 

> Initially the
> priority is given to the hint over spill weight but then the eviction policy
> ensures hotter candidates get a register.  I am aware that my
> "luxury" eviction (interesting nomenclature :-)) is overriding the eviction
> policy. The reason is that it was designed differently: The way to look
> at it is as a register trade (win-win) instead of an eviction (win-lose). It
> works in two steps: 
> 
> First, Candidate gets a register (Note here my heuristic is silenced and
> policy is enforced). 
> 
> Second, if it gets a register and it happens to be a CostPerUse register
> (R8-R15), then try to trade this register with some other candidate's
> register (in this case, does it matter if this candidate has higher weight?
> Answer is no!)

It's not that simple. Live ranges are not interchangeable like that. Each has a unique shape.

I am sure you have seen this trade work many times, but there will be cases where the evicted virtual register doesn't fit in the hole you are leaving.
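
To make the objection concrete, here is a minimal sketch of the legality check a register trade would need. It is a toy model, not RAGreedy code: Segment, LiveRange, RegUsers, overlaps, fitsIn and canTrade are all hypothetical names invented for illustration, and a live range is assumed to be just a list of half-open instruction-slot intervals.

```cpp
#include <cassert>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not LLVM's
// LiveInterval / LiveIntervalUnion classes.

// Half-open range [start, end) of instruction slots where a value is live.
struct Segment {
  unsigned start;
  unsigned end;
};

// A live range is a set of segments; holes between segments are allowed.
using LiveRange = std::vector<Segment>;

// All live ranges currently assigned to one physical register.
using RegUsers = std::vector<LiveRange>;

// True if the two live ranges are live at the same slot anywhere.
bool overlaps(const LiveRange &a, const LiveRange &b) {
  for (const Segment &x : a) {
    assert(x.start < x.end && "segments must be non-empty");
    for (const Segment &y : b)
      if (x.start < y.end && y.start < x.end)
        return true;
  }
  return false;
}

// True if lr can share a physical register with everything in users.
bool fitsIn(const LiveRange &lr, const RegUsers &users) {
  for (const LiveRange &other : users)
    if (overlaps(lr, other))
      return false;
  return true;
}

// The candidate sits in an expensive register, the victim in a cheap one.
// othersInHigh: users of the expensive register, excluding the candidate.
// othersInLow:  users of the cheap register, excluding the victim.
// The swap is legal only if both live ranges fit in their new homes.
bool canTrade(const LiveRange &candidate, const RegUsers &othersInHigh,
              const LiveRange &victim, const RegUsers &othersInLow) {
  return fitsIn(candidate, othersInLow) && fitsIn(victim, othersInHigh);
}
```

In this model, a candidate live over [10, 20) can share the expensive register with another range live over [22, 40). A victim live over [0, 30) cannot take the candidate's place there, because it overlaps [22, 40), so the trade is rejected even though the candidate itself would fit in the cheap register. A victim live over [5, 21) would fit, and the trade would go through.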

>> You should also make sure that the patch works for x86-64. There is a
>> similar code size penalty to using r8-r15 and xmm8-15.
> 
> Before submitting, I did the testing due diligence for x86-64 with the
> patch I am submitting (FYI, the test results are attached).
> Yes, x86-64 could benefit from this framework to minimize REX prefix and
> optimize usage of old x86 registers. Unfortunately, I don't have the
> bandwidth to do the implementation. Of course, I will be happy to hear about
> and/or review the enabling work for X86-64.

When changing the core heuristic of the register allocator, due diligence is a week of benchmarking, not running 'make check'.

I can't accept the patch as is; it's not done yet.

You are grafting a new heuristic onto the existing eviction policy. That is the kind of thing you do to run some experiments and try out new stuff. Clearly you are on to something here, but we still have work to do.

- We need to understand why this works. I don't understand it yet, and I am not sure you do either ;-) The eviction code is pretty hairy, and clarity is important. We can't commit code that improves some benchmarks when nobody understands why. How would you maintain it?

- We need to properly integrate this new heuristic with the existing stuff. An alternative target-dependent heuristic is not good enough for the target-independent code generator. That is extra important in this part of RAGreedy, which is already hard to follow. Part of this work is including x86-64 support.

- We need to benchmark and analyze regressions on x86 and ARM.

You are changing a very critical part of the greedy register allocator here. It is not something you do lightly.

I hope you have time to work with me on this.

/jakob