[LLVMdev] Greedy register allocation

Tue May 3 10:06:32 PDT 2011

On May 3, 2011, at 9:19 AM, David A. Greene wrote:

> Jakob Stoklund Olesen <stoklund at 2pi.dk> writes:
> 
>>    +10.0% SingleSource/Benchmarks/CoyoteBench/huffbench
>>    +12.0% SingleSource/Benchmarks/McGill/chomp
>>    +18.0% SingleSource/Benchmarks/BenchmarkGame/n-body
>>    +45.5% SingleSource/Benchmarks/BenchmarkGame/puzzle
>>    +10.0% SingleSource/Benchmarks/Shootout/heapsort
>>    +10.5% MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des
>>    +10.9% SingleSource/Benchmarks/Shootout-C++/heapsort
>>    +11.7% MultiSource/Benchmarks/Ptrdist/bc/bc
>>    +12.0% MultiSource/Benchmarks/McCat/17-bintr/bintr
>>    +55.2% SingleSource/Benchmarks/Shootout/methcall
> 
> Yikes!  Do we know why these codes got so much worse?  Even 5% is a big
> deal on x86.

On x86-64, n-body and puzzle have the exact same instructions as with linear scan. The only difference is the choice of registers. This causes some loops to be a few bytes longer or shorter which can easily change performance by that much if that small loop is all the benchmark does.

The greedy allocator is trying to pick registers so inner loops are as small as possible, but that is not always the right thing to do.

Unfortunately, we don't model the effects of code alignment, so there is a lot of luck involved.

I am working my way through the regressions, looking for things the allocator did wrong. Any help is appreciated, please file bugs if you find examples of stupid register allocation.

/jakob