[LLVMdev] Greedy register allocation

Tue May 3 15:23:07 PDT 2011

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes:

>>> The greedy allocator is trying to pick registers so inner loops are as
>>> small as possible, but that is not always the right thing to do.
>> 
>> How does it balance that against spill cost?
>
> I added the CostPerUse field to the register descriptions. The
> allocator will try to minimize the spill weight assigned to registers
> with a CostPerUse. It does this by swapping physical register
> assignments, but it won't do so if that requires extra spilling.

CostPerUse models the encoding size of the register?
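
To make the question concrete, here is a toy Python model of how I read
that description. Everything in it is an assumption for illustration: the
names encoding_cost and swap_to_reduce_cost are hypothetical, interference
is ignored (any two virtual registers are assumed free to trade places),
and the weights are made up. It is not the actual allocator code.

```python
def encoding_cost(assignment, weight, cost_per_use):
    """Sum of spill weight times the per-use cost of the assigned register."""
    return sum(weight[v] * cost_per_use[p] for v, p in assignment.items())


def swap_to_reduce_cost(assignment, weight, cost_per_use):
    """Swap the physical registers of two vregs whenever that lowers the cost.

    Assumption: every pair of vregs may legally swap (no interference check),
    so no swap ever forces a spill.
    """
    result = dict(assignment)
    improved = True
    while improved:
        improved = False
        vregs = list(result)
        for i, a in enumerate(vregs):
            for b in vregs[i + 1:]:
                pa, pb = result[a], result[b]
                before = weight[a] * cost_per_use[pa] + weight[b] * cost_per_use[pb]
                after = weight[a] * cost_per_use[pb] + weight[b] * cost_per_use[pa]
                if after < before:
                    result[a], result[b] = pb, pa
                    improved = True
    return result


# Made-up numbers: xmm8 carries a per-use penalty, the low registers do not.
cost_per_use = {"xmm0": 0, "xmm1": 0, "xmm8": 1}
weight = {"vreg1": 5, "fp_a": 40, "fp_b": 30}
start = {"vreg1": "xmm0", "fp_a": "xmm8", "fp_b": "xmm1"}
final = swap_to_reduce_cost(start, weight, cost_per_use)
```

With these numbers the low-weight value ends up in the penalized
register, which matches the %vreg1 / %xmm8 outcome described below.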

> This is actually the cause of the n-body regression. The benchmark has
> nested loops:
>
> 	%vreg1 = const pool load
> header1:
> 	; large blocks with lots of floating point ops
> header2:
> 	; small loop using %vreg1
> 	jnz header2
> ...
> 	jnz header1
>

> The def of %vreg1 has been hoisted by LICM so it is live across a
> block with lots of floating point code. The allocator uses the low xmm
> registers for the large block, and %xmm8 is left for %vreg1 which has
> a low spill weight. This significantly improves code size, but the
> small loop suffers.

Why does %vreg1 have a low spill weight?  It's used in an inner loop.

> In this case it might have helped to split the live range and
> rematerialize, but usually that won't be the case.

That was my initial reaction.  Splitting should have at least
rematerialized the value just before header2.  That should significantly
improve things.  This is a classic motivating case for live range
splitting.
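
A small Python sketch of why the split helps, using live intervals over
hypothetical instruction indices (the positions and the names Interval and
overlaps are mine, chosen only for illustration). The hoisted value
overlaps the large block, so it competes with that block for the low
registers; a copy rematerialized just before header2 does not.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # first instruction index where the value is live (inclusive)
    end: int    # one past the last index where it is live (exclusive)


def overlaps(a: Interval, b: Interval) -> bool:
    """Two half-open intervals interfere when they share at least one index."""
    return a.start < b.end and b.start < a.end


# Hypothetical layout: def at 0, large FP block at 10..89, small loop at 90..99.
large_block = Interval(10, 90)
hoisted_vreg1 = Interval(0, 100)   # defined before header1, used in header2
remat_vreg1 = Interval(90, 100)    # recomputed just before header2

hoisted_conflicts = overlaps(hoisted_vreg1, large_block)
remat_conflicts = overlaps(remat_vreg1, large_block)
```

Here hoisted_conflicts is True and remat_conflicts is False, so the short
interval could reuse one of the low registers once the large block is done
with it.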

Another way to approach this is to add a register pressure heuristic to
LICM so it doesn't hoist so much out across such a large loop body.
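
Roughly what I have in mind, as a Python sketch. The function name
select_hoists, the benefit scores, and the "one register per hoisted
value" accounting are all assumptions for illustration, not anything LICM
does today.

```python
def select_hoists(candidates, loop_pressure, num_regs):
    """Pick loop invariants to hoist without exceeding the register budget.

    candidates:    list of (name, benefit) pairs; benefit is a made-up score.
    loop_pressure: registers already needed inside the loop body.
    num_regs:      registers available in the class.

    Assumption: each hoisted value stays live across the whole loop and
    therefore costs exactly one extra register.
    """
    hoisted = []
    pressure = loop_pressure
    # Consider the most beneficial candidates first.
    for name, _benefit in sorted(candidates, key=lambda c: c[1], reverse=True):
        if pressure + 1 > num_regs:
            break  # hoisting more would push the loop body into spilling
        hoisted.append(name)
        pressure += 1
    return hoisted


# Made-up example: 14 of 16 registers are busy, so only two values fit.
picked = select_hoists([("a", 3), ("b", 10), ("c", 1)], loop_pressure=14, num_regs=16)
```

In this example picked is ["b", "a"], and "c" stays inside the loop.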

                                   -Dave