[llvm-dev] Tweaking the Register Allocator's spill placement

Mon Jan 9 16:34:20 PST 2017

> On Jan 9, 2017, at 2:55 PM, Johnson, Nicholas Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hello,
> 
> My target features some very-high-latency instructions that access an on-chip network (we'll call them FXLV).  In one important kernel (snippet below), register allocation needs to spill values resulting from FXLV.  The spiller is unaware of FXLV's latency, and thus naively inserts those spills immediately after the FXLV, incurring huge and unnecessary data stalls.  
> 
>    FXLV r10, 0(r3.0)
>    SV r10, 0(r63.1) # spill to stack slot
>    FXLV r10, 16(r3.0)
>    SV r10, 16(r63.1) # spill to stack slot
>    FXLV r10, 32(r3.0)
>    SV r10, 32(r63.1) # spill to stack slot
>    FXLV r10, 48(r3.0)
>    SV r10, 48(r63.1) # spill to stack slot
>    ...
> Note also how the register allocator unfortunately re-uses register r10.  This prevents Post-RA scheduling from helping.

The old post-RA list scheduler (PostRASchedulerList) used the AggressiveAntiDepBreaker and CriticalAntiDepBreaker classes to improve this sort of situation. The newer PostMachineScheduler does not use them (I can only guess why, but possibly out of frustration with bugs and hard-to-understand code in the *DepBreakers). You could experiment with using those.
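
To make the idea concrete, here is a toy sketch of what breaking anti-dependencies buys you on the snippet above. It is not LLVM's AntiDepBreaker code, just a standalone Python illustration. The Instr tuple, the break_anti_deps helper and the free registers r11-r13 are all made up for the example. It also assumes straight-line code in which each renamed value is dead by the end of the block, and that spare registers exist at all, which may not hold in a kernel that is already spilling.

```python
from collections import namedtuple

# Toy instruction: opcode, registers written, registers read (hypothetical model).
Instr = namedtuple("Instr", ["op", "defs", "uses"])

def break_anti_deps(instrs, free_regs):
    """Give each redefinition of a register a fresh name from free_regs.

    Assumes one basic block, values not live-out, and that free_regs are
    genuinely unused in the block.
    """
    free = list(free_regs)
    current = {}      # original register -> register holding its latest value
    defined = set()   # original registers already written in this block
    out = []
    for ins in instrs:
        # Reads follow whatever name the latest definition was given.
        uses = tuple(current.get(r, r) for r in ins.uses)
        defs = []
        for r in ins.defs:
            if r in defined and free:
                current[r] = free.pop(0)   # redefinition: take a fresh register
            else:
                current[r] = r             # first definition, or pool exhausted
            defined.add(r)
            defs.append(current[r])
        out.append(Instr(ins.op, tuple(defs), uses))
    return out

# The quoted kernel: four FXLV loads into r10, each followed by its spill.
kernel = []
for _ in range(4):
    kernel.append(Instr("FXLV", ("r10",), ("r3",)))
    kernel.append(Instr("SV", (), ("r10", "r63")))

# r11-r13 are assumed to be free; that is an assumption, not a given.
renamed = break_anti_deps(kernel, ["r11", "r12", "r13"])
for ins in renamed:
    print(ins.op, ",".join(ins.defs + ins.uses))
```

Once each FXLV writes its own register, the only dependency left between an FXLV and a later instruction is the true dependency on its own SV, so a post-RA scheduler would be free to issue the FXLVs back to back and sink the spills below them.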

- Matthias