[llvm] r279995 - AMDGPU/SI: Implement a custom MachineSchedStrategy

Fri Sep 2 11:37:11 PDT 2016

> On Sep 2, 2016, at 11:15, Hal Finkel via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> Are they so much more expensive than the scalar spills that it is not even worth modeling the cost, or is it something you might want to model more carefully in the future?

It’s more complicated than that, because there are two entirely different paths for SGPR spilling, with very different costs. SGPR spills can be much cheaper or much more expensive depending on how VGPRs have been allocated at the time the SGPR spill happens. Ideally SGPRs spill to a VGPR, but if those run out (globally, across the program), they have to spill to memory. The spill to a register is pretty cheap, but the basic spill to memory is just as expensive as a VGPR spill and has an additional step.

With LLVM’s register allocator as it is now, we can’t cleanly always spill to a VGPR, because both register classes are allocated at the same time. I’m working on a hack to make the SGPR spill to memory correct, but it requires performing the same memory spill twice to be correct in all cases, which is less than ideal. Newer hardware has additional store instructions which mostly avoid this problem, but I think being able to delay allocation of VGPRs until a second RA run is necessary to make the cost more understandable and to fix various other issues.
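
To illustrate why a single SGPR spill cost is hard to pick, here is a rough sketch of the two paths as a cost function. This is not code from the backend: the names (`SpillPath`, `estimateSgprSpill`) and all of the cost numbers are made up for illustration, and the only thing it tries to capture is that the answer depends on whether a VGPR is still available when the SGPR spill happens.

```cpp
// Hypothetical relative costs, chosen only for illustration.
constexpr unsigned kSpillToVgprCost = 1;    // cheap path: SGPR value kept in a VGPR
constexpr unsigned kSpillToMemoryCost = 16; // assumed cost of a spill to memory
constexpr unsigned kExtraStepCost = 1;      // the additional step on the SGPR memory path

enum class SpillPath { ToVgpr, ToMemory };

struct SpillEstimate {
  SpillPath Path;
  unsigned Cost;
};

// Estimate the cost of spilling one SGPR. Whether a VGPR is available is a
// global property of the allocation, which is why the cost cannot be known
// while both register classes are still being allocated together.
constexpr SpillEstimate estimateSgprSpill(bool VgprAvailable) {
  if (VgprAvailable)
    return {SpillPath::ToVgpr, kSpillToVgprCost};
  // Out of VGPRs: pay the full memory spill cost plus the extra step.
  return {SpillPath::ToMemory, kSpillToMemoryCost + kExtraStepCost};
}
```

With these placeholder numbers, `estimateSgprSpill(true)` takes the register path and `estimateSgprSpill(false)` takes the memory path at a much higher cost; the gap between the two is the part a scheduler cannot predict up front.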

-Matt