[PATCH] D30442: [AMDGPU] Add second pass of the scheduler

Tue Feb 28 11:10:53 PST 2017

rampitec added a comment.

In https://reviews.llvm.org/D30442#688792, @arsenm wrote:

> How expensive is it to do this? The scheduler is already frequently the most expensive pass after RA, sometimes surpassing it

The algorithm itself is not very expensive. Live-ins are scanned for all defined registers once per region; this is probably the most expensive part if there are a lot of registers. The main scan after that only touches registers that are live in the region. This is obviously more expensive than not doing it, but not terribly so, given that LIS is already available.

It would be possible to preserve LiveRegs between regions, but the main scheduler loop can skip some regions.
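
As a rough illustration of the two steps described above (this is not the patch's code), here is a minimal standalone sketch. The `Interval` and `Inst` types and both function names are hypothetical stand-ins, not LLVM classes; in the real pass the liveness query would go through LIS. Step 1 visits every defined register once per region, and step 2 only touches the registers that are live in the region.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a live interval: Reg is live in [Start, End).
struct Interval {
  unsigned Reg;
  unsigned Start;
  unsigned End;
};

// Hypothetical instruction model: registers defined here and registers
// whose last use is here.
struct Inst {
  std::vector<unsigned> Defs;
  std::vector<unsigned> Kills;
};

// Step 1: one scan over all defined registers, done once per region.
// A register is a live-in if it was defined before the region starts and
// is still live at the region start.
std::set<unsigned> computeLiveIns(const std::vector<Interval> &All,
                                  unsigned RegionStart) {
  std::set<unsigned> Live;
  for (const Interval &I : All) {
    assert(I.Start <= I.End && "malformed interval");
    if (I.Start < RegionStart && RegionStart < I.End)
      Live.insert(I.Reg);
  }
  return Live;
}

// Step 2: walk the region, updating only the live set, and record the
// largest number of simultaneously live registers.
std::size_t maxPressure(std::set<unsigned> Live,
                        const std::vector<Inst> &Region) {
  std::size_t Max = Live.size();
  for (const Inst &MI : Region) {
    for (unsigned R : MI.Kills)
      Live.erase(R);   // last use: the register dies here
    for (unsigned R : MI.Defs)
      Live.insert(R);  // definition: the register becomes live
    Max = std::max(Max, Live.size());
  }
  return Max;
}
```

The cost of step 1 grows with the total number of defined registers, while step 2 grows with the region size and the number of registers live in it, which matches the cost split described above.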

================
Comment at: lib/Target/AMDGPU/GCNSchedStrategy.cpp:62
     ->getNumAllocatableRegs(&AMDGPU::VGPR_32RegClass) - ErrorMargin;
-  SGPRCriticalLimit = SRI->getRegPressureSetLimit(DAG->MF,
-                        SRI->getSGPRPressureSet()) - ErrorMargin;
-  VGPRCriticalLimit = SRI->getRegPressureSetLimit(DAG->MF,
-                        SRI->getVGPRPressureSet()) - ErrorMargin;
+  if (TargetOccupancy) {
+    SGPRCriticalLimit = ST.getMaxNumSGPRs(TargetOccupancy, true);
----------------
kzhuravl wrote:
> I think this should also respect the "amdgpu-waves-per-eu" attribute (https://clang.llvm.org/docs/AttributeReference.html#amdgpu-waves-per-eu)?
It does when it calls getRegPressureSetLimit(). However, if we are not limited, or cannot stay within the guessed optimistic limits, we override TargetOccupancy and reschedule.
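
To make the "override and reschedule" flow concrete, here is a hedged sketch of one possible reading of it. It is not the code from the patch: `Limits`, `Pressure`, `Result`, `scheduleWithFallback` and the two callbacks are all hypothetical, and the register budgets per occupancy are supplied by the caller rather than taken from the subtarget.

```cpp
#include <cassert>
#include <functional>

// Hypothetical register budgets for a given occupancy (waves per EU).
struct Limits {
  unsigned SGPR;
  unsigned VGPR;
};

// Hypothetical register pressure measured after a scheduling pass.
struct Pressure {
  unsigned SGPR;
  unsigned VGPR;
};

struct Result {
  unsigned Occupancy; // occupancy the final schedule targets
  unsigned Passes;    // number of scheduling passes that were run
};

// First pass: schedule against the optimistic limits. If the result does
// not fit, pick the highest occupancy it does fit (the overridden target)
// and schedule once more against that target's limits.
Result scheduleWithFallback(
    unsigned OptimisticOcc,
    const std::function<Limits(unsigned)> &LimitsFor,
    const std::function<Pressure(const Limits &)> &Schedule) {
  assert(OptimisticOcc >= 1 && "occupancy must be at least one wave");
  Pressure P = Schedule(LimitsFor(OptimisticOcc));

  unsigned Achieved = 1; // assume the lowest occupancy always fits
  for (unsigned Occ = OptimisticOcc; Occ > 1; --Occ) {
    Limits L = LimitsFor(Occ);
    if (P.SGPR <= L.SGPR && P.VGPR <= L.VGPR) {
      Achieved = Occ;
      break;
    }
  }

  if (Achieved == OptimisticOcc)
    return {Achieved, 1};        // stayed within the optimistic limits

  Schedule(LimitsFor(Achieved)); // override the target and reschedule
  return {Achieved, 2};
}
```

The point of the second pass in this sketch is that, once the optimistic target has been missed, the scheduler can work against the budgets of the occupancy it can actually reach instead of limits it cannot meet.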

Repository:
  rL LLVM

https://reviews.llvm.org/D30442