[PATCH] D119475: [AMDGPU] Add scheduler pass to rematerialize trivial defs

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 10 15:11:35 PST 2022


rampitec added a comment.

One problem is that you are trying to handle multiple regions. You cannot estimate if an improvement in one region will actually improve an occupancy for the whole kernel, while actually sinking instructions. That's why I suggested at least to start with the scenario when you have only one region limiting the occupancy. I also hope it will simplify the logic in the sinkTriviallyRematInsts a lot, as I have to admit I am completely lost in it.



================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:717
+
+    if (!SRI->isVGPRClass(MRI.getRegClass(Reg)) || !MRI.hasOneUse(Reg))
+      continue;
----------------
Actually it does not matter if that is a VGPR. AGPRs also affect occupancy and SGPRs before gfx10 too.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:727
+
+      ++SeenSubRegDefs[Def.getOperand(0).getSubReg()];
+
----------------
If there is a subreg def you can bail immediately, it is not rematerializable anyway. If there is more than one def too for that matter.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:746
+
+      bool IsUndef = Def->getOperand(0).isUndef();
+      bool UndefInSameBlock = true;
----------------
All of that subreg machinery isn't really needed, we cannot rematerizlize subreg defs.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:772
+
+      RematerializableInsts.push_back(std::make_pair(Def, Pos));
+    }
----------------
You could immediately bail if use is not in a region with high pressure, yet in the beginning of this function. No need to even add such instruction to the list.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:791
+
+    RescheduleRegions[Idx] = true;
+    MachineBasicBlock *CurMBB = Regions[Idx].first->getParent();
----------------
You do not know it yet until you have actually rematerialized anything into this region.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:803
+    // First check if we have enough trivially rematerializable instructions
+    int BestRegDiff = VGPRUsage - (RematerializableInsts.size() +
+                                   TrivialRematDefsToSink.size());
----------------
This logic is incomplete. RematerializableInsts has to be filtered to only instructions relevant for the current region.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:933
+    Register Reg = Def->getOperand(0).getReg();
+    TII->reMaterialize(*InsertPos->getParent(), InsertPos, Reg,
+                       Def->getOperand(0).getSubReg(), *Def, *TRI);
----------------
Actually you cannot just rematerialize it without checking all uses are available. Check LiveRangeEdit::allUsesAvailableAt and LiveRangeEdit::canRematerializeAt, this is the checks you need to perform before adding an instruction to the list.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:957
+
+  return true;
+}
----------------
You need to update region begin interator somewhere. Your could rematerialize an instruction into the begin of the region.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:966
+
+  for (const MachineOperand &MO : MI.operands())
+    if (MO.isReg() && MO.isUse() && MO.getReg().isVirtual())
----------------
Use TII->isReallyTriviallyReMaterializable() instead.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:115
+  // Regions that has the same occupancy as the latest MinOccupancy
+  BitVector RegionsWithMinOcc;
+
----------------
Isn't it the same as RegionsWithHighRP?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119475/new/

https://reviews.llvm.org/D119475



More information about the llvm-commits mailing list