[PATCH] D124195: [AMDGPU] Separate out custom SGPR spills to VGPR during PEI

Fri Oct 28 07:54:42 PDT 2022

nhaehnle added a comment.

Almost everything surrounding `allocateSGPRSpillToVGPR` and its companion methods is horrible from a coding style perspective. Now, a lot of this horribleness is pre-existing; that said, can you please get it while you're working on this anyway? Couple of things come to mind:

- allocateVGPRForSGPRSpills and allocateVGPRForCustomSGPRSpills are very specialized and interact in uncomfortable ways with ambient state. They absolutely must not be `public`.
- They are both basically the same method, and their existence only makes sense in the context of the basically identically named `allocateSGPRSpillToVGPR`. Please just merge those methods and inline them, so only `allocateSGPRSpillToVGPR` remains.
- How about renaming `allocateSGPRSpillToVGPR` to `allocateSGPRSpillToVGPRLanes`? This accounts for the fact that an SGPR spill is neither allocated to an entire VGPR, nor is an SGPR spill necessarily allocated to a single VGPR (it could cross multiple VGPRs depending on how the lane allocation works out)
- Relying on `WWMSpills.back()` to get the most recently used spill VGPR makes me rather uncomfortable. Since there's persistent tracking of allocated lanes already in the form of `NumVGPR[Custom]SpillLanes`, please just track the currently "open" VGPR explicitly as well.
- A lot of data is tied to the "custom / non-custom" distinction (which does need a better name). How about defining a struct `SIMachineFunctionInfo::SGPRToVGPRSpills` and moving all the related data in there (FrameIndex to (VGPR,lane) map, next (VGPR, lane) pair)? That will make the case distinction flow more nicely.

================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:476-479
   for (auto &R : make_early_inc_range(SGPRToVGPRSpills)) {
-    if (R.first != FramePointerSaveIndex && R.first != BasePointerSaveIndex) {
-      MFI.RemoveStackObject(R.first);
-      SGPRToVGPRSpills.erase(R.first);
-    }
+    MFI.RemoveStackObject(R.first);
+    SGPRToVGPRSpills.erase(R.first);
   }
----------------
Simplify this further to a simpler for loop and finally `SGPRToVGPRSpills.clear()`

================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:465
 private:
-  // Track VGPR + wave index for each subregister of the SGPR spilled to
-  // frameindex key.
+  // To track VGPR + wave index for each subregister of the SGPR spilled to
+  // frameindex key during SILowerSGPRSpills pass.
----------------
s/wave index/lane index/?

================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:465
   DenseMap<int, std::vector<SIRegisterInfo::SpilledReg>> SGPRToVGPRSpills;
+  DenseMap<int, std::vector<SIRegisterInfo::SpilledReg>> SGPRToVGPRCustomSpills;
   unsigned NumVGPRSpillLanes = 0;
----------------
arsenm wrote:
> Needs to document what "custom" means. Also the fact that it's not serialized makes me nervous. However, that should be OK since this is only set and read in PEI so it should be OK. Ideally we would have somewhere else to put it
Agree that "custom" is a bad name. It seems to be for prolog/epilog purposes, name it accordingly.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124195/new/

https://reviews.llvm.org/D124195