[PATCH] D96336: [AMDGPU] Save VGPR of whole wave when spilling

Tue Feb 9 08:36:33 PST 2021

sebastian-ne added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:1433
+    // lanes of the chosen VGPR. Pick v0 because it doesn't make a difference.
+    Register TmpVGPR = AMDGPU::VGPR0;
     RS->setRegUsed(TmpVGPR);
----------------
foad wrote:
> If the scavenger finds a vgpr that it thinks is dead, would that mean we only have to save the inactive lanes?
Yes, we can do that.

If you don’t mind, I’ll put that in a later patch. I’m currently preparing a patch on top of this one to get rid of the code duplication here.
That should also allow to get rid of the temporary RegScavenger below.

================
Comment at: llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll:49
 ; GFX6: s_add_u32 s32, s32, 0x[[OFFSET:[0-9a-f]+]]
+; GFX6-NEXT: s_waitcnt expcnt(0)
 ; GFX6-NEXT: buffer_load_dword v{{[0-9]+}}, off, s[{{[0-9:]+}}], s32
----------------
foad wrote:
> What causes this change?
Above these tested lines, the VGPR gets saved to scratch in a buffer_store_dword.
The same VGPR is the destination in buffer_load_dword below, so waiting for expcnt(0) makes sure we do not overwrite it before the store happened (the docs say expcnt waits until writes to the last level cache happened, so I guess the store→load is the reason).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96336/new/

https://reviews.llvm.org/D96336