[PATCH] D115401: AMDGPU: Fix clobbering SCC when expanding large offset spill pseudos

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 3 08:17:53 PST 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:189
 
+    TmpVGPRUsed = RS->isRegUsed(TmpVGPR);
+    if (TmpVGPRUsed) {
----------------
sebastian-ne wrote:
> What is the difference between `TmpVGPRUsed` and `TmpVGPRLive`? It looks like the same thing to me?
Not sure how I ended up with this, everything seems to work with it removed


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:228
       if (TmpVGPRLive)
-        TRI.buildVGPRSpillLoadStore(*this, TmpVGPRIndex, 0, /*IsLoad*/ false,
-                                    /*IsKill*/ false);
+        TRI.buildVGPRSpillLoadStore(*this, TmpVGPRIndex, 0, /*IsLoad*/ false);
       // Spill inactive lanes
----------------
sebastian-ne wrote:
> Isn’t TmpVGPR still used in the second `buildVGPRSpillLoadStore`? How does this work if it’s killed here?
It turns out this doesn't do anything, and is in the path that doesn't work. I also don't hit this in any existing test if I add unreachable


================
Comment at: llvm/test/CodeGen/AMDGPU/swdev309419-agpr.mir:1588-1593
+  ; GFX908-NEXT:   S_CMP_EQ_U32 0, 0, implicit-def $scc
+  ; GFX908-NEXT:   BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (store (s32) into %stack.2, addrspace 5)
+  ; GFX908-NEXT:   $vgpr0 = V_MOV_B32_e32 8200, implicit $exec
+  ; GFX908-NEXT:   $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec
+  ; GFX908-NEXT:   BUFFER_STORE_DWORD_OFFEN $vgpr0, $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (store (s32) into %stack.1, addrspace 5)
+  ; GFX908-NEXT:   $vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (load (s32) from %stack.2, addrspace 5)
----------------
sebastian-ne wrote:
> This looks incorrect. It’s trying to use v0 as offset and agpr intermediate at the same time.
Yes, this is broken. I spent most of the time on this patch trying to get this case to work. I thought I succeeded but I guess not. Given that we're going to be forced to reserve a scratch VGPR for dealing with AGPRs as part of D118415, I think I can fix this case by reusing the same register in a follow up patch.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115401/new/

https://reviews.llvm.org/D115401



More information about the llvm-commits mailing list