[PATCH] D115401: AMDGPU: Fix clobbering SCC when expanding large offset spill pseudos
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 3 08:17:53 PST 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:189
+ TmpVGPRUsed = RS->isRegUsed(TmpVGPR);
+ if (TmpVGPRUsed) {
----------------
sebastian-ne wrote:
> What is the difference between `TmpVGPRUsed` and `TmpVGPRLive`? It looks like the same thing to me?
Not sure how I ended up with this, everything seems to work with it removed
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:228
if (TmpVGPRLive)
- TRI.buildVGPRSpillLoadStore(*this, TmpVGPRIndex, 0, /*IsLoad*/ false,
- /*IsKill*/ false);
+ TRI.buildVGPRSpillLoadStore(*this, TmpVGPRIndex, 0, /*IsLoad*/ false);
// Spill inactive lanes
----------------
sebastian-ne wrote:
> Isn’t TmpVGPR still used in the second `buildVGPRSpillLoadStore`? How does this work if it’s killed here?
It turns out this doesn't do anything, and is in the path that doesn't work. I also don't hit this in any existing test if I add unreachable
================
Comment at: llvm/test/CodeGen/AMDGPU/swdev309419-agpr.mir:1588-1593
+ ; GFX908-NEXT: S_CMP_EQ_U32 0, 0, implicit-def $scc
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (store (s32) into %stack.2, addrspace 5)
+ ; GFX908-NEXT: $vgpr0 = V_MOV_B32_e32 8200, implicit $exec
+ ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFEN $vgpr0, $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (store (s32) into %stack.1, addrspace 5)
+ ; GFX908-NEXT: $vgpr0 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, implicit $exec :: (load (s32) from %stack.2, addrspace 5)
----------------
sebastian-ne wrote:
> This looks incorrect. It’s trying to use v0 as offset and agpr intermediate at the same time.
Yes, this is broken. I spent most of the time on this patch trying to get this case to work. I thought I succeeded but I guess not. Given that we're going to be forced to reserve a scratch VGPR for dealing with AGPRs as part of D118415, I think I can fix this case by reusing the same register in a follow up patch.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115401/new/
https://reviews.llvm.org/D115401
More information about the llvm-commits
mailing list