[PATCH] D114652: [AMDGPU] Treat call clobbered return address registers s[30:31] as callee saved

Venkata Ramanaiah Nalamothu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 26 20:42:44 PST 2021


RamNalamothu added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:94-95
       MachineInstrSpan MIS(I, &SaveBlock);
-      const TargetRegisterClass *RC =
-        TRI->getMinimalPhysRegClass(Reg, MVT::i32);
+      const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(
+          Reg, Reg == RI->getReturnAddressReg(MF) ? MVT::i64 : MVT::i32);
 
----------------
sebastian-ne wrote:
> Can the MVT be deduced from `getRegSizeInBits(*getPhysRegClass(Reg))`? I think that would look cleaner.
> Can the MVT be deduced from `getRegSizeInBits(*getPhysRegClass(Reg))`? I think that would look cleaner.

Will check, thank you.


================
Comment at: llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp:232
+    Register Reg = TRI->getReturnAddressReg(MF);
+    if (!FuncInfo->isEntryFunction() && MFI.hasCalls()) {
+      const TargetRegisterClass *RC =
----------------
sebastian-ne wrote:
> Why is the return address only marked as callee-save if this function has calls? I assume it can’t it be overwritten if there are no calls because it is marked as reserved?
> 
> Coming back here after reading the tests, I think s[30:31] should also be marked as callee-save if there are no calls. It shouldn’t cause harm if there are no calls (the register is reserved anyway) and it should fix the tests.
> Why is the return address only marked as callee-save if this function has calls? I assume it can’t it be overwritten if there are no calls because it is marked as reserved?
> 

Yes.

> Coming back here after reading the tests, I think s[30:31] should also be marked as callee-save if there are no calls. It shouldn’t cause harm if there are no calls (the register is reserved anyway) and it should fix the tests.

AFAIK that's the eventual goal and there is a discussion to pick the ABI registers for return address from the callee saved range. And these changes should work with no/minimal changes.

However, I don't think I understand how the tests will get fixed automatically if the return address registers are marked as callee saved because even the SGPR CSR save/restore happens during SGPRSpillLowering which is where I am manually adding s[30:31] to the CSR list.


================
Comment at: llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll:426
     ,~{s30},~{s31}"() #0
-
+  call void @local_empty_func()
   ret void
----------------
sebastian-ne wrote:
> I think these tests should continue to work without the need to introduce a call.
> I think these tests should continue to work without the need to introduce a call.

Thanks. I will take a re-look at it. I don't recall now why I added the call.


================
Comment at: llvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll:127
+; GCN:          v_readlane_b32 s29, v0, 1
+  call void asm sideeffect "; clobber", "~{s[28:29]}"() #0
   ret void
----------------
sebastian-ne wrote:
> I think this test purposefully clobbered the return address in s[30:31].
> I think this test purposefully clobbered the return address in s[30:31].

I believe the intent here is to test the behavior when non callee saved registers are clobbered inside inline asm but not particularly s[30:31] and I guess s[30:31] happened to be used since they are the boundary of non callee save range. The same goes for the other tests which were clobbering s[30:31].

As of now there is no policy on how the inline asm could use the ABI registers. I spoke to @arsenm and he is aware of this test changes.


================
Comment at: llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll:26-48
+; GFX9-NEXT:    s_or_saveexec_b64 s[34:35], -1
+; GFX9-NEXT:    buffer_store_dword v1, off, s[0:3], s32 ; 4-byte Folded Spill
+; GFX9-NEXT:    s_mov_b64 exec, s[34:35]
+; GFX9-NEXT:    v_writelane_b32 v1, s33, 2
+; GFX9-NEXT:    s_mov_b32 s33, s32
+; GFX9-NEXT:    s_addk_i32 s32, 0x400
+; GFX9-NEXT:    s_getpc_b64 s[34:35]
----------------
sebastian-ne wrote:
> This codes gets quite a bit worse. But I’m not sure if we can do anything about it.
> This codes gets quite a bit worse. But I’m not sure if we can do anything about it.

Yeah. This is a known scenario where previously the s[30:31] were being spilled to another SGPR pair, which was used to return, but now since the return address should always be in s[30:31] it has to be spilled into either VGPR or memory.

For now, we will have to live with it.

May be we could do something during SGPRSpillLowering to use the available SGPRs, not only for this case but in general?
I didn't check if that capability is already in place and s[30:31] is a corner case which is not handled.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114652/new/

https://reviews.llvm.org/D114652



More information about the llvm-commits mailing list