[PATCH] D134526: [AMDGPU] Preserve only the inactive lanes of scratch vgprs

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 23 07:43:24 PDT 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:866
+                     MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+                     const DebugLoc &DL, bool IsProlog, bool IsCalleeSavedReg) {
   Register ScratchExecCopy;
----------------
I'd rename IsCalleeSavedReg to EnableInactiveLanes


================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:1096
 
   Register ScratchExecCopy;
+  SmallVector<std::pair<Register, int>, 2> WWMCalleeSavedRegs, WWMScratchRegs;
----------------
This needs a comment explaining that we're going to get two exec flip-and-restore blocks.


================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:291
+  for (auto &Reg : WWMSpills) {
+    if (TRI->isCalleeSavedPhysReg(Reg.first, MF))
+      CalleeSavedRegs.push_back(Reg);
----------------
This needs to check the MRI callee-saved regs rather than querying TRI directly, to account for dynamically changed CSRs.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/image-waterfall-loop-O0.ll:148
 ; CHECK-NEXT:    v_mov_b32_e32 v3, s4
-; CHECK-NEXT:    s_or_saveexec_b32 s4, -1
+; CHECK-NEXT:    s_xor_saveexec_b32 s4, -1
 ; CHECK-NEXT:    buffer_load_dword v8, off, s[0:3], s32 offset:44 ; 4-byte Folded Reload
----------------
In the case of lane exit, we'll end up restoring values that weren't preserved, but I guess that's OK since those lanes are exiting as dead anyway.
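
To make the mask arithmetic behind this concrete, here is a rough standalone sketch, not code from the patch. It models only the documented scalar semantics of the two instructions (the destination receives the old exec; the new exec is src0 OR exec, or src0 XOR exec) on a 32-lane wave. The struct, the function names, the helper restoredButNeverSaved, and the example masks are all hypothetical, and the lane-exit scenario is my reading of the comment above.

```cpp
#include <cassert>
#include <cstdint>

// Result of a simulated s_*_saveexec_b32: the old exec mask that lands in
// the destination SGPR, and the new exec mask. (Hypothetical model type.)
struct SaveExecResult {
  uint32_t Saved;
  uint32_t Exec;
};

// s_or_saveexec_b32 sdst, src0: sdst = exec; exec = src0 | exec.
// With src0 = -1 every lane is enabled.
constexpr SaveExecResult orSaveExec(uint32_t Exec, uint32_t Src0) {
  return {Exec, Src0 | Exec};
}

// s_xor_saveexec_b32 sdst, src0: sdst = exec; exec = src0 ^ exec.
// With src0 = -1 only the previously inactive lanes are enabled.
constexpr SaveExecResult xorSaveExec(uint32_t Exec, uint32_t Src0) {
  return {Exec, Src0 ^ Exec};
}

// Hypothetical helper: lanes that the epilog reloads although the prolog
// never stored them, assuming both use the xor form with src0 = -1.
constexpr uint32_t restoredButNeverSaved(uint32_t PrologExec,
                                         uint32_t EpilogExec) {
  return xorSaveExec(EpilogExec, 0xFFFFFFFFu).Exec &
         ~xorSaveExec(PrologExec, 0xFFFFFFFFu).Exec;
}

// Assumed example: lanes 0-15 active at the prolog, lanes 12-15 exit
// before the epilog.
inline void checkSaveExecExample() {
  constexpr uint32_t PrologExec = 0x0000FFFFu;
  constexpr uint32_t EpilogExec = 0x00000FFFu;
  assert(orSaveExec(PrologExec, 0xFFFFFFFFu).Exec == 0xFFFFFFFFu);
  assert(xorSaveExec(PrologExec, 0xFFFFFFFFu).Exec == 0xFFFF0000u);
  assert(restoredButNeverSaved(PrologExec, EpilogExec) == 0x0000F000u);
}
```

Under these assumptions, the xor form spills only lanes 16-31 at the prolog, while the epilog reloads lanes 12-31. Lanes 12-15 therefore receive values that were never stored, which is harmless if those lanes are dead by then.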


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134526/new/

https://reviews.llvm.org/D134526


