[PATCH] D134526: [AMDGPU] Preserve only the inactive lanes of scratch vgprs
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 23 07:43:24 PDT 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:866
+ MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+ const DebugLoc &DL, bool IsProlog, bool IsCalleeSavedReg) {
Register ScratchExecCopy;
----------------
I'd rename IsCalleeSavedReg to EnableInactiveLanes
================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:1096
Register ScratchExecCopy;
+ SmallVector<std::pair<Register, int>, 2> WWMCalleeSavedRegs, WWMScratchRegs;
----------------
Needs a comment explaining that we're going to get two exec flip-and-restore blocks.
================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:291
+ for (auto &Reg : WWMSpills) {
+ if (TRI->isCalleeSavedPhysReg(Reg.first, MF))
+ CalleeSavedRegs.push_back(Reg);
----------------
This needs to check the callee-saved registers recorded in MRI rather than querying TRI directly, since the CSR set can be changed dynamically.
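To illustrate the distinction, here is a minimal standalone sketch, not the patch's code. It assumes the per-function CSR set is available as a zero-terminated register list (the shape that `MachineRegisterInfo::getCalleeSavedRegs()` returns in LLVM) and partitions the WWM spills against that list. `PhysReg`, `SpillEntry`, `isInCSRList`, and `splitWWMSpills` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for MCPhysReg; 0 is reserved as "no register".
using PhysReg = uint16_t;
// (register, frame index) pair, mirroring the std::pair<Register, int> in the diff.
using SpillEntry = std::pair<PhysReg, int>;

// Returns true if Reg appears in the zero-terminated callee-saved list.
// The list is assumed to be the per-function one, so dynamic changes are visible.
inline bool isInCSRList(const PhysReg *CSRegs, PhysReg Reg) {
  for (const PhysReg *I = CSRegs; I && *I; ++I)
    if (*I == Reg)
      return true;
  return false;
}

// Splits the WWM spills into callee-saved and scratch registers based on
// the per-function list instead of a static target-wide query.
inline void splitWWMSpills(const std::vector<SpillEntry> &WWMSpills,
                           const PhysReg *CSRegs,
                           std::vector<SpillEntry> &CalleeSavedRegs,
                           std::vector<SpillEntry> &ScratchRegs) {
  for (const SpillEntry &Reg : WWMSpills) {
    if (isInCSRList(CSRegs, Reg.first))
      CalleeSavedRegs.push_back(Reg);
    else
      ScratchRegs.push_back(Reg);
  }
}
```

With a list of `{40, 41, 0}`, a spill of register 40 lands in `CalleeSavedRegs`, while a spill of register 8 lands in `ScratchRegs`. If the function's list is later narrowed to `{41, 0}`, register 40 is classified as scratch instead, which is the dynamic case a static query would miss.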
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/image-waterfall-loop-O0.ll:148
; CHECK-NEXT: v_mov_b32_e32 v3, s4
-; CHECK-NEXT: s_or_saveexec_b32 s4, -1
+; CHECK-NEXT: s_xor_saveexec_b32 s4, -1
; CHECK-NEXT: buffer_load_dword v8, off, s[0:3], s32 offset:44 ; 4-byte Folded Reload
----------------
In the case of a lane exit, we'll end up restoring values that weren't preserved, but I guess that's OK since those lanes are exiting as dead anyway.
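For readers comparing the two CHECK lines, the following toy model sketches how the mask differs after each instruction. It assumes the usual semantics: the old EXEC is saved to the destination, then EXEC is combined with the source operand by OR or XOR. `Wave32`, `orSaveExec`, and `xorSaveExec` are hypothetical names for illustration, not LLVM or hardware APIs, and the SCC update is left out.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a wave32 EXEC mask, one bit per lane.
struct Wave32 {
  uint32_t Exec;
};

// Models s_or_saveexec_b32 sdst, src: returns the old EXEC (sdst),
// then sets EXEC = old EXEC | src. With src = -1 every lane is enabled.
inline uint32_t orSaveExec(Wave32 &W, uint32_t Src) {
  uint32_t Saved = W.Exec;
  W.Exec = Saved | Src;
  return Saved;
}

// Models s_xor_saveexec_b32 sdst, src: returns the old EXEC (sdst),
// then sets EXEC = old EXEC ^ src. With src = -1 only the lanes that
// were previously inactive are enabled.
inline uint32_t xorSaveExec(Wave32 &W, uint32_t Src) {
  uint32_t Saved = W.Exec;
  W.Exec = Saved ^ Src;
  return Saved;
}
```

Starting from an EXEC of `0x0000000F`, `orSaveExec` with `-1` leaves all 32 lanes enabled, whereas `xorSaveExec` with `-1` leaves EXEC at `0xFFFFFFF0`, so a spill or reload issued afterwards touches only the lanes that were inactive on entry.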
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134526/new/
https://reviews.llvm.org/D134526