[llvm] [AMDGPU] Add a trap lowering workaround for gfx11 (PR #85854)
Scott Linder via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 21 12:04:57 PDT 2024
================
@@ -2026,6 +2026,53 @@ void SIInstrInfo::insertReturn(MachineBasicBlock &MBB) const {
}
}
+MachineBasicBlock *SIInstrInfo::insertSimulatedTrap(MachineRegisterInfo &MRI,
+ MachineBasicBlock &MBB,
+ MachineInstr &MI,
+ const DebugLoc &DL) const {
+ MachineFunction *MF = MBB.getParent();
+ MachineBasicBlock *SplitBB = MBB.splitAt(MI, /*UpdateLiveIns=*/false);
+ MachineBasicBlock *HaltLoop = MF->CreateMachineBasicBlock();
+ MF->push_back(HaltLoop);
+
+ // Start with a `s_trap 2`, if we're in PRIV=1 and we need the workaround this
+ // will be a nop.
+ BuildMI(MBB, MI, DL, get(AMDGPU::S_TRAP))
+ .addImm(static_cast<unsigned>(GCNSubtarget::TrapID::LLVMAMDHSATrap));
+ Register DoorbellReg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+ BuildMI(MBB, MI, DL, get(AMDGPU::S_SENDMSG_RTN_B32), DoorbellReg)
+ .addImm(AMDGPU::SendMsg::ID_RTN_GET_DOORBELL);
+ BuildMI(MBB, MI, DL, get(AMDGPU::S_MOV_B32), AMDGPU::TTMP2)
+ .addUse(AMDGPU::M0);
+ Register And0x3ff = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+ BuildMI(MBB, MI, DL, get(AMDGPU::S_AND_B32), And0x3ff)
+ .addUse(DoorbellReg)
+ .addImm(0x3ff);
----------------
slinder1 wrote:
Nit: it may not actually help much for the average reader, but the 0x3ff/0x400 masks/flags should probably be named constants somewhere.
I think the fields being referenced are described in KFD at https://github.com/ROCm/ROCK-Kernel-Driver/blob/cf8e29ea7ba831dbafe9530dcdad8d5a5682c6d4/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c#L132, so you could reuse their naming convention.
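A rough sketch of what the named constants could look like. The names `DoorbellIDMask`, `DoorbellFlag`, and `encodeDoorbell` are hypothetical, chosen only for illustration. They are not taken from the patch or from KFD. The sketch also assumes the 0x400 value is a single flag bit ORed into the masked doorbell value, which the hunk quoted above does not show.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; the real ones would follow
// whatever convention the patch settles on.
namespace TrapWorkaround {
// Low 10 bits kept from the value returned by s_sendmsg_rtn_b32
// (the literal 0x3ff in the quoted hunk).
constexpr uint32_t DoorbellIDMask = 0x3ff;
// Bit 10 (the literal 0x400 mentioned in the review). Assumed here to be
// a flag set on top of the masked doorbell value.
constexpr uint32_t DoorbellFlag = 0x400;

// Keep the low 10 bits of the doorbell value and set the flag bit.
constexpr uint32_t encodeDoorbell(uint32_t Doorbell) {
  return (Doorbell & DoorbellIDMask) | DoorbellFlag;
}
} // namespace TrapWorkaround

// Compile-time checks that the helper matches the raw literals.
static_assert(TrapWorkaround::encodeDoorbell(0x1234) ==
                  ((0x1234u & 0x3ffu) | 0x400u),
              "named constants must match the raw literals");
```

With something like this in a shared header, the `.addImm(0x3ff)` operand in the hunk could read `.addImm(TrapWorkaround::DoorbellIDMask)` instead, so the call site says what the mask is for.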
https://github.com/llvm/llvm-project/pull/85854