[llvm] [AMDGPU] Add a trap lowering workaround for gfx11 (PR #85854)

Scott Linder via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 21 12:04:57 PDT 2024


================
@@ -2026,6 +2026,53 @@ void SIInstrInfo::insertReturn(MachineBasicBlock &MBB) const {
   }
 }
 
+MachineBasicBlock *SIInstrInfo::insertSimulatedTrap(MachineRegisterInfo &MRI,
+                                                    MachineBasicBlock &MBB,
+                                                    MachineInstr &MI,
+                                                    const DebugLoc &DL) const {
+  MachineFunction *MF = MBB.getParent();
+  MachineBasicBlock *SplitBB = MBB.splitAt(MI, /*UpdateLiveIns=*/false);
+  MachineBasicBlock *HaltLoop = MF->CreateMachineBasicBlock();
+  MF->push_back(HaltLoop);
+
+  // Start with a `s_trap 2`, if we're in PRIV=1 and we need the workaround this
+  // will be a nop.
+  BuildMI(MBB, MI, DL, get(AMDGPU::S_TRAP))
+      .addImm(static_cast<unsigned>(GCNSubtarget::TrapID::LLVMAMDHSATrap));
+  Register DoorbellReg = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+  BuildMI(MBB, MI, DL, get(AMDGPU::S_SENDMSG_RTN_B32), DoorbellReg)
+      .addImm(AMDGPU::SendMsg::ID_RTN_GET_DOORBELL);
+  BuildMI(MBB, MI, DL, get(AMDGPU::S_MOV_B32), AMDGPU::TTMP2)
+      .addUse(AMDGPU::M0);
+  Register And0x3ff = MRI.createVirtualRegister(&AMDGPU::SReg_32RegClass);
+  BuildMI(MBB, MI, DL, get(AMDGPU::S_AND_B32), And0x3ff)
+      .addUse(DoorbellReg)
+      .addImm(0x3ff);
----------------
slinder1 wrote:

Nit: it may not actually help much for the average reader, but the 0x3ff/0x400 masks/flags should probably be named constants somewhere.

I think the fields being referenced are described in KFD at https://github.com/ROCm/ROCK-Kernel-Driver/blob/cf8e29ea7ba831dbafe9530dcdad8d5a5682c6d4/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c#L132, so you could reuse their naming convention.
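
To make the suggestion concrete, here is a rough, standalone sketch of what naming the two literals might look like. The identifiers `DoorbellIDMask`, `TrapFlagBit`, and `encodeDoorbell` are hypothetical, chosen only for illustration; the real names should follow whatever the KFD source uses. The meaning of 0x400 and the OR step are assumptions, since the quoted hunk only shows the AND with 0x3ff.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical name: mask for the low 10 bits kept from the doorbell value
// (the literal 0x3ff in the hunk).
constexpr uint32_t DoorbellIDMask = 0x3ff;

// Hypothetical name: the 0x400 flag (bit 10). Its exact meaning is not shown
// in the quoted hunk, so this name is a placeholder.
constexpr uint32_t TrapFlagBit = 0x400;

// Mirrors the S_AND_B32 from the hunk, followed by an assumed OR with the
// flag bit. Plain integer arithmetic only; no LLVM APIs are involved.
constexpr uint32_t encodeDoorbell(uint32_t Doorbell) {
  return (Doorbell & DoorbellIDMask) | TrapFlagBit;
}

// The flag sits directly above the mask, so the two never overlap.
static_assert((DoorbellIDMask & TrapFlagBit) == 0, "mask and flag overlap");

} // namespace sketch
```

With constants like these, the `BuildMI` calls could pass the named value to `.addImm(...)` instead of a bare literal, and the virtual register name would no longer need to spell out the magic number.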

https://github.com/llvm/llvm-project/pull/85854

