[llvm] [AMDGPU] Hoist WMMA coexecution hazard V_NOPs from loops to preheaders (PR #176895)

Dark Steve via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 20 08:19:44 PST 2026


================
@@ -2190,6 +2159,161 @@ int GCNHazardRecognizer::checkWMMACoexecutionHazards(MachineInstr *MI) {
   return WaitStatesNeeded;
 }
 
+void GCNHazardRecognizer::insertVnopsBeforeTerminator(MachineBasicBlock *MBB,
+                                                      int Count) {
+  MachineBasicBlock::iterator InsertPt = MBB->getFirstTerminator();
+  const DebugLoc &DL =
+      InsertPt != MBB->end() ? InsertPt->getDebugLoc() : DebugLoc();
+
+  for (int i = 0; i < Count; ++i) {
+    BuildMI(*MBB, InsertPt, DL, TII.get(AMDGPU::V_NOP_e32));
+  }
+}
+
+bool GCNHazardRecognizer::hasWMMAToWMMARegOverlap(
+    const MachineInstr &WMMA, const MachineInstr &MI) const {
+  Register D0 = TII.getNamedOperand(WMMA, AMDGPU::OpName::vdst)->getReg();
+  Register A1 = TII.getNamedOperand(MI, AMDGPU::OpName::src0)->getReg();
+  Register B1 = TII.getNamedOperand(MI, AMDGPU::OpName::src1)->getReg();
+
+  // WMMA0 writes (D0), WMMA1 reads (A1/B1/Idx1).
+  if (TRI.regsOverlap(D0, A1) || TRI.regsOverlap(D0, B1))
+    return true;
+
+  if (SIInstrInfo::isSWMMAC(MI)) {
----------------
PrasoonMishra wrote:

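This isSWMMAC check is carried over from the original lambdas in checkWMMACoexecutionHazards() (added in PR #149865); it has only been refactored into this helper.
SWMMAC has an extra operand (src2, the index) that regular WMMA doesn't have. That operand is also read by the instruction, so it has to be checked for overlap with the earlier WMMA's destination as well. The isSWMMAC check is therefore intentional.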
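
To illustrate why the index operand matters, here is a toy sketch of the overlap test. It is not the code in this PR: `RegRange`, `ToyMatrixOp`, `rangesOverlap`, and `dstOverlapsSources` are hypothetical stand-ins for the real register classes, `TRI.regsOverlap`, and the named-operand lookups, and registers are modeled as plain contiguous index ranges.

```cpp
#include <optional>

// Hypothetical model of a register tuple: Count consecutive registers
// starting at First. Real code would use Register and TRI.regsOverlap.
struct RegRange {
  unsigned First;
  unsigned Count;
};

// Two half-open ranges overlap when each starts before the other ends.
bool rangesOverlap(const RegRange &A, const RegRange &B) {
  return A.First < B.First + B.Count && B.First < A.First + A.Count;
}

// Hypothetical matrix instruction: a dense op has no Index operand,
// a sparse (SWMMAC-like) op additionally reads one.
struct ToyMatrixOp {
  RegRange Dst;
  RegRange SrcA;
  RegRange SrcB;
  std::optional<RegRange> Index;
};

// Returns true if anything the consumer reads overlaps what the
// producer writes. The Index check only applies to the sparse form.
bool dstOverlapsSources(const ToyMatrixOp &Producer,
                        const ToyMatrixOp &Consumer) {
  if (rangesOverlap(Producer.Dst, Consumer.SrcA) ||
      rangesOverlap(Producer.Dst, Consumer.SrcB))
    return true;
  if (Consumer.Index && rangesOverlap(Producer.Dst, *Consumer.Index))
    return true;
  return false;
}
```

In this model, a sparse consumer whose A and B sources are disjoint from the producer's destination can still be flagged if its index register falls inside that destination range. Dropping the sparse-only branch would miss that case.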

https://github.com/llvm/llvm-project/pull/176895

