[llvm] [AMDGPU] Hoist WMMA coexecution hazard V_NOPs from loops to preheaders (PR #176895)
Dark Steve via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 20 08:19:44 PST 2026
================
@@ -2190,6 +2159,161 @@ int GCNHazardRecognizer::checkWMMACoexecutionHazards(MachineInstr *MI) {
return WaitStatesNeeded;
}
+void GCNHazardRecognizer::insertVnopsBeforeTerminator(MachineBasicBlock *MBB,
+ int Count) {
+ MachineBasicBlock::iterator InsertPt = MBB->getFirstTerminator();
+ const DebugLoc &DL =
+ InsertPt != MBB->end() ? InsertPt->getDebugLoc() : DebugLoc();
+
+ for (int i = 0; i < Count; ++i) {
+ BuildMI(*MBB, InsertPt, DL, TII.get(AMDGPU::V_NOP_e32));
+ }
+}
+
+bool GCNHazardRecognizer::hasWMMAToWMMARegOverlap(
+ const MachineInstr &WMMA, const MachineInstr &MI) const {
+ Register D0 = TII.getNamedOperand(WMMA, AMDGPU::OpName::vdst)->getReg();
+ Register A1 = TII.getNamedOperand(MI, AMDGPU::OpName::src0)->getReg();
+ Register B1 = TII.getNamedOperand(MI, AMDGPU::OpName::src1)->getReg();
+
+ // WMMA0 writes (D0), WMMA1 reads (A1/B1/Idx1).
+ if (TRI.regsOverlap(D0, A1) || TRI.regsOverlap(D0, B1))
+ return true;
+
+ if (SIInstrInfo::isSWMMAC(MI)) {
----------------
PrasoonMishra wrote:
This isSWMMAC check was refactored out of the original lambdas in checkWMMACoexecutionHazards() (added in PR #149865).
SWMMAC has an extra operand (src2/index) that regular WMMA does not have. That operand also has to be checked for hazards, so the isSWMMAC check is intentional.
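
To illustrate why the extra operand matters, here is a minimal standalone sketch. It is not the LLVM code: RegRange, ToyWMMA, and hasDstToSrcOverlap are hypothetical names, and the regsOverlap helper is only a simplified stand-in for TRI.regsOverlap() that treats a register tuple as a half-open range of VGPR indices.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a register tuple: half-open range [Begin, End)
// of VGPR indices. Real LLVM code uses Register and TargetRegisterInfo.
struct RegRange {
  unsigned Begin;
  unsigned End;
};

// Simplified analogue of TRI.regsOverlap(): two ranges overlap when they
// share at least one index.
inline bool regsOverlap(const RegRange &A, const RegRange &B) {
  assert(A.Begin < A.End && B.Begin < B.End && "empty register range");
  return A.Begin < B.End && B.Begin < A.End;
}

// Hypothetical model of a WMMA-like instruction. Index is populated only
// for the sparse (SWMMAC-like) form, mirroring the extra src2/index operand.
struct ToyWMMA {
  RegRange Dst;
  RegRange SrcA;
  RegRange SrcB;
  std::optional<RegRange> Index;
};

// Returns true when First's destination overlaps any register that Second
// reads: A and B always, plus the index operand when Second has one.
inline bool hasDstToSrcOverlap(const ToyWMMA &First, const ToyWMMA &Second) {
  if (regsOverlap(First.Dst, Second.SrcA) ||
      regsOverlap(First.Dst, Second.SrcB))
    return true;

  // Without this branch, a sparse consumer whose index register is written
  // by First would be reported as hazard-free.
  if (Second.Index && regsOverlap(First.Dst, *Second.Index))
    return true;

  return false;
}
```

In this toy model, a consumer whose A/B sources are disjoint from the producer's destination is only flagged when its index range overlaps, which is the case the isSWMMAC branch covers.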
https://github.com/llvm/llvm-project/pull/176895