[llvm] [AMDGPU] Add support for GFX12 expert scheduling mode 2 (PR #170319)

Stephen Thomas via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 8 03:15:58 PST 2025


================
@@ -1648,6 +1742,24 @@ bool WaitcntGeneratorGFX12Plus::applyPreexistingWaitcnt(
     // Merge consecutive waitcnt of the same type by erasing multiples.
     if (!*UpdatableInstr) {
       *UpdatableInstr = &II;
+    } else if (Opcode == AMDGPU::S_WAITCNT_DEPCTR) {
+      // S_WAITCNT_DEPCTR requires special care. Don't remove a
+      // duplicate if it is waiting on things other than VA_VDST or
+      // VM_VSRC. If that is the case, just make sure the VA_VDST and
+      // VM_VSRC subfields of the operand are set to the "no wait"
+      // values.
+
+      unsigned Enc = TII->getNamedOperand(II, AMDGPU::OpName::simm16)->getImm();
+      Enc = AMDGPU::DepCtr::encodeFieldVmVsrc(Enc, ~0u);
+      Enc = AMDGPU::DepCtr::encodeFieldVaVdst(Enc, ~0u);
+
+      if (Enc != 0xffff) {
----------------
stepthomas wrote:

This is over-specific. What we should check instead is whether `(Enc & D) != D`, where `D` is `AMDGPU::DepCtr::getDefaultDepCtrEncoding(*ST)`. Since the hard-coded comparison against `0xffff` appears in several places, we might want to add a helper named something like `AMDGPU::DepCtr::isWaitDepCtrNoop()`.
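
A minimal standalone sketch of what such a helper could look like. It is a model of the check, not LLVM code: the name `isWaitDepCtrNoop`, its two-argument signature, and the sample encodings below are assumptions for illustration. In the actual pass, `DefaultEnc` would presumably come from `AMDGPU::DepCtr::getDefaultDepCtrEncoding(*ST)`.

```cpp
#include <cstdint>

// Hypothetical helper (not an existing LLVM function): an S_WAITCNT_DEPCTR
// immediate is treated as a no-op when every bit set in the subtarget's
// default ("no wait") encoding is also set in Enc.
constexpr bool isWaitDepCtrNoop(uint32_t Enc, uint32_t DefaultEnc) {
  return (Enc & DefaultEnc) == DefaultEnc;
}

// Illustrative values only; real default encodings depend on the subtarget.
constexpr uint32_t AllOnesDefault = 0xffff;
constexpr uint32_t SparseDefault = 0xff9f; // made-up default with unused bits

// With an all-ones default, this matches the existing `Enc != 0xffff` test.
static_assert(isWaitDepCtrNoop(0xffff, AllOnesDefault), "no wait requested");
static_assert(!isWaitDepCtrNoop(0x0fff, AllOnesDefault), "a field is waiting");

// With a default that is not all ones, comparing against 0xffff would
// misclassify the default encoding as a real wait; the masked check does not.
static_assert(isWaitDepCtrNoop(0xff9f, SparseDefault), "default is a no-op");
static_assert(!isWaitDepCtrNoop(0xff1f, SparseDefault), "a field is waiting");
```

With something along these lines, the condition in the hunk above would become `if (!isWaitDepCtrNoop(Enc, D))` rather than `if (Enc != 0xffff)`.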

https://github.com/llvm/llvm-project/pull/170319
