[llvm] [AMDGPU] Add support for GFX12 expert scheduling mode 2 (PR #170319)

Mon Dec 8 07:15:57 PST 2025

================
@@ -1648,6 +1742,24 @@ bool WaitcntGeneratorGFX12Plus::applyPreexistingWaitcnt(
     // Merge consecutive waitcnt of the same type by erasing multiples.
     if (!*UpdatableInstr) {
       *UpdatableInstr = &II;
+    } else if (Opcode == AMDGPU::S_WAITCNT_DEPCTR) {
+      // S_WAITCNT_DEPCTR requires special care. Don't remove a
+      // duplicate if it is waiting on things other than VA_VDST or
+      // VM_VSRC. If that is the case, just make sure the VA_VDST and
+      // VM_VSRC subfields of the operand are set to the "no wait"
+      // values.
+
+      unsigned Enc = TII->getNamedOperand(II, AMDGPU::OpName::simm16)->getImm();
+      Enc = AMDGPU::DepCtr::encodeFieldVmVsrc(Enc, ~0u);
+      Enc = AMDGPU::DepCtr::encodeFieldVaVdst(Enc, ~0u);
+
+      if (Enc != 0xffff) {
----------------
stepthomas wrote:

Yes. As far as I can tell, there is no legitimate way for an `s_wait_alu` instruction to appear in the code unless one of our passes created it, which means operand encodings should all be well-formed, with any reserved bits in a defined state.
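
To make the dependence on reserved bits concrete, here is a minimal standalone sketch of the check in the hunk above. It is not the in-tree code: the shift and width constants and the helper names (`fieldMask`, `setNoWait`, `waitsOnOtherFields`) are assumptions made for illustration, standing in for `AMDGPU::DepCtr::encodeFieldVmVsrc` and `encodeFieldVaVdst` called with `~0u`.

```cpp
#include <cstdint>

// Assumed field layout, for illustration only; the authoritative values live
// in the AMDGPU backend's DepCtr helpers.
constexpr unsigned VmVsrcShift = 2, VmVsrcWidth = 3;
constexpr unsigned VaVdstShift = 12, VaVdstWidth = 4;

// Mask covering a bit field of the given width at the given shift.
constexpr unsigned fieldMask(unsigned Shift, unsigned Width) {
  return ((1u << Width) - 1u) << Shift;
}

// Force a field to its all-ones "no wait" value (hypothetical helper).
constexpr unsigned setNoWait(unsigned Enc, unsigned Shift, unsigned Width) {
  return Enc | fieldMask(Shift, Width);
}

// True if the 16-bit operand still differs from 0xffff once VM_VSRC and
// VA_VDST are ignored, i.e. the duplicate must be kept.
constexpr bool waitsOnOtherFields(unsigned Enc) {
  Enc = setNoWait(Enc, VmVsrcShift, VmVsrcWidth);
  Enc = setNoWait(Enc, VaVdstShift, VaVdstWidth);
  return Enc != 0xffff;
}

// Only VA_VDST is waited on: the duplicate is removable.
static_assert(!waitsOnOtherFields(0x0fff), "VA_VDST-only wait");
// A cleared bit outside the two fields keeps the instruction alive.
static_assert(waitsOnOtherFields(0xffdf), "bit 5 cleared");
```

The second assertion is the case the question is about: under this layout, any bit outside the two fields that is not 1 makes the comparison against `0xffff` fail, so the check is only reliable if every encoding reaching this point has those bits set.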

https://github.com/llvm/llvm-project/pull/170319