[llvm-branch-commits] [llvm] [AMDGPU] Add HWUI pressure heuristics to coexec strategy (PR #184929)
Austin Kerbow via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Mar 9 22:52:30 PDT 2026
================
@@ -41,6 +41,370 @@ static SUnit *pickOnlyChoice(SchedBoundary &Zone) {
   return OnlyChoice;
 }
+InstructionFlavor llvm::classifyFlavor(const MachineInstr *MI,
+                                       const SIInstrInfo *SII) {
+  if (!MI || MI->isDebugInstr())
+    return InstructionFlavor::Other;
+
+  unsigned Opc = MI->getOpcode();
+
+  // Check for specific opcodes first.
+  if (Opc == AMDGPU::ATOMIC_FENCE || Opc == AMDGPU::S_WAIT_ASYNCCNT ||
+      Opc == AMDGPU::S_WAIT_TENSORCNT || Opc == AMDGPU::S_BARRIER_WAIT ||
+      Opc == AMDGPU::S_BARRIER_SIGNAL_IMM)
+    return InstructionFlavor::Fence;
+
+  if (Opc == AMDGPU::TENSOR_LOAD_TO_LDS_D2 ||
+      Opc == AMDGPU::TENSOR_LOAD_TO_LDS ||
+      Opc == AMDGPU::GLOBAL_LOAD_ASYNC_TO_LDS_B32 ||
+      Opc == AMDGPU::GLOBAL_LOAD_ASYNC_TO_LDS_B32_SADDR)
+    return InstructionFlavor::DMA;
+
+  if (SII->isMFMAorWMMA(*MI))
+    return InstructionFlavor::WMMA;
+
+  if (SII->isTRANS(*MI))
+    return InstructionFlavor::TRANS;
+
+  if (SII->isVALU(*MI))
+    return InstructionFlavor::SingleCycleVALU;
----------------
kerbowa wrote:
The SchedModel mostly captures dependent latency rather than repeat rate, and repeat rate is what "single cycle" refers to in this example.
It's something we need to consider carefully, but downstream we've moved away from relying exclusively on the SchedModel, since it cannot effectively model context-dependent structural stalls. I think this will become clearer in subsequent PRs.
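
To illustrate the distinction between dependent latency and repeat rate, here is a minimal standalone sketch. It is a toy model only: `ToyInstr`, `dependentChainCycles`, and `independentStreamCycles` are hypothetical names made up for this example, the cycle counts are invented, and none of it is part of LLVM, the SchedModel, or this PR.

```cpp
// Hypothetical toy model (NOT LLVM's scheduling model).
struct ToyInstr {
  unsigned Latency;     // cycles until a dependent instruction can read the result
  unsigned IssueCycles; // cycles the unit is busy before it accepts the next
                        // instruction (the "repeat rate")
};

// N instructions where each one consumes the previous result. Consecutive
// issues are spaced by whichever is larger, Latency or IssueCycles; the last
// result is ready Latency cycles after its issue.
constexpr unsigned dependentChainCycles(ToyInstr I, unsigned N) {
  return N == 0 ? 0
                : (N - 1) * (I.Latency > I.IssueCycles ? I.Latency
                                                       : I.IssueCycles) +
                      I.Latency;
}

// N independent instructions on one unit. Consecutive issues are spaced only
// by the repeat rate; latency is paid once, for the last instruction.
constexpr unsigned independentStreamCycles(ToyInstr I, unsigned N) {
  return N == 0 ? 0 : (N - 1) * I.IssueCycles + I.Latency;
}

// Assumed numbers: same latency (4), different repeat rates (1 vs 4).
static_assert(dependentChainCycles(ToyInstr{4, 1}, 4) == 16, "latency bound");
static_assert(independentStreamCycles(ToyInstr{4, 1}, 4) == 7, "single cycle");
static_assert(independentStreamCycles(ToyInstr{4, 4}, 4) == 16, "multi cycle");
```

In this toy model, two instructions with identical latency look the same to a latency-only view, yet a stream of independent ones finishes in 7 versus 16 cycles depending on the repeat rate. That is the property a name like "single cycle" would describe.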
https://github.com/llvm/llvm-project/pull/184929