[llvm] [AMDGPU] Add DS loop waitcnt optimization for GFX12+ (PR #172728)
Sameer Sahasrabuddhe via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 23 20:10:26 PST 2025
================
@@ -2715,51 +2732,105 @@ bool SIInsertWaitcnts::isVMEMOrFlatVMEM(const MachineInstr &MI) const {
return SIInstrInfo::isVMEM(MI);
}
-// Return true if it is better to flush the vmcnt counter in the preheader of
-// the given loop. We currently decide to flush in two situations:
+bool SIInsertWaitcnts::isDSRead(const MachineInstr &MI) const {
+ return SIInstrInfo::isDS(MI) && MI.mayLoad() && !MI.mayStore();
+}
+
+// Check if instruction may store to LDS (including DS stores, atomics,
+// FLAT instructions that may access LDS, and LDS DMA).
+bool SIInsertWaitcnts::mayStoreLDS(const MachineInstr &MI) const {
----------------
ssahasra wrote:
I think the intended meaning here is "MI is an instruction that increments DS_CNT". Such instructions are only a subset of the instructions that actually write to LDS. For example, DMA operations write to LDS but do not increment DS_CNT; they are ordered using LOAD_CNT on pre-GFX12 and ASYNC_CNT on GFX12-plus.
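
To illustrate the distinction, here is a rough, self-contained sketch. It does not use the real LLVM API: `ToyInst`, `Gen`, `Counter`, `ldsWriteCounter` and `incrementsDSCnt` are all hypothetical names made up for this example, and the model only covers the two cases mentioned above (ordinary LDS writes and LDS DMA).

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; none of these are LLVM types.
enum class Gen { PreGFX12, GFX12Plus };
enum class Counter { DS_CNT, LOAD_CNT, ASYNC_CNT };

struct ToyInst {
  bool WritesLDS; // the instruction may write to LDS
  bool IsLDSDMA;  // the write is performed by an LDS DMA operation
};

// Which counter orders an LDS write in this toy model.
// Assumption taken from the comment above: DMA writes use LOAD_CNT on
// pre-GFX12 and ASYNC_CNT on GFX12-plus; everything else uses DS_CNT.
constexpr Counter ldsWriteCounter(const ToyInst &I, Gen G) {
  if (!I.IsLDSDMA)
    return Counter::DS_CNT;
  return G == Gen::PreGFX12 ? Counter::LOAD_CNT : Counter::ASYNC_CNT;
}

// "Increments DS_CNT" is narrower than "may store to LDS".
constexpr bool incrementsDSCnt(const ToyInst &I, Gen G) {
  return I.WritesLDS && ldsWriteCounter(I, G) == Counter::DS_CNT;
}

// Small self-check: a DMA write stores to LDS yet is not tracked by DS_CNT.
inline void selfCheck() {
  constexpr ToyInst DSStore{true, false};
  constexpr ToyInst DMAWrite{true, true};
  assert(incrementsDSCnt(DSStore, Gen::GFX12Plus));
  assert(DMAWrite.WritesLDS && !incrementsDSCnt(DMAWrite, Gen::GFX12Plus));
}
```

In this model, a predicate named after "may store to LDS" would return true for `DMAWrite`, while one named after "increments DS_CNT" would not, which is why the name and comment should match whichever meaning the pass actually relies on.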
https://github.com/llvm/llvm-project/pull/172728