[llvm] [AMDGPU] Add DS loop wait optimization infrastructure (1/4) (PR #171942)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Sat Dec 13 02:51:36 PST 2025


================
@@ -2786,6 +2892,22 @@ bool SIInsertWaitcnts::run(MachineFunction &MF) {
   assert(NumSGPRsMax <= SQ_MAX_PGM_SGPRS);
 
   BlockInfos.clear();
+  LoopDSWaitOptCache.clear();
+
+  // Analyze single-block loops for DS wait optimization (GFX12+)
+  if (OptimizeDSLoopWaitcnt && ST->hasExtendedWaitCounts()) {
+    SmallVector<MachineLoop *, 4> Worklist(MLI->begin(), MLI->end());
+    while (!Worklist.empty()) {
+      MachineLoop *ML = Worklist.pop_back_val();
+      auto BeginIt = ML->getSubLoops().begin();
+      auto EndIt = ML->getSubLoops().end();
+      if (BeginIt == EndIt) // innermost loop only
----------------
arsenm wrote:

```suggestion
      if (BeginIt == EndIt) // Innermost loop only.
```

https://github.com/llvm/llvm-project/pull/171942

