[llvm] [AMDGPU] Extend DS loop wait optimization with flush point tracking (PR #175658)

via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 19 23:46:18 PST 2026


================
@@ -3382,6 +3384,17 @@ SIInsertWaitcnts::getPreheaderFlushFlags(MachineLoop *ML,
   DenseSet<MCRegUnit> VgprDefVMEM;
   DenseSet<MCRegUnit> VgprDefDS;
 
+  // Track DS loads for prefetch pattern with flush points (single-block only).
+  // Keeps track of the last DS load (position counted from the top of the loop)
+  // to each VGPR. Load is considered consumed (and thus needs flushing) if
+  // the loaded register has a use or is overwritten (by any later operations).
+  DenseMap<MCRegUnit, unsigned> DSLoadPosition;
+  bool IsSingleBlock = ML->getNumBlocks() == 1;
+  bool FlushPointTrackingInvalid =
----------------
vporpo wrote:

nit: Since there are quite a few double negatives below, like !FlushPointTrackingInvalid, perhaps it would be better to use FlushPointTrackingValid instead?
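To illustrate the readability point, here is a minimal sketch of the positive-flag form. It is not the PR's code: `FlushTracking`, `invalidate`, and `canUseFlushPoints` are hypothetical names, and combining the flag with `IsSingleBlock` is an assumption based only on the "single-block only" comment in the hunk.

```cpp
// Hypothetical stand-in for the loop-scan state; not the PR's actual code.
struct FlushTracking {
  // Positive flag, as suggested: starts true and is cleared on failure.
  bool Valid = true;

  // Called when the scan sees something that defeats tracking.
  // Which events qualify is an assumption left to the caller.
  void invalidate() { Valid = false; }
};

// With a negative flag this condition would read
//   IsSingleBlock && !T.Invalid
// The positive flag removes the negation at every use site.
bool canUseFlushPoints(bool IsSingleBlock, const FlushTracking &T) {
  return IsSingleBlock && T.Valid;
}
```

The trade-off is that the flag's initial value and every assignment flip polarity, so each write site has to be inverted along with the reads.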

https://github.com/llvm/llvm-project/pull/175658