[llvm] [AMDGPU] Extend DS loop wait optimization with flush point tracking (PR #175658)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 19 23:46:20 PST 2026
================
@@ -3382,6 +3384,17 @@ SIInsertWaitcnts::getPreheaderFlushFlags(MachineLoop *ML,
DenseSet<MCRegUnit> VgprDefVMEM;
DenseSet<MCRegUnit> VgprDefDS;
+ // Track DS loads for prefetch pattern with flush points (single-block only).
+ // Keeps track of the last DS load (position counted from the top of the loop)
+ // to each VGPR. Load is considered consumed (and thus needs flushing) if
+ // the loaded register has a use or is overwritten (by any later operations).
+ DenseMap<MCRegUnit, unsigned> DSLoadPosition;
+ bool IsSingleBlock = ML->getNumBlocks() == 1;
+ bool FlushPointTrackingInvalid =
+ !ST->hasExtendedWaitCounts() || !IsSingleBlock;
+ unsigned DSLoadCount = 0;
----------------
vporpo wrote:
This counter is used as the "position" of the DS load, so perhaps rename it to something that includes the word "position" (just like LastFlushedPosition) to keep the naming consistent?
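
To show why the counter reads as a position, here is a minimal standalone sketch of the scheme described in the quoted comment. It is not the code from the patch: `ToyInst`, `TrackingResult`, `trackDSLoads`, and `NextDSLoadPosition` are hypothetical names, `std::unordered_map` stands in for `DenseMap`, plain integers stand in for `MCRegUnit`, and the 1-based numbering and the way LastFlushedPosition is updated are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one instruction of a single-block loop.
struct ToyInst {
  bool IsDSLoad;         // true if the instruction is a DS load
  int Def;               // register written, or -1 if none
  std::vector<int> Uses; // registers read
};

struct TrackingResult {
  unsigned NumDSLoads;          // DS loads seen, i.e. the last position handed out
  unsigned LastFlushedPosition; // highest position of a consumed load, 0 if none
};

// Walks the loop body once from the top. Each DS load gets the next position,
// and a load counts as consumed when its register is later used or overwritten.
TrackingResult trackDSLoads(const std::vector<ToyInst> &Loop) {
  std::unordered_map<int, unsigned> DSLoadPosition; // register -> last load position
  unsigned NextDSLoadPosition = 0; // assumed 1-based so that 0 means "none"
  unsigned LastFlushedPosition = 0;

  auto Consume = [&](int Reg) {
    auto It = DSLoadPosition.find(Reg);
    if (It == DSLoadPosition.end())
      return; // register does not hold a pending DS load
    LastFlushedPosition = std::max(LastFlushedPosition, It->second);
    DSLoadPosition.erase(It);
  };

  for (const ToyInst &I : Loop) {
    for (int Reg : I.Uses)
      Consume(Reg); // a use consumes the pending load
    if (I.Def >= 0)
      Consume(I.Def); // an overwrite consumes it as well
    if (I.IsDSLoad && I.Def >= 0)
      DSLoadPosition[I.Def] = ++NextDSLoadPosition;
  }
  return {NextDSLoadPosition, LastFlushedPosition};
}
```

In this sketch the value stored per register and the value compared against LastFlushedPosition both come from the same counter, which is the reason a name containing "position" would fit it.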
https://github.com/llvm/llvm-project/pull/175658