[llvm] [AMDGPU][SIInsertWaitCnts] Gfx12.5 - Refactor xcnt optimization (PR #164357)
Ryan Mitchell via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 4 12:04:11 PST 2025
================
@@ -2160,19 +2168,11 @@ bool SIInsertWaitcnts::generateWaitcnt(AMDGPU::Waitcnt Wait,
<< "Update Instr: " << *It);
}
-  // XCnt may be already consumed by a load wait.
-  if (Wait.XCnt != ~0u) {
-    if (Wait.KmCnt == 0 && !ScoreBrackets.hasPendingEvent(SMEM_GROUP))
-      Wait.XCnt = ~0u;
-
-    if (Wait.LoadCnt == 0 && !ScoreBrackets.hasPendingEvent(VMEM_GROUP))
-      Wait.XCnt = ~0u;
-
-    // Since the translation for VMEM addresses occur in-order, we can skip the
-    // XCnt if the current instruction is of VMEM type and has a memory
-    // dependency with another VMEM instruction in flight.
-    if (isVmemAccess(*It))
-      Wait.XCnt = ~0u;
+  // Since the translation for VMEM addresses occur in-order, we can skip the
+  // XCnt if the current instruction is of VMEM type and has a memory
+  // dependency with another VMEM instruction in flight.
+  if (Wait.XCnt != ~0u && isVmemAccess(*It)) {
+    Wait.XCnt = ~0u;
----------------
RyanRio wrote:
Moved to generateWaitcntInstBefore; this aligns with the current code much better.
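
For readers following the hunk, the three checks removed above can be modeled roughly as the standalone sketch below. It is only an illustration of the removed logic, not the code that now lives in generateWaitcntInstBefore. `WaitSketch`, `PendingSketch`, `NoWait`, and `dropRedundantXCnt` are hypothetical names invented here; the two booleans stand in for `ScoreBrackets.hasPendingEvent(SMEM_GROUP)` and `ScoreBrackets.hasPendingEvent(VMEM_GROUP)`, and `InstIsVmemAccess` stands in for `isVmemAccess(*It)`.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are NOT the real LLVM types.
constexpr unsigned NoWait = ~0u; // same "no wait required" sentinel as in the hunk

struct WaitSketch {
  unsigned LoadCnt = NoWait;
  unsigned KmCnt = NoWait;
  unsigned XCnt = NoWait;
};

struct PendingSketch {
  bool SmemGroup = false; // models hasPendingEvent(SMEM_GROUP)
  bool VmemGroup = false; // models hasPendingEvent(VMEM_GROUP)
};

// Mirrors the removed checks: clear the XCnt wait when another wait already
// covers it, or when the current instruction is itself a VMEM access.
inline void dropRedundantXCnt(WaitSketch &Wait, const PendingSketch &Pending,
                              bool InstIsVmemAccess) {
  if (Wait.XCnt == NoWait)
    return; // nothing to drop

  // A full KmCnt wait with no SMEM events pending already covers XCnt.
  if (Wait.KmCnt == 0 && !Pending.SmemGroup)
    Wait.XCnt = NoWait;

  // Likewise for a full LoadCnt wait with no VMEM events pending.
  if (Wait.LoadCnt == 0 && !Pending.VmemGroup)
    Wait.XCnt = NoWait;

  // VMEM address translation is in-order, so a VMEM instruction can skip it.
  if (InstIsVmemAccess)
    Wait.XCnt = NoWait;
}
```

After this PR only the last of the three checks remains in generateWaitcnt; the first two are the part that moved.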
https://github.com/llvm/llvm-project/pull/164357