[llvm] [AMDGPU] Fix Xcnt handling between blocks (PR #165201)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 3 03:11:29 PST 2025
================
@@ -1288,18 +1288,38 @@ void WaitcntBrackets::applyWaitcnt(InstCounterType T, unsigned Count) {
}
void WaitcntBrackets::applyXcnt(const AMDGPU::Waitcnt &Wait) {
+ // On entry to a block with multiple predecessors, there may
+ // be pending SMEM and VMEM events active at the same time.
+ // In such cases, only clear one active event at a time.
+ auto applyPendingXcntGroup = [this](unsigned E) {
+ unsigned LowerBound = getScoreLB(X_CNT);
+ applyWaitcnt(X_CNT, 0);
+ PendingEvents |= (1 << E);
+ setScoreLB(X_CNT, LowerBound);
----------------
jayfoad wrote:
This seems like a very complicated way to write `PendingEvents &= ~(1 << E)` (where E is the _other_ SMEM/VMEM event type). Can you simplify and inline this?
https://github.com/llvm/llvm-project/pull/165201
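
To make the suggestion concrete, here is a minimal, self-contained sketch comparing the two forms on a toy model. It is not the real `WaitcntBrackets` class: `ToyBrackets`, `ToyEvent`, `SMEM_GROUP`, `VMEM_GROUP`, and all function names below are hypothetical, and the sketch assumes that `applyWaitcnt(X_CNT, 0)` raises the lower bound to the upper bound and drops every pending X_CNT event. It also assumes the event being kept is already pending, since the lambda sets that bit unconditionally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two event kinds that share X_CNT.
enum ToyEvent : unsigned { SMEM_GROUP = 0, VMEM_GROUP = 1 };

// Toy model of the bracket state; not the real WaitcntBrackets class.
struct ToyBrackets {
  uint32_t PendingEvents = 0;
  unsigned ScoreLB = 0;
  unsigned ScoreUB = 0;

  // Assumed behaviour of applyWaitcnt(X_CNT, 0): raise the lower bound to
  // the upper bound and drop every pending X_CNT event.
  void applyWaitcntZero() {
    ScoreLB = ScoreUB;
    PendingEvents &= ~((1u << SMEM_GROUP) | (1u << VMEM_GROUP));
  }

  // Shape of the lambda in the patch: clear everything, then put the kept
  // event and the saved lower bound back.
  void clearViaRoundTrip(unsigned Keep) {
    // The two forms only agree if the kept event was pending to begin with.
    assert((PendingEvents & (1u << Keep)) != 0);
    unsigned LowerBound = ScoreLB;
    applyWaitcntZero();
    PendingEvents |= (1u << Keep);
    ScoreLB = LowerBound;
  }

  // Shape of the reviewer's suggestion: mask out the other event directly.
  void clearViaMask(unsigned Other) { PendingEvents &= ~(1u << Other); }
};

// Compares every field of the toy state.
bool sameState(const ToyBrackets &A, const ToyBrackets &B) {
  return A.PendingEvents == B.PendingEvents && A.ScoreLB == B.ScoreLB &&
         A.ScoreUB == B.ScoreUB;
}

// Starts from a state with both events pending, applies one form to each
// copy, and reports whether the resulting states match.
bool formsAgreeWhenBothPending() {
  ToyBrackets A;
  A.PendingEvents = (1u << SMEM_GROUP) | (1u << VMEM_GROUP);
  A.ScoreLB = 2;
  A.ScoreUB = 5;
  ToyBrackets B = A;
  A.clearViaRoundTrip(SMEM_GROUP); // keep SMEM, drop VMEM
  B.clearViaMask(VMEM_GROUP);      // drop VMEM directly
  return sameState(A, B);
}
```

Under these assumptions both paths leave the same pending mask and the same score bounds, which is the point of the review comment. Whether the real `applyWaitcnt` has other side effects that matter here is for the patch author to confirm.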