[llvm] [AMDGPU] Fix Xcnt handling between blocks (PR #165201)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 3 03:11:29 PST 2025


================
@@ -1288,18 +1288,38 @@ void WaitcntBrackets::applyWaitcnt(InstCounterType T, unsigned Count) {
 }
 
 void WaitcntBrackets::applyXcnt(const AMDGPU::Waitcnt &Wait) {
+  // On entry to a block with multiple predecessors, there may
+  // be pending SMEM and VMEM events active at the same time.
+  // In such cases, only clear one active event at a time.
+  auto applyPendingXcntGroup = [this](unsigned E) {
+    unsigned LowerBound = getScoreLB(X_CNT);
+    applyWaitcnt(X_CNT, 0);
+    PendingEvents |= (1 << E);
+    setScoreLB(X_CNT, LowerBound);
----------------
jayfoad wrote:

This seems like a very complicated way to write `PendingEvents &= ~(1 << E)` (where E is the _other_ SMEM/VMEM event type). Can you simplify and inline this?
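
To illustrate the suggestion, here is a minimal standalone sketch of the two forms on a plain bitmask. It is not the code from the patch: the `ToyEvent` enum, its values, and both helper functions are hypothetical names invented for this example, and it assumes both group events are pending on entry, which is the case the comment in the hunk describes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical event indices, for illustration only. The real event
// enumerators are defined in the AMDGPU backend and are not reproduced here.
enum ToyEvent : unsigned { TOY_SMEM_GROUP = 0, TOY_VMEM_GROUP = 1, TOY_OTHER = 2 };

// Direct form: clear only the bit for event E, leaving all other bits as is.
inline uint32_t clearPendingEvent(uint32_t PendingEvents, unsigned E) {
  return PendingEvents & ~(1u << E);
}

// Roundabout form: drop both group bits, then set the one to keep again.
// This matches the direct form only when the kept event was already pending.
inline uint32_t clearAllThenRestore(uint32_t PendingEvents, unsigned Keep) {
  const uint32_t GroupMask = (1u << TOY_SMEM_GROUP) | (1u << TOY_VMEM_GROUP);
  PendingEvents &= ~GroupMask;   // clear both group events
  PendingEvents |= (1u << Keep); // restore the event that should stay pending
  return PendingEvents;
}
```

With both group bits set, keeping `TOY_VMEM_GROUP` through the roundabout form gives the same mask as clearing `TOY_SMEM_GROUP` directly, and unrelated bits such as `TOY_OTHER` are untouched in either case.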

https://github.com/llvm/llvm-project/pull/165201

