[PATCH] D154480: [AMDGPU] Flush vmcnt with any loop extraneous defs

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 5 10:14:43 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1770
+  // loop.
+  DenseSet<Register> LoopExtraneousDefs;
 
----------------
How does this set distinguish subregister and full register defs?
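
One way to read the concern: a set keyed only on the register records a def of part of a register and a def of the whole register as the same entry. The following is a minimal sketch under stated assumptions, not code from the patch. `Reg`, `LaneMask`, `RegDef` and both helper functions are hypothetical stand-ins built on standard containers, not LLVM types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only (not LLVM types).
using Reg = unsigned;
using LaneMask = std::uint64_t;

struct RegDef {
  Reg R;          // register being defined
  LaneMask Lanes; // lanes of R written by this def (all lanes for a full def)
};

// Keyed on the register alone: a partial def and a full def of the same
// register collapse into a single entry.
inline std::size_t countByRegister(const std::vector<RegDef> &Defs) {
  std::set<Reg> Seen;
  for (const RegDef &D : Defs)
    Seen.insert(D.R);
  return Seen.size();
}

// Keyed on (register, lanes): partial and full defs stay distinct.
inline std::size_t countByRegisterAndLanes(const std::vector<RegDef> &Defs) {
  std::set<std::pair<Reg, LaneMask>> Seen;
  for (const RegDef &D : Defs)
    Seen.insert({D.R, D.Lanes});
  return Seen.size();
}
```

With one partial def and one full def of the same register, the first helper sees a single entry and the second sees two. Whether the pass needs that distinction is the open question in the comment above.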


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1805
     return true;
-  return HasVMemLoad && UsesVgprLoadedOutside;
+  return LoopExtraneousDefs.size() && UsesVgprLoadedOutside;
 }
----------------
.empty()?
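
Presumably the suggestion is to test emptiness directly rather than rely on the implicit conversion of `size()` to `bool`. A minimal sketch, with `std::unordered_set` standing in for `DenseSet` (both provide `empty()` and `size()`). The function name `shouldFlush` and its parameters are hypothetical and only mirror the quoted return statement.

```cpp
#include <cassert>
#include <unordered_set>

using Register = unsigned; // stand-in for llvm::Register, for illustration

// Mirrors the quoted return statement, with the emptiness check spelled out.
inline bool shouldFlush(const std::unordered_set<Register> &LoopExtraneousDefs,
                        bool UsesVgprLoadedOutside) {
  return !LoopExtraneousDefs.empty() && UsesVgprLoadedOutside;
}
```

The behavior is the same as with `.size()`, but the intent ("is there at least one such def?") is stated explicitly.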


================
Comment at: llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir:788
+# GFX9-LABEL: bb.1:
+# GFX9-NOT: S_WAITCNT 39
+# GFX9-LABEL: bb.2:
----------------
-NEXT is much better than -NOT


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154480/new/

https://reviews.llvm.org/D154480


