[PATCH] D154480: [AMDGPU] Flush vmcnt with any loop extraneous defs
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 5 10:14:43 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1770
+ // loop.
+ DenseSet<Register> LoopExtraneousDefs;
----------------
How does this set distinguish subregister and full-register defs?
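
One way to read this concern, as a minimal standalone sketch rather than the patch's code: for physical registers a tuple has its own register number, so a set keyed by register number alone does not see that a def of the tuple also covers its components. The register ids, the lanes() table, and both helper names below are hypothetical, and std::set stands in for DenseSet<Register> to avoid LLVM headers.

```cpp
#include <set>
#include <utility>

// Hypothetical register ids. V0_V1 models a 64-bit tuple that has its own
// number, distinct from its 32-bit components V0 and V1.
enum Reg : unsigned { V0 = 1, V1 = 2, V0_V1 = 3 };

// Hypothetical table: inclusive range of 32-bit lanes covered by a register.
std::pair<unsigned, unsigned> lanes(Reg R) {
  switch (R) {
  case V0:
    return {0u, 0u};
  case V1:
    return {1u, 1u};
  case V0_V1:
    return {0u, 1u};
  }
  return {0u, 0u};
}

// Exact-id lookup, as a plain set of register numbers would perform it.
bool containsExact(const std::set<Reg> &Defs, Reg Use) {
  return Defs.count(Use) != 0;
}

// Overlap-aware lookup: true if any recorded def shares a lane with Use.
bool containsOverlapping(const std::set<Reg> &Defs, Reg Use) {
  const std::pair<unsigned, unsigned> U = lanes(Use);
  for (Reg D : Defs) {
    const std::pair<unsigned, unsigned> L = lanes(D);
    if (L.first <= U.second && U.first <= L.second)
      return true;
  }
  return false;
}
```

With only V0_V1 recorded as a def, containsExact misses a later use of V0, while containsOverlapping finds it.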
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1805
return true;
- return HasVMemLoad && UsesVgprLoadedOutside;
+ return LoopExtraneousDefs.size() && UsesVgprLoadedOutside;
}
----------------
Use !LoopExtraneousDefs.empty() instead of .size()?
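
A minimal sketch of the suggested form, assuming the intent is an explicit emptiness test in place of the implicit integer-to-bool conversion of size(). The helper name shouldFlush is hypothetical, and std::set<unsigned> stands in for DenseSet<Register> so the snippet compiles without LLVM headers.

```cpp
#include <set>

// Stand-in for the pass's DenseSet<Register>; an assumption made only to
// keep this sketch self-contained.
using RegSet = std::set<unsigned>;

// Hypothetical helper mirroring the return statement under review.
bool shouldFlush(const RegSet &LoopExtraneousDefs, bool UsesVgprLoadedOutside) {
  // !empty() states the intent directly; size() would first be converted
  // from an integer to bool.
  return !LoopExtraneousDefs.empty() && UsesVgprLoadedOutside;
}
```

The result is the same as with size(); the difference is readability.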
================
Comment at: llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir:788
+# GFX9-LABEL: bb.1:
+# GFX9-NOT: S_WAITCNT 39
+# GFX9-LABEL: bb.2:
----------------
-NEXT is much better than -NOT
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154480/new/
https://reviews.llvm.org/D154480