[PATCH] D130313: [AMDGPU] Avoid flushing the vmcnt counter in loop preheaders if not necessary
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 22 05:34:32 PDT 2022
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1733
// Vgpr use
if (Op.isUse()) {
for (int RegNo = Interval.first; RegNo < Interval.second; ++RegNo) {
----------------
Maybe use `.readsReg()` here? See comments about `undef` in the test case. (Note that readsReg is not mutually exclusive with isDef, since a subreg def operand both reads and writes the undeerlying super reg. So you might also need to remove the "else" on line 1752.)
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1742
// value is likely loaded outside of the loop.
- if (Brackets.getRegScore(RegNo, VM_CNT) > 0) {
+ unsigned Score = Brackets.getRegScore(RegNo, VM_CNT);
+ AMDGPU::Waitcnt Wait;
----------------
Maybe you just need: `if (Brackets.getRegScore(RegNo, VM_CNT) > Brackets.getScoreLB(VM_CNT))`?
================
Comment at: llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir:570
+
+ renamable $vgpr5 = BUFFER_LOAD_DWORD_OFFEN undef renamable $vgpr0, renamable $sgpr4_sgpr5_sgpr6_sgpr7, 0, 0, 0, 0, 0, implicit $exec
+ S_CBRANCH_SCC1 %bb.1, implicit killed $scc
----------------
It is weird to have `undef` on all the use operands in this test cause that means the instruction doesn't care about the value it reads - so really no waitcnt is required before any of those reads.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130313/new/
https://reviews.llvm.org/D130313
More information about the llvm-commits
mailing list