[PATCH] D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 1 05:02:38 PDT 2022
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:942
/// The "score bracket" is bound by the lower bound and upper bound
/// scores (*_score_LB and *_score_ub respectively).
+bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
----------------
Document the new FlushVmCnt argument here.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1629
+ if (Block.getFirstTerminator() == Block.end() &&
+ isPreheaderToFlush(Block, ScoreBrackets))
----------------
It is a shame that you have to implement this in two places, for blocks with and without terminators. I'm not sure if there is a better way. Maybe generateWaitcntInstBefore could be changed to take an iterator (which is allowed to be `end()`) instead of MI, so you would not need the new function generateWaitcntBlockEnd. But that would be quite invasive.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1649
+ // A preheader has at least one successor.
+ if (MBB.succ_begin() == MBB.succ_end())
+ return UpdateCache(false);
----------------
Could maybe simplify this by adding MachineBasicBlock::getUniqueSuccessor, like we have for IR BasicBlock. (I think a preheader has *exactly* one successor, correct?)
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1657-1658
+ return UpdateCache(false);
+ while (MachineLoop *Parent = Loop->getParentLoop())
+ Loop = Parent;
+
----------------
I don't quite understand the logic here. Why do you only check whether MBB is the preheader of the outermost loop? Could it not be the preheader of one of the inner loops?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115747/new/
https://reviews.llvm.org/D115747
More information about the llvm-commits
mailing list