[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 1 05:03:16 PST 2023
================
@@ -1849,28 +1850,35 @@ bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
BlockInfos.clear();
bool Modified = false;
- if (!MFI->isEntryFunction()) {
- // Wait for any outstanding memory operations that the input registers may
- // depend on. We can't track them and it's better to do the wait after the
- // costly call sequence.
-
- // TODO: Could insert earlier and schedule more liberally with operations
- // that only use caller preserved registers.
- MachineBasicBlock &EntryBB = MF.front();
- MachineBasicBlock::iterator I = EntryBB.begin();
- for (MachineBasicBlock::iterator E = EntryBB.end();
- I != E && (I->isPHI() || I->isMetaInstruction()); ++I)
- ;
- BuildMI(EntryBB, I, DebugLoc(), TII->get(AMDGPU::S_WAITCNT)).addImm(0);
-
- Modified = true;
- }
-
// Keep iterating over the blocks in reverse post order, inserting and
// updating s_waitcnt where needed, until a fix point is reached.
for (auto *MBB : ReversePostOrderTraversal<MachineFunction *>(&MF))
BlockInfos.insert({MBB, BlockInfo()});
+ if (!MFI->isEntryFunction()) {
+ MachineBasicBlock &EntryBB = MF.front();
+ BlockInfo &EntryBI = BlockInfos.find(&EntryBB)->second;
+ EntryBI.Incoming = std::make_unique<WaitcntBrackets>(ST, Limits, Encoding);
+ WaitcntBrackets &Brackets = *EntryBI.Incoming;
+
+ unsigned UB = 0;
+ for (auto T : inst_counter_types())
+ UB = std::max(UB, Brackets.getWaitCountMax(T));
+
+ for (auto T : inst_counter_types())
+ Brackets.setScoreUB(T, UB);
----------------
rovka wrote:
Why do we need to use the max over all inst_counter_types, rather than setting the UB for each T with its own limit (i.e. `Brackets.setScoreUB(T, Brackets.getWaitCountMax(T))`)?
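
To make the two alternatives in this question concrete, here is a minimal standalone sketch. It does not use the real `WaitcntBrackets` class: `ToyCounter`, `ToyLimits`, `sharedUpperBound` and `perCounterUpperBound` are hypothetical names, and it only models the values that would be passed to `setScoreUB`, not what the pass does with them afterwards.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Toy stand-in for the pass's counter kinds. This is NOT the LLVM
// InstCounterType enum; the names are assumptions for illustration.
enum ToyCounter : std::size_t { ToyVM, ToyLGKM, ToyEXP, ToyVS, NumToyCounters };

// One value per counter: either a hardware limit or a score upper bound.
using ToyLimits = std::array<unsigned, NumToyCounters>;

// Variant in the patch: every counter gets the same upper bound,
// namely the largest limit across all counter types.
inline ToyLimits sharedUpperBound(const ToyLimits &Max) {
  ToyLimits UB{};
  UB.fill(*std::max_element(Max.begin(), Max.end()));
  return UB;
}

// Variant asked about in the review: each counter gets its own limit
// as its upper bound.
inline ToyLimits perCounterUpperBound(const ToyLimits &Max) {
  return Max;
}
```

With illustrative limits `{63, 15, 7, 63}` (made-up inputs, not taken from the patch), `sharedUpperBound` assigns 63 to every counter, while `perCounterUpperBound` keeps 7 for `ToyEXP`.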
https://github.com/llvm/llvm-project/pull/73122