[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)

via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 1 05:03:16 PST 2023


================
@@ -1849,28 +1850,35 @@ bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
   BlockInfos.clear();
   bool Modified = false;
 
-  if (!MFI->isEntryFunction()) {
-    // Wait for any outstanding memory operations that the input registers may
-    // depend on. We can't track them and it's better to do the wait after the
-    // costly call sequence.
-
-    // TODO: Could insert earlier and schedule more liberally with operations
-    // that only use caller preserved registers.
-    MachineBasicBlock &EntryBB = MF.front();
-    MachineBasicBlock::iterator I = EntryBB.begin();
-    for (MachineBasicBlock::iterator E = EntryBB.end();
-         I != E && (I->isPHI() || I->isMetaInstruction()); ++I)
-      ;
-    BuildMI(EntryBB, I, DebugLoc(), TII->get(AMDGPU::S_WAITCNT)).addImm(0);
-
-    Modified = true;
-  }
-
   // Keep iterating over the blocks in reverse post order, inserting and
   // updating s_waitcnt where needed, until a fix point is reached.
   for (auto *MBB : ReversePostOrderTraversal<MachineFunction *>(&MF))
     BlockInfos.insert({MBB, BlockInfo()});
 
+  if (!MFI->isEntryFunction()) {
+    MachineBasicBlock &EntryBB = MF.front();
+    BlockInfo &EntryBI = BlockInfos.find(&EntryBB)->second;
+    EntryBI.Incoming = std::make_unique<WaitcntBrackets>(ST, Limits, Encoding);
+    WaitcntBrackets &Brackets = *EntryBI.Incoming;
+
+    unsigned UB = 0;
+    for (auto T : inst_counter_types())
+      UB = std::max(UB, Brackets.getWaitCountMax(T));
+
+    for (auto T : inst_counter_types())
+      Brackets.setScoreUB(T, UB);
----------------
rovka wrote:

Why do we need to use the max over all `inst_counter_types()`, rather than setting the UB for each T with its own limit (i.e. `Brackets.setScoreUB(T, Brackets.getWaitCountMax(T))`)?
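
To make the two alternatives concrete, here is a minimal standalone sketch. It is not the LLVM code: `ToyBrackets`, `CounterType`, `setUniformUB`, `setPerTypeUB` and the limit values used with them are hypothetical stand-ins, and only the method names `getWaitCountMax` and `setScoreUB` are borrowed from the patch. It shows only how the resulting upper bounds differ, not which variant the pass needs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical counter kinds; the real pass iterates inst_counter_types().
enum CounterType { VM_CNT, LGKM_CNT, EXP_CNT, NUM_COUNTERS };

// Toy stand-in for WaitcntBrackets: it only stores a per-counter hardware
// limit and a per-counter score upper bound.
struct ToyBrackets {
  std::array<unsigned, NUM_COUNTERS> Max;
  std::array<unsigned, NUM_COUNTERS> ScoreUB{};

  explicit ToyBrackets(std::array<unsigned, NUM_COUNTERS> Limits)
      : Max(Limits) {}

  unsigned getWaitCountMax(CounterType T) const {
    assert(T < NUM_COUNTERS && "invalid counter type");
    return Max[T];
  }
  void setScoreUB(CounterType T, unsigned Value) {
    assert(T < NUM_COUNTERS && "invalid counter type");
    ScoreUB[T] = Value;
  }
};

// Variant in the patch: every counter gets the largest limit as its UB.
void setUniformUB(ToyBrackets &Brackets) {
  unsigned UB = 0;
  for (int I = 0; I < NUM_COUNTERS; ++I)
    UB = std::max(UB, Brackets.getWaitCountMax(static_cast<CounterType>(I)));
  for (int I = 0; I < NUM_COUNTERS; ++I)
    Brackets.setScoreUB(static_cast<CounterType>(I), UB);
}

// Variant suggested in this comment: each counter gets its own limit as UB.
void setPerTypeUB(ToyBrackets &Brackets) {
  for (int I = 0; I < NUM_COUNTERS; ++I) {
    auto T = static_cast<CounterType>(I);
    Brackets.setScoreUB(T, Brackets.getWaitCountMax(T));
  }
}
```

With arbitrary example limits of 63, 15 and 7, `setUniformUB` leaves all three upper bounds at 63, while `setPerTypeUB` leaves them at 63, 15 and 7 respectively.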

https://github.com/llvm/llvm-project/pull/73122

