[llvm] [AMDGPU] Refine GCNHazardRecognizer hasHazard() (PR #138841)

Tue Sep 9 02:24:46 PDT 2025

================
@@ -436,42 +436,55 @@ using IsExpiredFn = function_ref<bool(const MachineInstr &, int WaitStates)>;
 using GetNumWaitStatesFn = function_ref<unsigned int(const MachineInstr &)>;
 
 // Search for a hazard in a block and its predecessors.
+// StateT must implement getHashValue().
 template <typename StateT>
 static bool
-hasHazard(StateT State,
+hasHazard(StateT InitialState,
           function_ref<HazardFnResult(StateT &, const MachineInstr &)> IsHazard,
           function_ref<void(StateT &, const MachineInstr &)> UpdateState,
-          const MachineBasicBlock *MBB,
-          MachineBasicBlock::const_reverse_instr_iterator I,
-          DenseSet<const MachineBasicBlock *> &Visited) {
-  for (auto E = MBB->instr_rend(); I != E; ++I) {
-    // No need to look at parent BUNDLE instructions.
-    if (I->isBundle())
-      continue;
+          const MachineBasicBlock *InitialMBB,
+          MachineBasicBlock::const_reverse_instr_iterator InitialI) {
+  SmallVector<std::pair<const MachineBasicBlock *, StateT>> Worklist;
+  DenseSet<std::pair<const MachineBasicBlock *, unsigned>> Visited;
+  const MachineBasicBlock *MBB = InitialMBB;
+  StateT State = InitialState;
+  auto I = InitialI;
+
+  for (;;) {
+    bool Expired = false;
+    for (auto E = MBB->instr_rend(); I != E; ++I) {
+      // No need to look at parent BUNDLE instructions.
+      if (I->isBundle())
+        continue;
 
-    switch (IsHazard(State, *I)) {
-    case HazardFound:
-      return true;
-    case HazardExpired:
-      return false;
-    default:
-      // Continue search
-      break;
-    }
+      auto Result = IsHazard(State, *I);
+      if (Result == HazardFound)
+        return true;
+      if (Result == HazardExpired) {
+        Expired = true;
+        break;
+      }
 
-    if (I->isInlineAsm() || I->isMetaInstruction())
-      continue;
+      if (I->isInlineAsm() || I->isMetaInstruction())
+        continue;
 
-    UpdateState(State, *I);
-  }
+      UpdateState(State, *I);
+    }
 
-  for (MachineBasicBlock *Pred : MBB->predecessors()) {
-    if (!Visited.insert(Pred).second)
-      continue;
+    if (!Expired) {
+      unsigned StateHash = State.getHashValue();
+      for (MachineBasicBlock *Pred : MBB->predecessors()) {
+        if (!Visited.insert(std::pair(Pred, StateHash)).second)
----------------
perlfu wrote:

Take a fairly simple hazard like `GCNHazardRecognizer::fixVALUTransUseHazard`.
The state is only two counts `{VALUs, transcendental instructions}`.
How do you merge for example `{3, 0}` and `{2, 1}`?
The most pessimistic option is `{2, 0}` -- least likely for hazard to expire.
The optimistic option is `{3, 1}` -- but can miss hazards.
The next visited block might equally expire both of the existing states, but not the pessimistically merged one, potentially increasing the number of blocks search over simply testing each state or at least equally it.
This approach is also likely to yield false positives.
If we avoid duplicate visits for the same state, then convergence is possible.

In the above example I can propose a merge state, but with sets of registers, I do not think I reasonably can.
Hazard behaviour can be inherently different for each state.
Overall my position is that for correctness we *must* visit each MBB with each possible reachable state.
The concession is that we avoid duplicates visits for the same state, and my testing has already shown this provides a performance gain over the existing implementation

https://github.com/llvm/llvm-project/pull/138841