[llvm] [AMDGPU] Refine GCNHazardRecognizer hasHazard() (PR #138841)

Carl Ritson via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 12 05:32:59 PDT 2025


================
@@ -441,42 +441,96 @@ using IsExpiredFn = function_ref<bool(const MachineInstr &, int WaitStates)>;
 using GetNumWaitStatesFn = function_ref<unsigned int(const MachineInstr &)>;
 
 // Search for a hazard in a block and its predecessors.
+// StateT must implement getHashValue() and isEqual().
 template <typename StateT>
 static bool
-hasHazard(StateT State,
+hasHazard(StateT InitialState,
           function_ref<HazardFnResult(StateT &, const MachineInstr &)> IsHazard,
           function_ref<void(StateT &, const MachineInstr &)> UpdateState,
-          const MachineBasicBlock *MBB,
-          MachineBasicBlock::const_reverse_instr_iterator I,
-          DenseSet<const MachineBasicBlock *> &Visited) {
-  for (auto E = MBB->instr_rend(); I != E; ++I) {
-    // No need to look at parent BUNDLE instructions.
-    if (I->isBundle())
-      continue;
-
-    switch (IsHazard(State, *I)) {
-    case HazardFound:
-      return true;
-    case HazardExpired:
-      return false;
-    default:
-      // Continue search
-      break;
+          const MachineBasicBlock *InitialMBB,
+          MachineBasicBlock::const_reverse_instr_iterator InitialI) {
+  SmallVector<std::pair<const MachineBasicBlock *, unsigned>> Worklist;
+  SmallDenseSet<std::pair<const MachineBasicBlock *, unsigned>> Visited;
+  SmallVector<std::pair<unsigned, unsigned>, 1> Collisions;
+  SmallDenseMap<unsigned, unsigned> StateHash2Idx;
+  SmallVector<StateT> States;
+
+  // States contains a vector of unique state structures.
----------------
perlfu wrote:

I can put up a version that uses `deque` and `SmallDenseMap` from pointer to pointer if you prefer?
It doesn't look that different to this, but it is measurably a little slower.

A lot of the benefit of this refactor comes from the fact that most `hasHazard` calls can be handled within a block.
Hence avoiding touching any data structures and doing any memory allocation is beneficial -- most `std::deque` implementation do alloca on init AFAIK.

Beyond improved correctness the secondary goal of this patch is to try and claw back some of the impact of #138663.
Using `std::deque` certainly negates a lot of that.
Version prior to current was still the best.

https://github.com/llvm/llvm-project/pull/138841


More information about the llvm-commits mailing list