[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)

Wed Jul 24 14:39:46 PDT 2024

================
@@ -208,11 +212,33 @@ class StaleMatcher {
     }
   }
 
-  /// Find the most similar block for a given hash.
-  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash,
-                              uint64_t CallHash) const {
+  /// Creates a mapping from a pseudo probe index to pseudo probe.
+  void mapIndexToProbe(uint64_t Index, const MCDecodedPseudoProbe *Probe) {
+    IndexToBBPseudoProbes[Index].push_back(Probe);
+  }
+
+  /// Creates a mapping from a pseudo probe to a flow block.
+  void mapProbeToBB(const MCDecodedPseudoProbe *Probe, FlowBlock *Block) {
+    BBPseudoProbeToBlock[Probe] = Block;
+  }
+
+  /// Find the most similar flow block for a profile block given its hashes and
+  /// pseudo probe information.
+  const FlowBlock *
+  matchBlock(BlendedBlockHash BlendedHash, uint64_t CallHash,
+             const std::vector<yaml::bolt::PseudoProbeInfo> &PseudoProbes) {
     const FlowBlock *BestBlock = matchWithOpcodes(BlendedHash);
-    return BestBlock ? BestBlock : matchWithCalls(BlendedHash, CallHash);
+    if (BestBlock) {
+      ++MatchedWithOpcodes;
+      return BestBlock;
+    }
+    BestBlock = matchWithCalls(BlendedHash, CallHash);
+    if (BestBlock)
+      return BestBlock;
+    BestBlock = matchWithPseudoProbes(BlendedHash, PseudoProbes);
----------------
WenleiHe wrote:

> Exact hash provides a stronger signal than probes and is available for all binary basic blocks, so it makes sense to put it to the top.

Not necessarily. The tricky bit is that, for better overall graph matching, a strict exact match for a few individual blocks doesn't guarantee the global best match. We need to capture information that is relevant to CFG changes; capturing everything in a (strict) hash could also cause matching to be skewed by certain changes that can be considered noise for our purpose.

Btw, I'm not asking for an immediate change here, but this is at least something we need to think about and experiment with. In an alternative design, we can have several matching strategies, controlled by a switch, and then experiment with them using the same profile. The strategy can be selected globally rather than on a per-block basis, at least for a subset of the strategies.
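
To make the alternative design concrete, here is a minimal sketch of a globally selected strategy that fixes the order in which matchers are tried. Every name in it (`MatchStrategy`, `ProfileBlock`, `BinaryBlock`, `matchByHash`, `matchByProbe`, `matcherOrder`) is hypothetical and not part of BOLT; the hash and probe fields are simplified stand-ins for `BlendedBlockHash` and the pseudo probe info, and the switch would presumably be driven by a command-line option.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical global switch; in a real tool this would come from an option.
enum class MatchStrategy { HashFirst, ProbeFirst, ProbeOnly };

// Simplified stand-ins for a profile block and a binary (flow) block.
struct ProfileBlock {
  uint64_t Hash;       // stand-in for the blended hash
  uint64_t ProbeIndex; // stand-in for pseudo probe info; 0 means "no probe"
};

struct BinaryBlock {
  int Id;
  uint64_t Hash;
  uint64_t ProbeIndex;
};

using Matcher = std::optional<int> (*)(const ProfileBlock &,
                                       const std::vector<BinaryBlock> &);

// Exact hash match (assumed simplification of opcode/call hash matching).
std::optional<int> matchByHash(const ProfileBlock &P,
                               const std::vector<BinaryBlock> &Blocks) {
  for (const BinaryBlock &B : Blocks)
    if (B.Hash == P.Hash)
      return B.Id;
  return std::nullopt;
}

// Probe-based match; skipped when the profile block carries no probe.
std::optional<int> matchByProbe(const ProfileBlock &P,
                                const std::vector<BinaryBlock> &Blocks) {
  if (P.ProbeIndex == 0)
    return std::nullopt;
  for (const BinaryBlock &B : Blocks)
    if (B.ProbeIndex == P.ProbeIndex)
      return B.Id;
  return std::nullopt;
}

// The strategy is chosen once, globally, and only decides matcher order.
std::vector<Matcher> matcherOrder(MatchStrategy S) {
  switch (S) {
  case MatchStrategy::HashFirst:
    return {matchByHash, matchByProbe};
  case MatchStrategy::ProbeFirst:
    return {matchByProbe, matchByHash};
  case MatchStrategy::ProbeOnly:
    return {matchByProbe};
  }
  assert(false && "unknown strategy");
  return {};
}

// Try each matcher in the order given by the strategy; first hit wins.
std::optional<int> matchBlock(MatchStrategy S, const ProfileBlock &P,
                              const std::vector<BinaryBlock> &Blocks) {
  for (Matcher M : matcherOrder(S))
    if (std::optional<int> Id = M(P, Blocks))
      return Id;
  return std::nullopt;
}
```

With blocks `{1, 0xAA, 7}` and `{2, 0xBB, 9}` and a profile block `{0xAA, 9}`, `HashFirst` picks block 1 while `ProbeFirst` picks block 2, so the same profile can be replayed under each strategy and the resulting matches compared.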

https://github.com/llvm/llvm-project/pull/99891