[llvm] [SampleFDO] Improve stale profile matching by diff algorithm (PR #87375)

Fri May 10 10:53:51 PDT 2024

================
@@ -222,6 +260,57 @@ void SampleProfileMatcher::runStaleProfileMatching(
   }
 }
 
+// Call target name anchor based profile fuzzy matching.
+// Input:
+// For IR locations, the anchor is the callee name of direct callsite; For
+// profile locations, it's the call target name for BodySamples or inlinee's
+// profile name for CallsiteSamples.
+// Matching heuristic:
+// First match all the anchors using the diff algorithm, then split the
+// non-anchor locations between the two anchors evenly, first half are matched
+// based on the start anchor, second half are matched based on the end anchor.
+// For example, given:
+// IR locations:      [1, 2(foo), 3, 5, 6(bar), 7]
+// Profile locations: [1, 2, 3(foo), 4, 7, 8(bar), 9]
+// The matching gives:
+//   [1,    2(foo), 3,  5,  6(bar), 7]
+//    |     |       |   |     |     |
+//   [1, 2, 3(foo), 4,  7,  8(bar), 9]
+// The output mapping: [2->3, 3->4, 5->7, 6->8, 7->9].
+void SampleProfileMatcher::runStaleProfileMatching(
+    const Function &F, const AnchorMap &IRAnchors,
+    const AnchorMap &ProfileAnchors, LocToLocMap &IRToProfileLocationMap) {
+  LLVM_DEBUG(dbgs() << "Run stale profile matching for " << F.getName()
+                    << "\n");
+  assert(IRToProfileLocationMap.empty() &&
+         "Run stale profile matching only once per function");
+
+  AnchorList FilteredProfileAnchorList;
+  for (const auto &I : ProfileAnchors)
+    FilteredProfileAnchorList.emplace_back(I);
+
+  AnchorList FilteredIRAnchorsList;
+  // Filter the non-callsite from IRAnchors.
+  for (const auto &I : IRAnchors) {
+    if (I.second.stringRef().empty())
+      continue;
+    FilteredIRAnchorsList.emplace_back(I);
+  }
----------------
wlei-llvm wrote:

At the beginning we used separate two map for call and non-call, then we refined them into one to make it cleaner(no need to pass two map as parameters). Another reason is later for the non-call anchor matching, the non-call matching is based on call anchor(by the lexical order), so we regardlessly need to sort the non-call/call location in one container, then it's easy to use the map(sorted at the beginning). 

Yeah, it's probably hard to merge them, in summary, map is used for deduplication and sorting, and list is used for random access and filtering the non-call. 

https://github.com/llvm/llvm-project/pull/87375