[llvm] [CSSPGO] Error out if the checksum mismatch is high (PR #84097)

via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 18 17:15:27 PDT 2024


================
@@ -2184,6 +2204,61 @@ bool SampleProfileLoader::doInitialization(Module &M,
   return true;
 }
 
+// Note that this is a module-level check. Even if one module is errored out,
+// the entire build will be errored out. However, the user could make big
+// changes to functions in single module but those changes might not be
+// performance significant to the whole binary. Therefore, we use a conservative
+// approach to make sure we only error out if it globally impacts the binary
+// performance. To achieve this, we use heuristics to select a reasonable
+// big set of functions that are supposed to be globally performance
+// significant, only compute and check the mismatch within those functions. The
+// function selection is based on two criteria: 1) The function is "hot" enough,
+// which is tuned by a hotness-based flag(ChecksumMismatchFuncHotBlockSkip). 2)
+// The num of function is large enough which is tuned by the
+// ChecksumMismatchNumFuncSkip flag.
+bool SampleProfileLoader::errorIfHighChecksumMismatch(
+    Module &M, ProfileSummaryInfo *PSI, const SampleProfileMap &Profiles) {
+  assert(FunctionSamples::ProfileIsProbeBased &&
+         "Only support for probe-based profile");
+  uint64_t TotalSelectedFunc = 0;
+  uint64_t NumMismatchedFunc = 0;
+  for (const auto &I : Profiles) {
+    const auto &FS = I.second;
+    const auto *FuncDesc = ProbeManager->getDesc(FS.getGUID());
+    if (!FuncDesc)
+      continue;
+
+    // We want to select a set of functions that are globally performance
+    // significant, in other words, if those functions profiles are
+    // checksum-mismatched and dropped, the whole binary will likely be
+    // impacted, so here we use a hotness-based threshold to control the
+    // selection.
+    if (FS.getTotalSamples() <
+        ChecksumMismatchFuncHotBlockSkip * PSI->getOrCompHotCountThreshold())
----------------
WenleiHe wrote:

> the isFunctionHotXXX function are used to check the entry_count of the function which is also "block-level"(the first block), however what I used here is the getTotalSamples() which I intend to include all the callee/inlinee samples(it could be the entry count is cold, but it does a lot of hot inlinings)

So there are different ways to decide whether a function is hot. Here we want to select a few hot functions and see how many of them has checksum mismatch. Checking on total samples is one way, and is different from existing isFunctionHotXXX. But do we really need to add another way of deciding whether a function is hot? I don't know the answer, but maybe use `isFunctionHotInCallGraphNthPercentile` to decide whether function is hot for our purpose is ok too? 

I agree that passing total samples into `isHotCountNthPercentile` maybe not be a good idea since the latter expects block counts. 

https://github.com/llvm/llvm-project/pull/84097


More information about the llvm-commits mailing list