[llvm] [SamplePGO] Add a cutoff for number of profile matching anchors (PR #95542)

via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 14 06:25:12 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-pgo

Author: Krzysztof Pszeniczny (amharc)

<details>
<summary>Changes</summary>

The algorithm added by PR #<!-- -->87375 can be potentially quadratic in the number of anchors. This is almost never a problem because normally functions have a reasonable number of function calls.

However, in some rare cases of auto-generated code we observed very large functions that trigger quadratic behaviour here (resulting in >130GB of peak heap memory usage for clang). Let's add a knob for controlling the max number of callsites in a function above which stale profile matching won't be performed.

---
Full diff: https://github.com/llvm/llvm-project/pull/95542.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp (+18) 
- (modified) llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-matching-LCS.ll (+3) 


``````````diff
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index d7613bce4c52e..d484b53904801 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -14,6 +14,7 @@
 #include "llvm/Transforms/IPO/SampleProfileMatcher.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/MDBuilder.h"
+#include "llvm/Support/CommandLine.h"
 
 using namespace llvm;
 using namespace sampleprof;
@@ -24,6 +25,11 @@ extern cl::opt<bool> SalvageStaleProfile;
 extern cl::opt<bool> PersistProfileStaleness;
 extern cl::opt<bool> ReportProfileStaleness;
 
+static cl::opt<int> SalvageStaleProfileMaxCallsites(
+    "salvage-stale-profile-max-callsites", cl::Hidden, cl::init(INT_MAX),
+    cl::desc("The maximum number of callsites in a function, above which stale "
+             "profile matching will be skipped."));
+
 void SampleProfileMatcher::findIRAnchors(const Function &F,
                                          AnchorMap &IRAnchors) {
   // For inlined code, recover the original callsite and callee by finding the
@@ -300,6 +306,18 @@ void SampleProfileMatcher::runStaleProfileMatching(
   if (FilteredIRAnchorsList.empty() || FilteredProfileAnchorList.empty())
     return;
 
+
+  if (FilteredIRAnchorsList.size() > SalvageStaleProfileMaxCallsites ||
+      FilteredProfileAnchorList.size() > SalvageStaleProfileMaxCallsites) {
+    LLVM_DEBUG(dbgs() << "Skip stale profile matching for " << F.getName()
+                      << " because the number of callsites in the IR is "
+                      << FilteredIRAnchorsList.size()
+                      << " and in the profile is "
+                      << FilteredProfileAnchorList.size() << "\n");
+    return;
+  }
+
+
   // Match the callsite anchors by finding the longest common subsequence
   // between IR and profile. Note that we need to use IR anchor as base(A side)
   // to align with the order of IRToProfileLocationMap.
diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-matching-LCS.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-matching-LCS.ll
index ecf8484d98e59..4b8cd853301ed 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-matching-LCS.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-matching-LCS.ll
@@ -1,6 +1,7 @@
 ; REQUIRES: x86_64-linux
 ; REQUIRES: asserts
 ; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/pseudo-probe-stale-profile-matching-LCS.prof --salvage-stale-profile -S --debug-only=sample-profile,sample-profile-matcher,sample-profile-impl 2>&1 | FileCheck %s
+; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/pseudo-probe-stale-profile-matching-LCS.prof --salvage-stale-profile -S --debug-only=sample-profile,sample-profile-matcher,sample-profile-impl --salvage-stale-profile-max-callsites=6 2>&1 | FileCheck %s -check-prefix=CHECK-MAX-CALLSITES
 
 ; CHECK: Run stale profile matching for test_direct_call
 ; CHECK: Location is matched from 1 to 1
@@ -27,6 +28,8 @@
 ; CHECK: Callsite with callee:unknown.indirect.callee is matched from 9 to 6
 ; CHECK: Callsite with callee:C is matched from 10 to 7
 
+; CHECK-MAX-CALLSITES: Skip stale profile matching for test_direct_call
+; CHECK-MAX-CALLSITES-NOT: Skip stale profile matching for test_indirect_call
 
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"

``````````

</details>


https://github.com/llvm/llvm-project/pull/95542


More information about the llvm-commits mailing list