[llvm] 79b32bc - [MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 3 08:07:00 PDT 2024
Author: Teresa Johnson
Date: 2024-10-03T08:06:56-07:00
New Revision: 79b32bcda662a3e7789ad2835a021020fd2a5158
URL: https://github.com/llvm/llvm-project/commit/79b32bcda662a3e7789ad2835a021020fd2a5158
DIFF: https://github.com/llvm/llvm-project/commit/79b32bcda662a3e7789ad2835a021020fd2a5158.diff
LOG: [MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998)
We weren't correctly flagging inlined callee functions that have
callsite metadata but no memprof metadata. As a result, the callsite
metadata was not stripped when such a function was inlined into a
callsite that didn't itself have callsite metadata.
In practice, this meant that we went into the LTO link with many more
calls than necessary having callsite metadata / summary records, which
in turn made the graph larger than necessary.
Fixing this oversight resulted in huge reductions in the thin link of a
large target:
- 99% fewer duplicated context ids (recall we have to duplicate when
  callsites containing the same stack ids are in different functions)
- 71% fewer graph edges
- 17% fewer graph nodes
- 13% fewer functions cloned
- 44% smaller peak memory
- 47% less time
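The effect of the flag can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in CloneFunction.cpp or the inliner: `ToyCall`, `calleeHasMemProfMetadata`, and `inlineIntoUnprofiledCallsite` are hypothetical names invented for illustration, and the `withFix` parameter exists only to compare the old and new behavior side by side. It assumes, as the log message describes, that metadata on the inlined calls is stripped only when the cloned callee body was flagged as containing memprof-related metadata.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an IR call instruction (hypothetical, not llvm::CallInst).
struct ToyCall {
  bool hasMemProf;  // models !memprof metadata on the call
  bool hasCallsite; // models !callsite metadata on the call
};

// Models the flag accumulated while cloning the callee body.
// Without the fix only !memprof sets the flag; with the fix,
// !callsite sets it as well.
bool calleeHasMemProfMetadata(const std::vector<ToyCall> &body, bool withFix) {
  bool flag = false;
  for (const ToyCall &c : body) {
    flag |= c.hasMemProf;
    if (withFix)
      flag |= c.hasCallsite;
  }
  return flag;
}

// Models inlining the callee into a call that itself has no !callsite
// metadata: if the flag is set, both kinds of metadata are stripped from
// the inlined calls; otherwise the clones are left untouched.
std::vector<ToyCall> inlineIntoUnprofiledCallsite(std::vector<ToyCall> body,
                                                  bool withFix) {
  if (calleeHasMemProfMetadata(body, withFix)) {
    for (ToyCall &c : body) {
      c.hasMemProf = false;
      c.hasCallsite = false;
    }
  }
  return body;
}
```

In this model, a callee whose only call carries callsite metadata but no memprof metadata keeps that metadata after inlining when `withFix` is false, and has it stripped when `withFix` is true.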
Added:
Modified:
llvm/lib/Transforms/Utils/CloneFunction.cpp
llvm/test/Transforms/Inline/memprof_inline2.ll
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Utils/CloneFunction.cpp b/llvm/lib/Transforms/Utils/CloneFunction.cpp
index dc9ca1423f3e79..fc03643e3542cc 100644
--- a/llvm/lib/Transforms/Utils/CloneFunction.cpp
+++ b/llvm/lib/Transforms/Utils/CloneFunction.cpp
@@ -70,6 +70,7 @@ BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
     if (isa<CallInst>(I) && !I.isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_callsite);
     }
     if (const AllocaInst *AI = dyn_cast<AllocaInst>(&I)) {
       if (!AI->isStaticAlloca()) {
@@ -556,6 +557,7 @@ void PruningFunctionCloner::CloneBlock(
     if (isa<CallInst>(II) && !II->isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_callsite);
     }
 
     CloneDbgRecordsToHere(NewInst, II);
diff --git a/llvm/test/Transforms/Inline/memprof_inline2.ll b/llvm/test/Transforms/Inline/memprof_inline2.ll
index 625971c5fa37f6..21448f142ed079 100644
--- a/llvm/test/Transforms/Inline/memprof_inline2.ll
+++ b/llvm/test/Transforms/Inline/memprof_inline2.ll
@@ -90,10 +90,18 @@ entry:
; CHECK-LABEL: define dso_local noundef ptr @notprofiled
define dso_local noundef ptr @notprofiled() #0 !dbg !66 {
entry:
+ ;; When foo is inlined, both the memprof and callsite metadata should be
+ ;; stripped from the inlined call to new, as there is no callsite metadata on
+ ;; the call.
; CHECK: call {{.*}} @_Znam
; CHECK-NOT: !memprof
; CHECK-NOT: !callsite
%call = call noundef ptr @_Z3foov(), !dbg !67
+ ;; When baz is inlined, the callsite metadata should be stripped from the
+ ;; inlined call to foo2, as there is no callsite metadata on the call.
+ ; CHECK: call {{.*}} @_Z4foo2v
+ ; CHECK-NOT: !callsite
+ %call2 = call noundef ptr @_Z3bazv()
; CHECK-NEXT: ret
ret ptr %call, !dbg !68
}