[llvm] 79b32bc - [MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998)

via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 3 08:07:00 PDT 2024


Author: Teresa Johnson
Date: 2024-10-03T08:06:56-07:00
New Revision: 79b32bcda662a3e7789ad2835a021020fd2a5158

URL: https://github.com/llvm/llvm-project/commit/79b32bcda662a3e7789ad2835a021020fd2a5158
DIFF: https://github.com/llvm/llvm-project/commit/79b32bcda662a3e7789ad2835a021020fd2a5158.diff

LOG: [MemProf] Strip callsite metadata when inlining an unprofiled callsite (#110998)

When cloning a callee for inlining, we only set the hasMemProfMetadata
flag for calls carrying memprof metadata, not for calls carrying only
callsite metadata. As a result, the callsite metadata was not stripped
when such a callee was inlined into a callsite that didn't itself have
callsite metadata.
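
The effect of the flag can be illustrated with a small standalone model.
This is only a sketch, not LLVM code: ToyCall, computeHasMemProfMetadata
and inlineInto are hypothetical names, and the early-exit and stripping
behavior of the inliner is an assumption based on the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a call instruction; not an LLVM type.
struct ToyCall {
  bool hasMemProf;   // models !memprof metadata on the call
  bool hasCallsite;  // models !callsite metadata on the call
};

// Models the flag computed while cloning the callee body.
// includeCallsite == false mimics the behavior before this change,
// includeCallsite == true mimics the behavior after it.
bool computeHasMemProfMetadata(const std::vector<ToyCall> &calleeCalls,
                               bool includeCallsite) {
  bool flag = false;
  for (const ToyCall &c : calleeCalls) {
    flag |= c.hasMemProf;
    if (includeCallsite)
      flag |= c.hasCallsite;
  }
  return flag;
}

// Models the assumed post-inlining cleanup: if the inlined callsite has no
// callsite metadata and the flag is unset, nothing is touched; if the flag
// is set, both kinds of metadata are stripped from the cloned calls.
std::vector<ToyCall> inlineInto(const std::vector<ToyCall> &calleeCalls,
                                bool callerCallHasCallsite,
                                bool includeCallsite) {
  std::vector<ToyCall> cloned = calleeCalls;
  bool flag = computeHasMemProfMetadata(calleeCalls, includeCallsite);
  if (!callerCallHasCallsite && !flag)
    return cloned;  // early exit: stale metadata survives
  if (!callerCallHasCallsite) {
    for (ToyCall &c : cloned) {
      c.hasMemProf = false;
      c.hasCallsite = false;
    }
  }
  // When the caller's call has callsite metadata, the real inliner updates
  // the cloned metadata instead; this model leaves it unchanged.
  return cloned;
}
```

In this model, a callee whose only call has callsite but no memprof
metadata keeps that metadata when inlined into an unprofiled callsite
with includeCallsite == false, and loses it with includeCallsite == true.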

In practice, this meant that we went into the LTO link with many more
calls than necessary having callsite metadata / summary records, which
in turn made the graph larger than necessary.

Fixing this oversight resulted in huge reductions in the thin link of a
large target:
99% fewer duplicated context ids (recall that we have to duplicate them
when callsites containing the same stack ids are in different functions)
71% fewer graph edges
17% fewer graph nodes
13% fewer functions cloned
44% lower peak memory
47% less time

Added: 
    

Modified: 
    llvm/lib/Transforms/Utils/CloneFunction.cpp
    llvm/test/Transforms/Inline/memprof_inline2.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Utils/CloneFunction.cpp b/llvm/lib/Transforms/Utils/CloneFunction.cpp
index dc9ca1423f3e79..fc03643e3542cc 100644
--- a/llvm/lib/Transforms/Utils/CloneFunction.cpp
+++ b/llvm/lib/Transforms/Utils/CloneFunction.cpp
@@ -70,6 +70,7 @@ BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
     if (isa<CallInst>(I) && !I.isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= I.hasMetadata(LLVMContext::MD_callsite);
     }
     if (const AllocaInst *AI = dyn_cast<AllocaInst>(&I)) {
       if (!AI->isStaticAlloca()) {
@@ -556,6 +557,7 @@ void PruningFunctionCloner::CloneBlock(
     if (isa<CallInst>(II) && !II->isDebugOrPseudoInst()) {
       hasCalls = true;
       hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_memprof);
+      hasMemProfMetadata |= II->hasMetadata(LLVMContext::MD_callsite);
     }
 
     CloneDbgRecordsToHere(NewInst, II);

diff  --git a/llvm/test/Transforms/Inline/memprof_inline2.ll b/llvm/test/Transforms/Inline/memprof_inline2.ll
index 625971c5fa37f6..21448f142ed079 100644
--- a/llvm/test/Transforms/Inline/memprof_inline2.ll
+++ b/llvm/test/Transforms/Inline/memprof_inline2.ll
@@ -90,10 +90,18 @@ entry:
 ; CHECK-LABEL: define dso_local noundef ptr @notprofiled
 define dso_local noundef ptr @notprofiled() #0 !dbg !66 {
 entry:
+  ;; When foo is inlined, both the memprof and callsite metadata should be
+  ;; stripped from the inlined call to new, as there is no callsite metadata on
+  ;; the call.
   ; CHECK: call {{.*}} @_Znam
   ; CHECK-NOT: !memprof
   ; CHECK-NOT: !callsite
   %call = call noundef ptr @_Z3foov(), !dbg !67
+  ;; When baz is inlined, the callsite metadata should be stripped from the
+  ;; inlined call to foo2, as there is no callsite metadata on the call.
+  ; CHECK: call {{.*}} @_Z4foo2v
+  ; CHECK-NOT: !callsite
+  %call2 = call noundef ptr @_Z3bazv()
   ; CHECK-NEXT: ret
   ret ptr %call, !dbg !68
 }


        

