[llvm] [MemProf] Fix callee guid for non-leaf frame (PR #172502)

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 16 07:40:24 PST 2025


https://github.com/teresajohnson created https://github.com/llvm/llvm-project/pull/172502

When matching callsite profile info, we synthesize VP metadata for
matched indirect calls from the CalleeGuids recorded with the CallSite
profile info. However, those are the callee guids of the leaf-most frame
in the callsite. In cases where we match to a portion of the frames, not
including the leaf, the callee guid should instead be synthesized from
the next leaf-most frame in the list.
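
As a rough illustration of that rule, here is a minimal standalone sketch. It
does not use the LLVM types: Frame, CallSiteInfo and calleeGuidsForSlice are
hypothetical stand-ins invented for this example, and it assumes frames are
stored leaf-most first, as in the test profile in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the MemProf types; only the fields needed here.
struct Frame {
  uint64_t Function; // GUID of the function this frame belongs to
};

struct CallSiteInfo {
  std::vector<Frame> Frames;         // assumed order: leaf-most frame first
  std::vector<uint64_t> CalleeGuids; // callees recorded for the leaf-most frame
};

// Returns the callee GUIDs to associate with the slice of CS.Frames that
// begins at index Start (that is, the frames left after dropping Start
// leaf-side frames).
std::vector<uint64_t> calleeGuidsForSlice(const CallSiteInfo &CS,
                                          std::size_t Start) {
  assert(Start < CS.Frames.size() && "slice must start at an existing frame");
  // The full slice still contains the leaf-most frame, so the recorded
  // callee GUIDs apply directly.
  if (Start == 0)
    return CS.CalleeGuids;
  // Otherwise the call at Frames[Start] targets the function of the frame
  // just before it (the next leaf-most frame).
  return {CS.Frames[Start - 1].Function};
}
```

With Frames = {xyz, bar} and CalleeGuids = {0x5678}, the slice starting at
index 0 keeps {0x5678}, while the slice starting at index 1 (bar only) gets
the GUID of xyz as its single callee.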

This addresses the case where indirect call promotion was applied in the
profiled binary during SamplePGO matching in a ThinLTO backend, where we
didn't have VP metadata.
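
The test in the patch refers to the GUID of _Z3xyzv both as
15367485273663173088 and as -3079258800046378528. These are the same 64-bit
pattern read as unsigned and as signed. A small sketch of that
reinterpretation follows; asSignedGuid is a hypothetical helper written for
this note, not an LLVM API.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the bits of an unsigned 64-bit GUID as a signed 64-bit value,
// which is how the VP metadata operand is printed as i64 in the test's CHECK
// line.
int64_t asSignedGuid(uint64_t Guid) {
  int64_t Signed;
  // memcpy copies the bit pattern unchanged, avoiding any reliance on
  // conversion rules for out-of-range values.
  std::memcpy(&Signed, &Guid, sizeof(Signed));
  return Signed;
}
```

Equivalently, subtracting 2^64 (18446744073709551616) from
15367485273663173088 gives -3079258800046378528.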


From dde91a247b9b9b1f7185d668bf935ef1697328d9 Mon Sep 17 00:00:00 2001
From: Teresa Johnson <tejohnson at google.com>
Date: Tue, 16 Dec 2025 07:36:41 -0800
Subject: [PATCH] [MemProf] Fix callee guid for non-leaf frame

When matching callsite profile info, we synthesize VP metadata for
matched indirect calls from the CalleeGuids recorded with the CallSite
profile info. However, those are the callee guids of the leaf-most frame
in the callsite. In cases where we match to a portion of the frames, not
including the leaf, the callee guid should instead be synthesized from
the next leaf-most frame in the list.

This addresses the case where indirect call promotion was applied in the
profiled binary during SamplePGO matching in a ThinLTO backend, where we
didn't have VP metadata.
---
 .../Transforms/Instrumentation/MemProfUse.cpp | 10 +++++++--
 .../memprof_annotate_indirect_call.test       | 21 +++++++++++++++++--
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
index 5cde52637248a..83267e5caead3 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
@@ -702,8 +702,14 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
     for (auto &StackFrame : CS.Frames) {
       uint64_t StackId = computeStackId(StackFrame);
       ArrayRef<Frame> FrameSlice = ArrayRef<Frame>(CS.Frames).drop_front(Idx++);
-      ArrayRef<GlobalValue::GUID> CalleeGuids(CS.CalleeGuids);
-      LocHashToCallSites[StackId].push_back({FrameSlice, CalleeGuids});
+      // The callee guids for the slice containing all frames (due to the
+      // increment above, Idx is now 1) come from the CalleeGuids recorded in
+      // the CallSite. For the slices not containing the leaf-most frame, the
+      // callee guid is simply the function GUID of the prior frame.
+      LocHashToCallSites[StackId].push_back(
+          {FrameSlice, (Idx == 1 ? CS.CalleeGuids
+                                 : ArrayRef<GlobalValue::GUID>(
+                                       CS.Frames[Idx - 2].Function))});
 
       ProfileHasColumns |= StackFrame.Column;
       // Once we find this function, we can stop recording.
diff --git a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
index 15a315543056a..4677eebff3cfe 100644
--- a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
+++ b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
@@ -30,6 +30,17 @@ HeapProfileRecords:
           - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
           - { Function: _Z3foov, LineOffset: 10, Column: 7, IsInlineFrame: false }
         CalleeGuids:   [0x3456, 0x4567]
+      # This set of frames is for the second indirect call below. We should match
+      # the interior non-leaf frame with the call. In this case the synthesized
+      # VP metadata should have _Z3xyzv (GUID 15367485273663173088 aka
+      # -3079258800046378528) as the target, not the one from CalleeGuids which
+      # is the callee of the leafmost frame. This simulates the case where sample
+      # PGO performed ICP during matching in the profiled compile, without using
+      # VP metadata.
+      - Frames:
+          - { Function: _Z3xyzv, LineOffset: 1, Column: 1, IsInlineFrame: true }
+          - { Function: _Z3barv, LineOffset: 4, Column: 10, IsInlineFrame: false }
+        CalleeGuids:   [0x5678]
 ...
 
 ;--- basic.ll
@@ -38,12 +49,17 @@ entry:
   %fp = alloca ptr, align 8
   %0 = load ptr, ptr %fp, align 8
   call void %0(), !dbg !5
-; CHECK-ENABLE: call void %0(), {{.*}} !prof !6
+; CHECK-ENABLE: call void %0(), {{.*}} !prof ![[VP1:[0-9]+]]
 ; CHECK-DISABLE-NOT: !prof
+  call void %0(), !dbg !6
+; CHECK-ENABLE: call void %0(), {{.*}} !prof ![[VP2:[0-9]+]]
   ret void
 }
 
-; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 6, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 13398, i64 1, i64 17767, i64 1}
+; CHECK-ENABLE: ![[VP1]] = !{!"VP", i32 0, i64 6, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 13398, i64 1, i64 17767, i64 1}
+;; The second call above gets a single target synthesized from the callee frame,
+;; to the GUID of _Z3xyzv, see comments in the profile above.
+; CHECK-ENABLE: ![[VP2]] = !{!"VP", i32 0, i64 1, i64 -3079258800046378528, i64 1}
 
 !llvm.module.flags = !{!2, !3}
 
@@ -53,6 +69,7 @@ entry:
 !3 = !{i32 2, !"Debug Info Version", i32 3}
 !4 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !1, file: !1, line: 1, unit: !0)
 !5 = !DILocation(line: 4, column: 5, scope: !4)
+!6 = !DILocation(line: 5, column: 10, scope: !4)
 
 ;--- fdo_conflict.yaml
 ---


