[PATCH] D100993: [CSSPGO] Fix incorrect prorating indirect call distribution factor that leads to target count loss.

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 21 13:09:39 PDT 2021


hoy created this revision.
Herald added subscribers: wenlei, hiraditya.
hoy requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Pseudo probe distribution factor is used to scale down profile samples to avoid misleading the counts inference due to the usage of "maximum" in `getBlockWeight`. For callsites, the scaling down can come from code duplication happening beforethe sample profile loader (prelink or postlink), or due to the indirect call promotion in sample loader inliner. This patch fixes an issue in sample loader ICP where the leftover indirect callsite scaling down happens too early. The early scale down causes the loss of non-promoted call target samples unexpectedly. To fix this, I'm postponing the prorating of the indirect callsite (not promoted direct callsites) to be in `updateIDTMetaData` based on the previous work that distinguised promoted targets from non-promoted targets.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D100993

Files:
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/SampleProfile/pseudo-probe-icp-factor.ll


Index: llvm/test/Transforms/SampleProfile/pseudo-probe-icp-factor.ll
===================================================================
--- llvm/test/Transforms/SampleProfile/pseudo-probe-icp-factor.ll
+++ llvm/test/Transforms/SampleProfile/pseudo-probe-icp-factor.ll
@@ -197,6 +197,8 @@
 !48 = !DILocation(line: 14, column: 10, scope: !35)
 !49 = !DILocation(line: 14, column: 12, scope: !35)
 !50 = !DILocation(line: 14, column: 10, scope: !51)
+;; A discriminator of 108527639 which is 0x6780017 in hexdecimal, stands for an indirect call probe
+;; with an index of 2 and probe factor of 0.79.
 !51 = !DILexicalBlockFile(scope: !35, file: !1, discriminator: 108527639)
 !52 = !DILocation(line: 14, column: 3, scope: !35)
 !53 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 17, type: !54, scopeLine: 18, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !56)
@@ -231,6 +233,8 @@
 !82 = !DILocation(line: 32, column: 17, scope: !75)
 !83 = !DILocation(line: 32, column: 20, scope: !75)
 !84 = !DILocation(line: 32, column: 13, scope: !85)
+;; A discriminator of 116916311 which is 0x6f80057 in hexdecimal, stands for an indirect call probe
+;; with an index of 10 and probe factor of 0.95.
 !85 = !DILexicalBlockFile(scope: !75, file: !1, discriminator: 116916311)
 !86 = !DILocation(line: 32, column: 11, scope: !75)
 !87 = !DILocation(line: 33, column: 5, scope: !75)
@@ -244,9 +248,13 @@
 !95 = !DILocation(line: 36, column: 1, scope: !53)
 !96 = !DILocation(line: 35, column: 5, scope: !53)
 
-; CHECK: %[[#]] = call i32 (i32, ...) %30(i32 %[[#]]) #[[#]], !dbg ![[#DBGID:]], !prof ![[#]]
+; CHECJ: define dso_local i32 @main
+; CHECK: %[[#]] = call i32 (i32, ...) %[[#]](i32 %[[#]]) #[[#]], !dbg ![[#DBGID:]], !prof ![[#PROF:]]
 
 ;; A discriminator of 69206039 which is 0x4200017 in hexdecimal, stands for an indirect call probe
 ;; with an index of 2 and probe factor of 0.04.
 ; CHECK: ![[#DBGID]] = !DILocation(line: [[#]], column: [[#]], scope: ![[#SCOPE:]], inlinedAt: ![[#]])
 ; CHECK: ![[#SCOPE]] = !DILexicalBlockFile(scope: ![[#]], file: ![[#]], discriminator: 69206039)
+;; The remaining count of the second target (bar) should be from the original count multiplied by two callsite
+;; factors, i.e, roughly 11259 * 0.95 * 0.79 = 8444.
+; CHECK: ![[#PROF]] = !{!"VP", i32 0, i64 8444, i64 7546896869197086323, i64 -1, i64 -2012135647395072713, i64 8444}
Index: llvm/lib/Transforms/IPO/SampleProfile.cpp
===================================================================
--- llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -760,6 +760,9 @@
   uint32_t NumVals = 0;
   // OldSum is the existing total count in the value profile data.
   uint64_t OldSum = 0;
+  // FullSum is the sum of counts of all the targets, including promoted
+  // targets.
+  uint64_t FullSum = Sum;
   std::unique_ptr<InstrProfValueData[]> ValueData =
       std::make_unique<InstrProfValueData[]>(MaxNumPromotions);
   bool Valid =
@@ -825,13 +828,21 @@
       std::min(NewCallTargets.size(), static_cast<size_t>(MaxNumPromotions));
   annotateValueSite(*Inst.getParent()->getParent()->getParent(), Inst,
                     NewCallTargets, Sum, IPVK_IndirectCallTarget, MaxMDCount);
+
+  // Prorate the indirect callsite distribution.
+  // FullSum is the sum including promoted targets, while sum is the sum
+  // excluding promoted targets.
+  if (FullSum)
+    if (Optional<PseudoProbe> Probe = extractProbe(Inst))
+      setProbeDistributionFactor(Inst, (Probe->Factor * Sum) / FullSum);
 }
 
 /// Attempt to promote indirect call and also inline the promoted call.
 ///
 /// \param F  Caller function.
 /// \param Candidate  ICP and inline candidate.
-/// \param Sum  Sum of target counts for indirect call.
+/// \param SumOrigin  Original sum of target counts for indirect call.
+/// \param Sum        Prorated sum of target counts for indirect call.
 /// \param InlinedCallSite  Output vector for new call sites exposed after
 /// inlining.
 bool SampleProfileLoader::tryPromoteAndInlineCandidate(
@@ -866,13 +877,11 @@
         CI, R->getValue(), Candidate.CallsiteCount, Sum, false, ORE);
     if (DI) {
       Sum -= Candidate.CallsiteCount;
-      // Prorate the indirect callsite distribution.
       // Do not update the promoted direct callsite distribution at this
       // point since the original distribution combined with the callee
       // profile will be used to prorate callsites from the callee if
       // inlined. Once not inlined, the direct callsite distribution should
       // be prorated so that the it will reflect the real callsite counts.
-      setProbeDistributionFactor(CI, static_cast<float>(Sum) / SumOrigin);
       Candidate.CallInstr = DI;
       if (isa<CallInst>(DI) || isa<InvokeInst>(DI)) {
         bool Inlined = tryInlineCandidate(Candidate, InlinedCallSite);


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D100993.339361.patch
Type: text/x-patch
Size: 4925 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210421/fb51f529/attachment.bin>


More information about the llvm-commits mailing list