[PATCH] D158817: [CSSPGO] Refactoring populateIRLocations and countProfileMismatches

Lei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 25 13:22:01 PDT 2023


wlei added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:464
+  // target.
+  static constexpr const char *IndirectCalleeName = "_indirect_call_";
+
----------------
wenlei wrote:
> this can actually be a valid c mangle name.. we need something more distinctive..  `*UnkownIndirectCallee`? 
Good catch, changed use "dot" as the separate char, that should not be used as c/c++ function name.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2214
+    bool CallsiteIsMatched = false;
+    // Since indirect call does not have the CalleeName, check
+    // conservatively if callsite in the profile is a callsite location.
----------------
wenlei wrote:
> nit: the CalleeName -> CalleeName.
done.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2216
+    // conservatively if callsite in the profile is a callsite location.
+    // This is to avoid nums of false positive since otherwise all the
+    // indirect call samples will be reported as mismatching.
----------------
wenlei wrote:
> nit: avoid nums of false positive ->
> 
> reduce num of false positives
> 
>  or 
> 
> avoid false positive
done


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2263
 void SampleProfileMatcher::populateProfileCallsites(
-    const FunctionSamples &FS,
+    const std::map<LineLocation, StringSet<>> ProfileLocations,
     StringMap<std::set<LineLocation>> &CalleeToCallsitesMap) {
----------------
wenlei wrote:
> This is passed by value, missing `&`? 
Good catch, removed `populateProfileCallsites `


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2379-2416
   std::map<LineLocation, StringRef> IRLocations;
-  // Extract profile matching anchors and profile mismatch metrics in the IR.
-  populateIRLocations(F, FS, MatchedCallsiteLocs, IRLocations);
+  populateIRLocations(F, IRLocations);
+  // Anchors for profile. it's a map from callsite location to set of callee
+  // name.
+  std::map<LineLocation, StringSet<>> ProfileLocations;
+  populateProfileLocations(FS, ProfileLocations);
 
----------------
wenlei wrote:
> Trying to organize the code a bit and keep high level flow clean, how about this:
> 
> ```
> // ..
> std::map<LineLocation, StringRef> IRAnchors;
> findIRAnchors(FS, IRAnchors);
> 
> // ..
> std::map<LineLocation, StringSet<>> ProfileAnchors;
> findProfileAnchors(FS, ProfileLocations);
> 
> // Detect profile mismatch for profile staleness metrics report.
> if (ReportProfileStaleness || PersistProfileStaleness) {
>   countProfileMismatches(FS, IRAnchors, ProfileAnchors, ...)
> }
> 
> if (IsFuncHashMismatch && SalvageStaleProfile) {
>    runStaleProfileMatching(IRAnchors, ProfileAnchors, getIRToProfileLocationMap(F));
> }
> ```
> 
> Accordingly, we would need to move `populateProfileCallsites` in `runStaleProfileMatching`. Maybe even inline expand `runStaleProfileMatching` there without a function call since the function is now simplified? Also move all statistics into `countProfileMismatches`.
> 
> Inside `runStaleProfileMatching`, we'd also rename `ProfileAnchors` to `MatchedAnchors`. 
That would looks much cleaner, thanks for the suggestion! done accordingly.

Only one things is that the name `IRAnchors`, it is not only the `Anchors`, we also need the non-call locations whose callee name is empty. but I understand separating the non-call locations and the anchors doesn't look neat. I don't have strong option on this, maybe this is straightforward to understand.


================
Comment at: llvm/test/Transforms/SampleProfile/profile-mismatch.ll:27
 ; CHECK-ASM: .byte 4
-; CHECK-ASM: .ascii  "MTU="
+; CHECK-ASM: .ascii  "MjU="
 ; CHECK-ASM: .byte 20
----------------
wenlei wrote:
> what causes this change? 
this is the same reason as the line 9. 
> "I changed to use the totalSamples instead of headerSamples for the mismatched inlined samples, since we switched to use the original nested profile, using totalSamples should makes more sense."


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158817/new/

https://reviews.llvm.org/D158817



More information about the llvm-commits mailing list