[PATCH] D147545: [CSSPGO] Stale profile matching(part 2)

Lei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 6 15:41:56 PDT 2023


wlei added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:1846
       ++NumMismatchedProfile;
-      return false;
+      if (!SalvageStaleProfile)
+        return false;
----------------
davidxl wrote:
> nit: probably rename the option to be 'resync-stale-profile' to make the meaning more accurate.
Thanks for the name suggestion. so "resync the profile according to current build source code", it sounds good to me, @wenlei see if you are good for this name since you suggested the salvage-stale-profile 


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2161
+// Populate the anchors from profile.
+void SampleProfileMatcher::populateProfileCallsites(
+    const FunctionSamples &FS,
----------------
davidxl wrote:
> The mapping from callee to location is 1 to many. Should that be handled with sequential id (per callee)?
> The mapping from callee to location is 1 to many. Should that be handled with sequential id (per callee)?
Here is also `1 to many` map, see `StringMap<std::set<LineLocation>> &CalleeToCallsitesMap` which is a callee name string to a set of location.
So with no callsite changes, this works the same as using sequential id, but  with callsite change, this should be more resilient. Say if we add one or delete the first callsite, all following callsite would be mismatched if using  sequential id match, but for using this callsite name match, all other callsite with different name won't be affected.



================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2237-2243
+    if (!IsMatched) {
+      uint32_t CandidateLineOffset = Loc.LineOffset + LocationDelta;
+      LineLocation Candidate(CandidateLineOffset, Loc.Discriminator);
+      InsertMatching(Loc, Candidate);
+      LLVM_DEBUG(dbgs() << "Location is matched from " << Loc << " to "
+                        << Candidate << "\n");
+    }
----------------
wenlei wrote:
> wlei wrote:
> > wenlei wrote:
> > > This doesn't seem to do what you intended to achieve, or at least what the comment says
> > > 
> > > > non-anchor location match is based on the offset to the last matched anchor.
> > > 
> > > If last matched anchor doesn't exist, should we still match anything? 
> > > 
> > > Consider a case where a function is completely changed, and there's no matched callsite at call, in which case we should probably still throw away its profile instead of matching everything with LocationDelta == 0?
> > > 
> > > 
> > Sorry if my description is unclear.
> > 
> > My implementation assumes a special non-callsite anchor: `0`, i.e. the beginning of the function. Without the fuzzy matching, it's just this only one anchor matched, and all the others(non-anchor) use the  `LocationDelta = 0` to do the matching, which  means the match source is equal to the target.
> > 
> > >Consider a case where a function is completely changed, and there's no matched callsite at call, in which case we should probably still throw away its profile instead of matching everything with LocationDelta == 0?
> > 
> > Say:
> > ```
> > IR locations:      [1...100 101(foo)]
> > Profile locations: [1...100 101(bar)]
> > ```
> > So my intuition is even without any callsite anchor, matching the leading location would be better than dropping them. (remembering current pseudo-probe implementation, the bb ids all come first, it's not uncommon)
> > 
> > 
> > 
> Hmm.. I thought if there's no matching call site, that would mean the function has completely changed, so we can't match probes/blocks in a reasonable way. I can see what you have can help matching leaf functions without call sites.
> 
> But there needs to be something to make sure when the change is too big, we bail out instead of doing low confidence matching. Maybe we should bail out on zero matched call site + different number of probes/blocks? 
Good point, I will add some check(like a threshold) to make sure the very bad match to be bailed out and drop the profiles.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147545/new/

https://reviews.llvm.org/D147545



More information about the llvm-commits mailing list