[PATCH] D147545: [CSSPGO] Stale profile matching(part 2)

Hongtao Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 19 15:55:09 PDT 2023


hoy added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2304
+        // Force to overwrite the callee name in case any non-call location was
+        // wrote before.
+        IRLocations[IRCallsite] = CalleeName;
----------------
wenlei wrote:
> hoy wrote:
> > wlei wrote:
> > > hoy wrote:
> > > > typo: wrote -> written
> > > > 
> > > > BTW, have you seen such collision? With pseudo probes this shouldn't happen. It should not happen with dwarf discriminators either.
> > > Before I didn't add `isa<IntrinsicInst>(&I)` condition to `(std::optional<PseudoProbe> Probe = extractProbe(I))`
> > > I saw some callee name become empty due to the overwriting by the above emplace.
> > > 
> > > So I was thinking this is to record all the anchors, the anchor should always be higher priority than non-anchor, so I changed like here to force the writing. Also in case any changes when we support AutoFDO or post-link time matching. Or we can use the assertion here.
> > > 
> > I see. Yeah, `extractProbe` on call instruction gets the call probe. If you want block probes only, you can change the check `isa<IntrinsicInst>` to `isa<PseudoProbeInst>`.
> > 
> > 
> For non-call location, the name should be empty. Maybe assert that we never overwrite non-empty name? 
An assert sounds good for pseudo probes. The overwrite is possible for autofdo without enabling dwarf discriminator. 


================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2334
+
+  if (IsFuncHashMismatch && SalvageStaleProfile) {
+    LLVM_DEBUG(dbgs() << "Run stale profile matching for " << F.getName()
----------------
hoy wrote:
> wenlei wrote:
> > wlei wrote:
> > > hoy wrote:
> > > > BTW, wondering if you've ever seen mismatched callsites when function hash matches. The hash counts number of callsites but not their orders.
> > > Good question. Yes, there are many mismatched callsites even hash is matched, current work only support when a checksum mismatch is detected. 
> > > 
> > > There is a general issue whether we can turn it on for all the functions. That is whether the matching algorithm can handle perfectly with the non-stale profile.  the current heuristic is "first come first match", but not all the functions are in the profile(supposing there are functions doesn't hit any samples), it could give inconsistent anchors for the non-stale profile then cause a mismatch.
> > > 
> > > In order to solve it, I think we can try:
> > > 
> > > - Use a more strict checksum, like also count the orders.
> > > - find a threshold from the mismatch metrics to control it.
> > > -  Use a different heuristic, like search the closest location which can handle well with non-stale profile, but need more measuring for the mismatched function.
> > > 
> > > This is also an issue blocking AutoFDO, since AutoFDO doesn't have the checksum.
> > > 
> > > 
> > > 
> > it might be useful to have a debug print, or `STATISTIC` so we know when and how often this happens. It's essentially hash collision. 
> Good point. Counting how many direct callsites having mismatched targets in the profile and on the IR would be helpful.
Also as discussed offline, since IR callsites are matched with profile callsites sequentially, it may be interesting to see how many same-named callsites are not able to get a match. If it's non-trivial, adding zero-count callsites in the profile may allow for a more precise match. Of course that's going to increase profile size, but just curious if it is worth.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147545/new/

https://reviews.llvm.org/D147545



More information about the llvm-commits mailing list