[PATCH] D136627: [SampleFDO] Compute profile mismatch metrics

Mon Oct 24 20:33:48 PDT 2022

wenlei added inline comments.

================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2142
+
+  LLVM_DEBUG({
+    if (FunctionSamples::ProfileIsProbeBased) {
----------------
wlei wrote:
> wenlei wrote:
> > When the flag is on, we should always emit the messages, regardless of debug build or not. Same for the one below. 
> Sounds good! Changed to use `outs`
I don't know if there's well established convention, but I think most of similar dump go to stderr (i.e. all IR dump goes to stderr). 

================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:2106
+    const LineLocation &Loc = I.first;
+    if (Loc.LineOffset & InvalidLineOffsetBit)
+      continue;
----------------
hoy wrote:
> wlei wrote:
> > hoy wrote:
> > > Should invalid line offset also be considered mismatched samples? The samples will discarded any way. 
> > > 
> > > BTW, sort of having an impression that the invalid line offset checking is used anywhere else. Can we unify the usage if that's the case?
> > From my observation, the invalid(negative) offset will never be matched to any offset in the IR, so we actually need to remove it from the profile rather than report as mismatching. Recalling we tried to remove this offset in llvm-profgen, however, it caused a regression(likely because it affect the sample hot cutoff), so we chose to keep it. As for the mismatch,  I'm thinking to report the samples that's caused by the real stale profile issue, this might mislead the user. 
> > 
> > > BTW, sort of having an impression that the invalid line offset checking is used anywhere else. Can we unify the usage if that's the case?
> > yeah, that's from llvm-profgen, but that diff is reverted.
> Sounds good to exclude invalid line offsets.
> 
> 
> > yeah, that's from llvm-profgen, but that diff is reverted.
> 
> I see. Factor out the checking logic here to be a general function, maybe a static function of `LineLocation`?
IIRC, these represents sample that can't be attributed to a particular line, but excluding them in profile generation changes total sample of a function, which led to regression. 

I think excluding them in mismatch count makes sense because there's not much user can do about these, i.e. a profile refresh won't make them go away. 

================
Comment at: llvm/lib/Transforms/IPO/SampleProfile.cpp:431
+  uint64_t TotalCallsiteSamples = 0;
+  uint64_t MismatchedFuncHashSamples = 0;
+  uint64_t TotalFuncHashSamples = 0;
----------------
wlei wrote:
> wenlei wrote:
> > How about having another metric to represent how many functions has mismatched profile? That should tell the breath of the mismatch while your current metric tells the severity of mismatch. 
> Sounds good.(We already have the same stats(https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L1752) but it's only available with the `-stats`).
Sorry I wasn't explicit enough, but I was thinking about doing the same for both call site and function. I.e. have both samples mismatch and number of function/callsite mismatch. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136627/new/

https://reviews.llvm.org/D136627