[PATCH] D140063: [AutoFDO] Use getHeadSamplesEstimate instead of getTotalSamples to compute profile callsite staleness

Thu Dec 15 10:59:45 PST 2022

wlei added a comment.

In D140063#3998437 <https://reviews.llvm.org/D140063#3998437>, @wenlei wrote:

> In D140063#3998407 <https://reviews.llvm.org/D140063#3998407>, @hoy wrote:
>
>> In D140063#3998381 <https://reviews.llvm.org/D140063#3998381>, @wlei wrote:
>>
>>> In D140063#3998253 <https://reviews.llvm.org/D140063#3998253>, @wenlei wrote:
>>>
>>>> I thought the idea is to compute the % of samples being dropped due to mismatch? in this case, all samples from callsite will be dropped, so I actually don't see a problem with using getTotalSamples. Yes you can argue that they can be merged with base profile, but then that argument applies to getHeadSamplesEstimate too?
>>>
>>> Yeah, it's tricky to quantify the dropped callsite samples's impact. I was thinking in a reversed way, if we use `getTotalSamples`, I feel like the total samples are completely dropped(not merged into top-level profile), in fact it's still in use. And for `getHeadSamplesEstimate`, I feel it's like only the callsite call/jump's samples are dropped, but I admit that this's also not accurate, the missing inlining from big total samples could affect more on perf. I don't have strong opinion on this.
>>
>> Reporting mismatched samples based on `getHeadSamplesEstimate` sounds a bit more accurate to me, thought it's not perfect either. The direct effect of mismatched callsite samples is likely a missing inlining. Reporting callee total samples for this may not give a good signal, especially when the callee is really big but the callsite isn't very hot.
>>
>> We have some internal services showing a very high callsite mismatch rate like 30%. Wondering if that could be related.
>
> Logically I think a call site focused metric makes sense. But the way CallsiteSamples is defined led people to think this is total samples. I think perhaps we just need to be explicit in the naming, so it's clear that we're tracking call site counts, not call site samples..

Sounds good to change to the callsite counts. CallsiteSamples in the nested profile is indeed the total samples of the callee, which is a confusing name.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140063/new/

https://reviews.llvm.org/D140063