[PATCH] D152399: [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO.
Han Shen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 29 15:35:32 PDT 2023
shenhan added inline comments.
================
Comment at: llvm/lib/CodeGen/MachineFunctionSplitter.cpp:232
+llvm::createMachineFunctionSplitterPass(bool HasAccurateProfile) {
+ return new MachineFunctionSplitter(HasAccurateProfile);
}
----------------
snehasish wrote:
> Do we need to pass this in explicitly anymore? Can we instead infer this locally from ProfileSummaryInfo in runOnMachineFunction?
>
> I think the following is equivalent to the current behaviour
>
> ```
> bool HasAccurateProfile = PSI->hasInstrumentationProfile() || PSI->hasCSInstrumentationProfile();
> ```
Good point. Removed.
================
Comment at: llvm/lib/CodeGen/MachineFunctionSplitter.cpp:118
if (PercentileCutoff > 0) {
return PSI->isColdCountNthPercentile(PercentileCutoff, *Count);
----------------
hoy wrote:
> shenhan wrote:
> > hoy wrote:
> > > Looks like this is the logic the `HasAccurateSampleProfile` check is bypassing. Is the `PercentileCutoff` count is too high in your case? IIUC, you want more blocks to be treated as cold to lower the splitting cost. A higher cutoff count should help. Am I missing anything?
> > >
> > > BTW, a more accurate profile should be more trustful, and thus the existing logic should work better. The term `HasAccurateSampleProfile` to bypass the existing logic here is a bit confusing to me.
> > Yes, you are right. Sorry, this is a bit confusing. The origin name was IsUsingFSAFDO, and was later renamed to HasAccurateSampleProfile per reviewer's suggestion. And it should equal to the negated value of IsUsingFSAFDO.
> >
> > So replaced HasAccurateProfile with !HasAccurateProfile and negated the initial value of HasAccurateProfile and updated the comment.
> Thanks for the clarification. Another question, is it intentional to always bypass `PercentileCutoff` and use `ColdCountThreshold` instead for FS-AFDO?
Yes, currently for FSAFDO, we always use ColdCountThreshold instead of PercentileCutoff.
================
Comment at: llvm/lib/CodeGen/MachineFunctionSplitter.cpp:111
+ if (IsFSAFDOFlavor)
+ return Count.has_value() && Count.value() == 0;
+
----------------
davidxl wrote:
> wenlei wrote:
> > shenhan wrote:
> > > wenlei wrote:
> > > > shenhan wrote:
> > > > > shenhan wrote:
> > > > > > wenlei wrote:
> > > > > > > snehasish wrote:
> > > > > > > > Can we use `< ColdCountThreshold` instead of `== 0`?
> > > > > > > >
> > > > > > > > `ColdCountThreshold` defaults to 1 so it should have the same effect.
> > > > > > > So for IRPGO, we want to treat unknown as zero/cold, but not so for FSAFDO? Why is the difference?
> > > > > > >
> > > > > > > IIRC after profile inference, not sampled block will still get zero counts. What produces `Count.has_value() == false`?
> > > > > > > So for IRPGO, we want to treat unknown as zero/cold, but not so for FSAFDO? Why is the difference?
> > > > > >
> > > > > > Yes, the reason is that when IRPGO dictates a zero block counter, we trust it with higher confidence, but less so for FSAFDO. The latter could be a miss sampling or inference / propagation error. Since the cost of splitting a non-cold block is high, we only split when we are sure of the block counter.
> > > > > >
> > > > > > > IIRC after profile inference, not sampled block will still get zero counts. What produces Count.has_value() == false?
> > > > > > You are right. This is an extra caution in (unlikely) case when Count has no value and we don't want an exception to be thrown.
> > > > > Done.
> > > > > Yes, the reason is that when IRPGO dictates a zero block counter, we trust it with higher confidence, but less so for FSAFDO. The latter could be a miss sampling or inference / propagation error.
> > > >
> > > > But checking `has_value` isn't going to be able to differentiate inaccurate counts.
> > > >
> > > > > You are right. This is an extra caution in (unlikely) case when Count has no value and we don't want an exception to be thrown.
> > > >
> > > > In this case, we don't need to guard it under `HasAccurateProfile` as it applies to IRPGO as well.
> > > Thanks. I made the following change:
> > >
> > > Firstly, per hoy's comment, I replaced "if (HasAccurateProfile) {" with "if (!HasAccurateProfile) {", noticing that HasAccurateProfile should be False when using FSAFDO or CSSPGO, and True when using IRPGO. Previously the value is negated.
> > >
> > > Secondly, when !HasAccurateProfile && !Count.has_value() (although the latter is unlikely), this needs to return false. Whereas when HasAccurateProfile && !Count.has_value(), this returns true. So I have to guard it with HasAccureateProfile.
> > >
> > > Thirdly, the next line "if (!Count)" is not testing "if (Count !=0)", it is testing "if(!Count.has_value())", it is little bit misleading, so I changed it to if(!Count.has_value()), so it is clear why this is guarded under HasAccurateProfile.
> > >
> > > Hope this is clearer.
> > If the key difference is how we treat unknown/missing counts, I'd suggest let's make it explicit.
> >
> > Would it work if we just make a simple and explicit change:
> >
> > ```
> > if (!Count)
> > return true;
> > ```
> > -->
> > ```
> > // Treat unknown/missing counts as cold if profile is accurate, but not if profile is inaccurate.
> > if (!Count.has_value()) {
> > return HasAccurateProfile;
> > }
> > ```
> > If the key difference is how we treat unknown/missing counts, I'd suggest let's make it explicit.
> >
> > Would it work if we just make a simple and explicit change:
> >
> > ```
> > if (!Count)
> > return true;
> > ```
> > -->
> > ```
> > // Treat unknown/missing counts as cold if profile is accurate, but not if profile is inaccurate.
> > if (!Count.has_value()) {
> > return HasAccurateProfile;
> > }
> > ```
>
>
> Sounds reasonable.
>
> Missing count in IRPGO does mean very cold (module not even linked into the binary), but for sample PGO, it means unknown -- e.g. new code added which is not sampled. In this sense, their handling should be different.
```
// Treat unknown/missing counts as cold if profile is accurate, but not if profile is inaccurate.
if (!Count.has_value()) {
return HasAccurateProfile;
}
```
I see the point here. However for NonAccurateProfile, we test that Count < ColdCountThreshold, whereas for AccurateProfile, we test that using "PSI->isColdCountNthPercentile(PercentileCutoff, *Count);",the former is more rigorous, resulting in less cold block being split than the HasAccurateProfile. With this difference, we still need to set differentiate these 2 cases by "if (HasAccureateProfile) { } else {}" like below:
```
if (!Count.has_value()) {
return HasAccurateProfile;
}
if (!HasAccureateProfile) {
return *Count < ColdCountThreshold;
}
if (PercentileCutoff > 0) {
return PSI->isColdCountNthPercentile(PercentileCutoff, *Count);
}
return (*Count < ColdCountThreshold);
// end of function
```
Do you think we shall use the above version or keep the current one?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152399/new/
https://reviews.llvm.org/D152399
More information about the llvm-commits
mailing list