[PATCH] D98213: [InlineCost] Enable the cost benefit analysis on FDO

Fri Jan 20 15:02:14 PST 2023

aemerson added a comment.

In D98213#4070341 <https://reviews.llvm.org/D98213#4070341>, @kazu wrote:

>> The main one being that the heuristics seem to completely override all sensible thresholds for code size. The feature caims to compare the performance gains by measuring the cycles saved from a PGO run, and comparing against some threshold. This doesn't make sense since the length of the runs for PGO are arbitrary, and you shouldn't be using them in absolute terms. The only way it makes sense is if you have a very specific workload and the PGO profile run is exactly the same as the real world runs.
>
> If you look at `llvm/lib/Analysis/InlineCost.cpp:892`, you'll see:
>
>   //  CycleSavings      PSI->getOrCompHotCountThreshold()                                                                                       
>   // -------------- >= -----------------------------------                                                                                      
>   //       Size              InlineSavingsMultiplier 
>
> If you rearrange this inequality, we have:
>
>   //            CycleSavings                          Size                                                                                      
>   // ----------------------------------- >= -------------------------                                                                           
>   //  PSI->getOrCompHotCountThreshold()      InlineSavingsMultiplier
>
> Notice that the LHS divides `CycleSavings` by `PSI->getOrCompHotCountThreshold()`, so I am not using cycle savings in absolute terms.  If I do a PGO run twice as long, both the numerator and denominator would double, yielding the same ratio.
>
>> The original change gave absolutely no justification, formal or intuitive, for the use of the equation to determine profitability. Giving positive runtime results is no justification for completely overriding any reasonable thresholds and potentially impacting code size & compile time by several orders of magnitude as we've seen internally at Apple.
>
> Do you have any test case that you can share, possibly reduced and/or with sensitive code removed?  I realize it may be difficult to come up with one from a large piece of software, but I am happy to take a look at a test case that demonstrates at least one problem -- code size or compile time.

Sorry I didn't see your comment at the time I was writing. Yes I see that now, it doesn't use absolute cycles. Ultimately the problem is that this heuristic basically overrides too much.

  if (auto Result = costBenefitAnalysis()) {
    DecidedByCostBenefit = true;
    if (*Result)
      return InlineResult::success();
    else
      return InlineResult::failure("Cost over threshold.");
  }

The `InlineResult::success()` being returned here effectively is `alwaysinline`. Even if we were to continue with this notion of cycle savings (which I don't think belongs in the inliner), a particularly bad inlining cost from the inliner can be completely overruled by this heuristic. Maybe we should only use this to inline the marginally bad cases (some max % of cost > threshold).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98213/new/

https://reviews.llvm.org/D98213