[PATCH] D121862: [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions.

Mon Mar 21 16:27:04 PDT 2022

luna added a comment.

In D121862#3398140 <https://reviews.llvm.org/D121862#3398140>, @luna wrote:

> In D121862#3394290 <https://reviews.llvm.org/D121862#3394290>, @wenlei wrote:
>
>>> For my understanding, how does over-optimization or measurable code size change happen when profiles are merged and pre-link inliner skips IR transformation? Does it happen for the following scenario?
>>>
>>> - "merging without inlining" cause some functions (on the edge) to become hot functions; these hot functions are imported in post-link sample loader, and cause code size increase
>>
>> If we skip inlining at a particular call site during pre-link, but still inline that site during post-link, merging causes pre-link passes to be unnecessarily aggressive (including pre-link SCC inliner for the callee) based on merged (larger) counts because the merged counts don't reflect post-link inlining.
>
> Since `--disable-sample-loader-inlining` applies on both pre-link and post-link sample loader inlining in one clang invocation, I'm assuming the (//undesired aggressive//) inlining happens in feedback-driven inliner  (i.e. `InlinerPass::run` pointed by [1])
>
> Is my understanding correct that the current implementation [2] of `--disable-sample-loader-inlining` would (//undesirably//) inline a function in post-linking stage due to feedback-driven inliner, if this function satisfy following conditions
>
> 1. The function is hot within the module,
>   - So inline is desired in pre-linking stage.
> 2. The function is cold across module
>   - So inline is not desired in post-linking stage, but got inlined (and could be counter-productive) by feedback-driven inliner.
>
> If the scenario above is where problem happens, I'm planning to
>
> 1. Call out this undesired side-effect for the usage of `--disable-sample-loader-inlining`
> 2. Call out the option is used to evaluate AutoFDO inliner, and recommend using it together with hacks [3] in feedback-driven inliner for evaluation purpose.
> 3. Proceed with this change, as the current implementation could be confusing (in-accurate profiles that affect more than sample-loader inlining)
>
> May I get some feedback on the plans above? thanks!
>
> [1] https://github.com/llvm/llvm-project/blob/f4b794427e8037a4e952cacdfe7201e961f31a6f/llvm/lib/Transforms/IPO/Inliner.cpp#L745
> [2] the current impl is
>
> 1. skipping sample-loader inlining transformations artificially
> 2. merging profiles of inline instance to its outlining version
>
> [3] The hack is to still use static information in calculating "threshold" but ignores profiles (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/InlineCost.cpp#L1869-L1902 is where profiles are used to calculate threshold)

(An addendum to this comment)

With or without profile merging, artificially skipping inline transformations cause changes in profile and could cause unexpected change (even just for evaluation purpose)

To elaborate

1. With profile merging, functions (that are cold across modules and hot within a module) become hot
2. Without profile merging, the profiles are inaccurate (e.g., an outline function could be considered hot if profiles of inlined instance are accounted for).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121862/new/

https://reviews.llvm.org/D121862