[llvm] [LTO][Pipelines] Add 0 hot-caller threshold for SamplePGO + FullLTO (PR #135152)

Thu Apr 10 23:16:45 PDT 2025

tianleliu wrote:

> > If a hot callsite function is not inlined in the 1st build, inlining the hot callsite in the pre-link stage of the SPGO 2nd build may lead to the function's samples not being found in the profile file at the link stage. ThinLTO has already dealt with this by setting HotCallSiteThreshold to 0 to stop the inlining. This patch just adds the same handling for FullLTO.
> 
> Do you have some build stats (e.g., functions miss profiles before and get profiles after, etc) or binary performance data for this FullLTO patch?
> 
> Does 1st build refer to the profiled binary build here? I'm asking since the motivation for zero inline threshold in the ThinLTO prelink pipeline (before cross-module inline) is to make profile matching in the backend (after cross-module inline) more accurate within one build. I'm not sure if this zero inline threshold helps across build though.

Hi Mingming, thanks for your review.
I observed that an interpreter-style benchmark could find the correct profiling info after the patch (this was my motivation for the patch). For the SPEC2017 C/C++ cases, with the profile file unchanged, 12 of the 16 binaries changed after the patch, but none of them shows a performance change. :(
In my understanding, setting HotCallSiteThreshold to 0 in the pre-link stage should not change the final inlining result, because inlining of the hot callsite functions is just postponed to the link stage. Right? If so, does that mean the 12 changed binaries all get more accurate profiling info?
And as you commented in the code, a hot callsite function with a below-zero cost would still be inlined in the pre-link stage. So I wonder why we don't set HotCallSiteThreshold to INT_MIN to completely stop hot callsite inlining in pre-link?
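
To make the INT_MIN question concrete, here is a minimal sketch of the comparison I have in mind. It is a simplified model and not LLVM's actual inline cost code: `shouldInlineHotCallsite` is a hypothetical helper, and the rule "inline when cost < threshold" is an assumption that ignores the other heuristics the real inliner applies.

```cpp
#include <climits>

// Hypothetical, simplified model of the hot-callsite decision:
// inline only when the computed cost is strictly below the threshold.
// This is an assumption for illustration, not LLVM's real logic.
constexpr bool shouldInlineHotCallsite(int Cost, int HotCallSiteThreshold) {
  return Cost < HotCallSiteThreshold;
}

// With a threshold of 0, callsites with non-negative cost are postponed...
static_assert(!shouldInlineHotCallsite(0, 0), "zero cost is not inlined");
static_assert(!shouldInlineHotCallsite(100, 0), "positive cost is not inlined");
// ...but a callsite with a below-zero cost is still inlined in pre-link.
static_assert(shouldInlineHotCallsite(-5, 0), "negative cost is inlined");

// With a threshold of INT_MIN, no int cost can be strictly smaller,
// so under this model no hot callsite is inlined in pre-link.
static_assert(!shouldInlineHotCallsite(-5, INT_MIN), "never inlined");
static_assert(!shouldInlineHotCallsite(INT_MIN, INT_MIN), "never inlined");
```

Under this model the only difference between 0 and INT_MIN is the set of callsites with negative cost, which is exactly the case I am asking about.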

https://github.com/llvm/llvm-project/pull/135152