[all-commits] [llvm/llvm-project] a41900: [CSSPGO][Preinliner] Use linear threshold to drive...
Hongtao Yu via All-commits
all-commits at lists.llvm.org
Sun May 8 22:23:47 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: a4190037fac06c2b0cc71b8bb90de9e6b570ebb5
https://github.com/llvm/llvm-project/commit/a4190037fac06c2b0cc71b8bb90de9e6b570ebb5
Author: Hongtao Yu <hoy at fb.com>
Date: 2022-05-08 (Sun, 08 May 2022)
Changed paths:
M llvm/tools/llvm-profgen/CSPreInliner.cpp
M llvm/tools/llvm-profgen/CSPreInliner.h
M llvm/tools/llvm-profgen/ProfileGenerator.cpp
M llvm/tools/llvm-profgen/ProfileGenerator.h
Log Message:
-----------
[CSSPGO][Preinliner] Use linear threshold to drive inline decision.
The per-callsite size threshold that drives the preinline decision today is based on hotness/coldness cutoffs. By default, a callsite whose sample count is above the hotness cutoff (99%) gets a size threshold of 1500, and any callsite below the coldness cutoff (99.99%) gets a zero threshold. This has three issues:
1. While both cutoffs and size thresholds are configurable, different applications may need different setups, making a universal setup impractical.
2. The callsites between the hotness cutoff and the coldness cutoff are not considered as inline candidates, which could be a missed opportunity.
3. Hot callsites always use the same threshold. In reality we may want a bigger threshold for hotter callsites.
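For illustration only, the cutoff-based scheme described above can be sketched as follows. This is not the llvm-profgen code: the function name cutoffThreshold, its parameters, and the use of absolute sample counts for the two cutoffs are assumptions made for this sketch, and whether the comparisons are inclusive is also assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sketch of the old cutoff-based scheme (not the actual
// llvm-profgen implementation). HotCount and ColdCount stand for the sample
// counts that the 99% hotness and 99.99% coldness cutoffs resolve to.
std::optional<uint64_t> cutoffThreshold(uint64_t Count, uint64_t HotCount,
                                        uint64_t ColdCount,
                                        uint64_t HotSizeThreshold = 1500) {
  assert(ColdCount < HotCount && "cold cutoff must be below hot cutoff");
  if (Count >= HotCount)
    return HotSizeThreshold; // every hot callsite gets the same threshold
  if (Count <= ColdCount)
    return 0;                // cold callsites get a zero threshold
  return std::nullopt;       // in-between callsites are not candidates
}
```

The three branches correspond to the three issues: the cutoffs need per-application tuning, the middle band is skipped, and all hot callsites share one threshold.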
In this change we are introducing a linear threshold regardless of hot/cold cutoffs. Given a sample space, a threshold is computed for a callsite based on the position of that callsite sample in the whole space. With that we no longer need to define what's hot or cold. Callsites with different hotness will get a different threshold. This should overcome the above three issues.
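A minimal sketch of the idea, not the code in this change: the helper names, the rank-based definition of "position", and the 0 and 1500 bounds used in the example are all assumptions, since the commit message does not spell out the exact formula.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: fraction of samples in SortedCounts that are <= Count,
// used here as the callsite's position in the sample space, in [0, 1].
// Assumes SortedCounts is non-empty and sorted in ascending order.
double positionInSampleSpace(const std::vector<uint64_t> &SortedCounts,
                             uint64_t Count) {
  assert(!SortedCounts.empty());
  assert(std::is_sorted(SortedCounts.begin(), SortedCounts.end()));
  auto It = std::upper_bound(SortedCounts.begin(), SortedCounts.end(), Count);
  return static_cast<double>(It - SortedCounts.begin()) /
         static_cast<double>(SortedCounts.size());
}

// Hypothetical helper: linear interpolation between MinThreshold and
// MaxThreshold, so hotter callsites (larger Position) get a larger threshold.
uint64_t linearThreshold(double Position, uint64_t MinThreshold,
                         uint64_t MaxThreshold) {
  assert(Position >= 0.0 && Position <= 1.0);
  assert(MinThreshold <= MaxThreshold);
  return MinThreshold +
         static_cast<uint64_t>(Position * (MaxThreshold - MinThreshold));
}
```

With sorted counts {10, 20, 30, 40} and illustrative bounds of 0 and 1500, a callsite with count 20 sits at position 0.5 and gets a threshold of 750, while the hottest callsite (count 40) gets the full 1500. No hot or cold cutoff appears anywhere in the computation.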
I have seen good results with a universal default setup for two of our internal services.
For one service, a 0.2% to 0.5% perf improvement over a baseline using the previous default setup, with on-par code size.
For the second service, a 0.5% to 0.8% perf improvement over a baseline using the previous default setup, with a 0.2% code size increase; performance and code size are on par with a baseline that uses a carefully tuned cutoff to cover enough hot functions.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D125023