[llvm] [SystemZ] Remove getInliningThresholdMultiplier() override (PR #94612)
Nikita Popov via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 11 02:47:43 PDT 2024
nikic wrote:
The default inlining thresholds make a tradeoff between performance, code size and compile-time. Yes, you can increase performance numbers by increasing inlining thresholds, but you lose other things along the way.
For example, libLLVM.so on x86 is 122MB. Other architectures (like aarch64 and ppc64le) are within 5% of that. s390x instead clocks in at 185MB, which is roughly 50% larger. It is expected that `-O2` favors performance over code size, but that does not mean that code size can be ignored completely.
The reason why I ended up submitting this *now* is that I hit another s390x-only runaway inlining issue (where the build either OOMs or takes hours). It's certainly not the first time I've run into this issue.
I don't think that it's okay for the s390x target, and *only* the s390x target, to change otherwise project-global inlining tradeoffs in such a significant way. Basically, I think you need to eat the performance regressions and return to the project-global baseline.
Doing something like this would be appropriate if there were something special about the s390x architecture that justified it having a much, much higher inlining threshold than other targets. This *is* the case for GPU architectures, for which this hook exists. I could reasonably see thresholds being *slightly* different on s390x, e.g. because it has a higher function call overhead or something like that, but I find it hard to imagine what would justify a 3x threshold increase. (In reality it's actually even worse than 3x, because `adjustInliningThreshold` hands out additional big inlining bonuses.)
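To make the "worse than 3x" point concrete, here is a toy model of how a multiplier and an additive bonus could combine. It is only a sketch, not LLVM's actual cost-model code: `effectiveThreshold`, `kBaseThreshold` and the bonus of 150 are hypothetical names and values chosen for illustration, the base of 225 is assumed to be the default inline threshold, and applying the multiplier before the bonus is likewise an assumption.

```cpp
// Toy model only: none of these names are LLVM APIs.
// Assumption: 225 stands in for the project-wide default inline threshold.
constexpr int kBaseThreshold = 225;

// Assumption: the target multiplier scales the base threshold first,
// then a per-call-site target bonus is added on top.
constexpr int effectiveThreshold(int base, unsigned multiplier, int targetBonus) {
  return base * static_cast<int>(multiplier) + targetBonus;
}

// A target that keeps the defaults stays at the baseline.
static_assert(effectiveThreshold(kBaseThreshold, 1, 0) == 225, "baseline");
// A 3x multiplier alone triples the budget.
static_assert(effectiveThreshold(kBaseThreshold, 3, 0) == 675, "3x multiplier");
// Any additional bonus (150 is a made-up value) pushes it past 3x.
static_assert(effectiveThreshold(kBaseThreshold, 3, 150) == 825, "3x plus bonus");
```

Under these assumptions, any nonzero bonus puts the effective threshold above three times the baseline, which is the effect described above.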
https://github.com/llvm/llvm-project/pull/94612