[PATCH] D98481: [Inliner] Do not inline mutual-recursive functions to avoid exponential size growth.

Thu Oct 14 00:37:49 PDT 2021

nikic added a comment.

In D98481#3062704 <https://reviews.llvm.org/D98481#3062704>, @wenlei wrote:

> In D98481#3062526 <https://reviews.llvm.org/D98481#3062526>, @nikic wrote:
>
>> We've encountered variants of this issue as well. I think this is a rather critical problem that needs to be addressed in a simple, release-backportable way, even if that fix causes performance regressions in some cases. A slower build is still better than no build at all. We can't even bootstrap rustc on some platforms anymore due to this problem.
>>
>> The approach in this patch seems pretty sensible as a simple starting point and was previously suggested by @aeubanks in https://bugs.llvm.org/show_bug.cgi?id=45253 as well.
>
> This patch as is would be too aggressive as discussed above. There had been some discussion off patch. We converged on the proposal above to introduce `-inline-size-limit` and `-stop-inline-for-size-limit` - that doesn't change default behavior so it won't suddenly throttle performance for others, but offers users the ability to get around pathological cases when needed. Would that solve the problem you have at hand?

I don't think this is a solution to this problem, for two reasons: First, having builds finish in reasonable time should not be a feature that is hidden behind an option. It's okay to have an option to opt-in to longer (potentially pathologically longer) compile-time for more optimization, but it should not be the default. If this is solved by an inlining size limit, it should be enabled by default.

Second, while an inlining size limit does mitigate this issue somewhat, it doesn't really solve it. The core problem is that the inliner is broken in its handling of recursive calls, and a fix for that issue needs to involve either the SCC structure or the inlining history. For example, we are seeing this issue in the context of recursive destructors, and it's perfectly plausible for the same destructor to be invoked from many functions. If the inlining size limit prevents these functions from blowing up to hundreds of thousands of instructions and instead leaves them all at "only" 20k (or whatever the size limit would be), that's of course already a win and will mitigate the worst problems, but it's still not the ideal outcome and does not address the fundamental issue. The ideal outcome would be that this inlining doesn't happen and all the functions stay small. A per-function inlining size limit is still not "global enough" in its consideration.

I do want to mention that this can also help with performance (apart from the obvious benefit of not blowing up code size beyond all reasonable bounds and completely trashing the instruction cache). For example, we're currently forced to mark all destructors in landingpads as noinline, because the pathological inlining problem would be even worse otherwise. However, this is problematic because it prevents inlining of trivial destructors as well, so we miss opportunities to simplify or entirely optimize away landingpads. We would not be forced to do this if the inliner exhibited sensible behavior.

Now, I'm not saying this patch is perfect in it's current form -- it seems worthwhile to at least try to implement the additional fan-out restriction that has been suggested above. But the core idea here seems like the right direction...

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98481/new/

https://reviews.llvm.org/D98481