[llvm-commits] [llvm] r153506 - /llvm/trunk/lib/Transforms/IPO/Inliner.cpp
Matt Beaumont-Gay
matthewbg at google.com
Tue Mar 27 09:55:38 PDT 2012
And the award for greatest ratio of change message to text changed goes to...
On Tue, Mar 27, 2012 at 03:48, Chandler Carruth <chandlerc at gmail.com> wrote:
> Author: chandlerc
> Date: Tue Mar 27 05:48:28 2012
> New Revision: 153506
>
> URL: http://llvm.org/viewvc/llvm-project?rev=153506&view=rev
> Log:
> Make a seemingly tiny change to the inliner and fix the generated code
> size bloat. Unfortunately, I expect this to disable the majority of the
> benefit from r152737. I'm hopeful at least that it will fix PR12345. To
> explain this requires... quite a bit of backstory I'm afraid.
>
> TL;DR: The change in r152737 actually did The Wrong Thing for
> linkonce-odr functions. This change makes it do the right thing. The
> benefits we saw were simple luck, not any actual strategy. Benchmark
> numbers after a mini-blog-post so that I've written down my thoughts on
> why all of this works and doesn't work...
>
> To understand what's going on here, you have to understand how the
> "bottom-up" inliner actually works. There are two fundamental modes to
> the inliner:
>
> 1) Standard fixed-cost bottom-up inlining. This is the mode we usually
> think about. It walks from the bottom of the CFG up to the top,
> looking at callsites, taking information about the callsite and the
> called function and computing th expected cost of inlining into that
> callsite. If the cost is under a fixed threshold, it inlines. It's
> a touch more complicated than that due to all the bonuses, weights,
> etc. Inlining the last callsite to an internal function gets higher
> weighth, etc. But essentially, this is the mode of operation.
>
> 2) Deferred bottom-up inlining (a term I just made up). This is the
> interesting mode for this patch an r152737. Initially, this works
> just like mode #1, but once we have the cost of inlining into the
> callsite, we don't just compare it with a fixed threshold. First, we
> check something else. Let's give some names to the entities at this
> point, or we'll end up hopelessly confused. We're considering
> inlining a function 'A' into its callsite within a function 'B'. We
> want to check whether 'B' has any callers, and whether it might be
> inlined into those callers. If so, we also check whether inlining 'A'
> into 'B' would block any of the opportunities for inlining 'B' into
> its callers. We take the sum of the costs of inlining 'B' into its
> callers where that inlining would be blocked by inlining 'A' into
> 'B', and if that cost is less than the cost of inlining 'A' into 'B',
> then we skip inlining 'A' into 'B'.
>
> Now, in order for #2 to make sense, we have to have some confidence that
> we will actually have the opportunity to inline 'B' into its callers
> when cheaper, *and* that we'll be able to revisit the decision and
> inline 'A' into 'B' if that ever becomes the correct tradeoff. This
> often isn't true for external functions -- we can see very few of their
> callers, and we won't be able to re-consider inlining 'A' into 'B' if
> 'B' is external when we finally see more callers of 'B'. There are two
> cases where we believe this to be true for C/C++ code: functions local
> to a translation unit, and functions with an inline definition in every
> translation unit which uses them. These are represented as internal
> linkage and linkonce-odr (resp.) in LLVM. I enabled this logic for
> linkonce-odr in r152737.
>
> Unfortunately, when I did that, I also introduced a subtle bug. There
> was an implicit assumption that the last caller of the function within
> the TU was the last caller of the function in the program. We want to
> bonus the last caller of the function in the program by a huge amount
> for inlining because inlining that callsite has very little cost.
> Unfortunately, the last caller in the TU of a linkonce-odr function is
> *not* the last caller in the program, and so we don't want to apply this
> bonus. If we do, we can apply it to one callsite *per-TU*. Because of
> the way deferred inlining works, when it sees this bonus applied to one
> callsite in the TU for 'B', it decides that inlining 'B' is of the
> *utmost* importance just so we can get that final bonus. It then
> proceeds to essentially force deferred inlining regardless of the actual
> cost tradeoff.
>
> The result? PR12345: code bloat, code bloat, code bloat. Another result
> is getting *damn* lucky on a few benchmarks, and the over-inlining
> exposing critically important optimizations. I would very much like
> a list of benchmarks that regress after this change goes in, with
> bitcode before and after. This will help me greatly understand what
> opportunities the current cost analysis is missing.
>
> Initial benchmark numbers look very good. WebKit files that exhibited
> the worst of PR12345 went from growing to shrinking compared to Clang
> with r152737 reverted.
>
> - Bootstrapped Clang is 3% smaller with this change.
> - Bootstrapped Clang -O0 over a single-source-file of lib/Lex is 4%
> faster with this change.
>
> Please let me know about any other performance impact you see. Thanks to
> Nico for reporting and urging me to actually fix, Richard Smith, Duncan
> Sands, Manuel Klimek, and Benjamin Kramer for talking through the issues
> today.
>
> Modified:
> llvm/trunk/lib/Transforms/IPO/Inliner.cpp
>
> Modified: llvm/trunk/lib/Transforms/IPO/Inliner.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Inliner.cpp?rev=153506&r1=153505&r2=153506&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/IPO/Inliner.cpp (original)
> +++ llvm/trunk/lib/Transforms/IPO/Inliner.cpp Tue Mar 27 05:48:28 2012
> @@ -260,7 +260,7 @@
> int TotalSecondaryCost = 0;
> bool outerCallsFound = false;
> // This bool tracks what happens if we do NOT inline C into B.
> - bool callerWillBeRemoved = true;
> + bool callerWillBeRemoved = Caller->hasLocalLinkage();
> // This bool tracks what happens if we DO inline C into B.
> bool inliningPreventsSomeOuterInline = false;
> for (Value::use_iterator I = Caller->use_begin(), E =Caller->use_end();
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list