[PATCHES] A module inliner pass with a greedy call site queue

Hal Finkel hfinkel at anl.gov
Thu Jul 31 08:04:24 PDT 2014


----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Yin Ma" <yinma at codeaurora.org>
> Cc: "James Molloy" <james.molloy at arm.com>, "Nick Lewycky" <nicholas at mxc.ca>, "Hal Finkel" <hfinkel at anl.gov>,
> apazos at codeaurora.org, "Jiangning Liu" <Jiangning.Liu at arm.com>, "Commit Messages and Patches for LLVM"
> <llvm-commits at cs.uiuc.edu>
> Sent: Wednesday, July 30, 2014 5:52:19 PM
> Subject: Re: [PATCHES] A module inliner pass with a greedy call site queue
> 
> 
> 
> 
> 
> On Wed, Jul 30, 2014 at 3:46 PM, Yin Ma < yinma at codeaurora.org >
> wrote:
> 
> 
> 
> We found that the current SCC inliner cannot inline a critical
> function in one of the SPEC2000 benchmarks unless we increase the
> threshold to a very large number. For example, with A calling B and
> B calling C, the SCC inliner starts with B -> C and inlines C into
> B. The performance gain, however, comes from inlining B into A,
> because B sits in a loop inside A. After C has been inlined into B,
> B becomes so large that it can no longer be inlined into A under
> the default inline threshold.
> This is a well known and studied limitation of the inliner. I gave a
> talk at a prior dev meeting about it. There are specific ways to
> address it within the existing inlining framework that I've been
> working on for some time, but it requires infrastructure changes to
> implement.
> 

I realize that you've discussed this in the past, but perhaps a recap would be helpful. As I recall, the pass-manager rewrite will enable the inliner to make use of function-level analyses. Specifically, this includes making use of profiling data so that, when such data is available, we can avoid inlining cold call sites.

On a theoretical basis, I'm not sure that addresses the same issue as top-down inlining, although there is obviously some overlap in effect. Specifically, if a goal of inlining is to push the critical path length as close as possible to the target's limit for efficient dispatch, but no farther, then I fail to see how pure bottom-up inlining can ever achieve that as well as a top-down approach. That said, bottom-up inlining seems much more likely to expose optimization opportunities.

In short, while I'm not at all convinced that Yin's heuristic is the right approach, I feel that a more in-depth discussion on the goals of inlining, and how we do, or plan to, achieve them, would be beneficial.
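To make the A -> B -> C scenario from Yin's message concrete, here is a toy simulation of the threshold interaction. The function sizes and the threshold are invented purely for illustration, and the two orderings are deliberate caricatures; this is not LLVM's actual cost model or visitation order, just a sketch of why call-site ordering matters under a fixed size threshold:

```python
# Hypothetical function sizes and inline threshold, chosen only to
# reproduce the shape of the A -> B -> C example from this thread.
SIZE = {"A": 10, "B": 40, "C": 30}
THRESHOLD = 60  # assumed threshold for this sketch, not LLVM's default

def bottom_up(size, threshold):
    """Caricature of SCC-style order: innermost site (B -> C) first."""
    size = dict(size)
    inlined = []
    if size["C"] <= threshold:   # inline C into B; B grows to 70
        size["B"] += size["C"]
        inlined.append(("B", "C"))
    if size["B"] <= threshold:   # B is now too large for the hot A -> B site
        size["A"] += size["B"]
        inlined.append(("A", "B"))
    return inlined

def top_down(size, threshold):
    """Caricature of outermost-first order: the hot A -> B site first."""
    size = dict(size)
    inlined = []
    if size["B"] <= threshold:   # inline B into A while B is still small
        size["A"] += size["B"]
        inlined.append(("A", "B"))
    if size["C"] <= threshold:   # then inline C into the inlined copy of B
        size["A"] += size["C"]
        inlined.append(("B", "C"))
    return inlined

print(bottom_up(SIZE, THRESHOLD))  # [('B', 'C')] -- the A -> B site is rejected
print(top_down(SIZE, THRESHOLD))   # [('A', 'B'), ('B', 'C')]
```

Under these made-up numbers, the bottom-up order grows B past the threshold before the profitable A -> B site is ever considered, while the top-down order inlines both sites.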

Thanks again,
Hal

> 
> 
> 
> I'm still very concerned that the code size impact and other impacts
> have not been widely analyzed, but only narrowly analyzed. For
> example, LLVM's inliner has historically shown over 10% code size
> advantage across a wide range of benchmark C++ applications, large
> and small. I don't think we want to sacrifice that.
> 
> 
> 
> 
> I also don't think you should underestimate the impact of inlining on
> virtual dispatch. The collapsing of indirect calls to direct calls
> is *very* important and has been carefully tuned in LLVM's current
> inliner.

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
