[llvm-commits] [PATCH] Teach the inline cost analysis to forward stores of constants in the caller to loads in the callee.

Thu Dec 27 04:20:52 PST 2012

The goal of this patch is to teach the inline cost analysis to recognize
a fundamental pattern in C and C++ programs it currently misses.

The inline cost analysis tries to determine if inlining a particular callsite
will exceed the cost "threshold". This cost is parameterized on the contextual
constant folding possible at the particular callsite, which largely consists of
propagating constant call arguments through the function body and estimating
only the live code's cost.
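
To make the argument-driven case concrete, here is a minimal sketch. The names
`scale` and `callScale` are hypothetical and do not come from the patch. Because
the call site passes the constant `false`, the slow path is dead at that call
site and only the cheap tail needs to be costed.

```cpp
#include <cassert>

// Hypothetical callee: which branch runs depends only on a formal argument.
int scale(int x, bool slow_path) {
  if (slow_path) {
    // Stand-in for a large, expensive body.
    assert(x >= 0);
    int acc = 0;
    for (int i = 0; i < x; ++i)
      acc += i * x;
    return acc;
  }
  return x * 2;
}

// Hypothetical caller: slow_path is the constant 'false' at this call site,
// so propagating it through 'scale' leaves only 'return x * 2' live.
int callScale(int x) { return scale(x, false); }
```

Here the constant reaches the callee directly as an argument, which is the case
the existing analysis already handles.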

This works reasonably well when the primary mechanism for parameterizing the
function is its formal arguments. However, arguments are often "packaged" into
a structure of some kind, and that aggregate is in turn passed to the function.
This pattern is epitomized by C++ member functions, which all receive a 'this'
pointer; much of their parameterization takes place through data members
rather than function arguments.
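
A small sketch of the "packaged" form, again with hypothetical names (`Scaler`,
`apply`, `callApply`) that are not taken from the patch. The constants never
appear as call arguments: the caller stores them into an object, and the callee
loads them back through 'this'.

```cpp
#include <cassert>

// Hypothetical aggregate: the parameterization lives in data members.
struct Scaler {
  bool slow_path;
  int factor;

  // The only argument that carries the configuration is the implicit 'this'.
  int apply(int x) const {
    if (slow_path) { // load through 'this'
      // Stand-in for a large, expensive body.
      assert(x >= 0);
      int acc = 0;
      for (int i = 0; i < x; ++i)
        acc += i * factor;
      return acc;
    }
    return x * factor; // another load through 'this'
  }
};

// Hypothetical caller: both members are stored as constants right before the
// call, but an analysis that only looks at call arguments sees just a pointer.
int callApply(int x) {
  Scaler s;
  s.slow_path = false;
  s.factor = 2;
  return s.apply(x);
}
```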

This patch attempts to detect these patterns by scanning from the call site up
through the preceding instruction sequence to locate stores of constants which
might be visible within the callee. Once it has collected all of the stored
constants that it can trivially prove are live at the call site, it simplifies
loads of these constants and propagates those simplifications when analyzing
the callee.
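
The backward scan can be sketched with a toy model. This is not the code in the
patch and does not use LLVM's IR classes: `Kind`, `Inst`,
`collectStoredConstants`, and `simplifyLoad` are all hypothetical, addresses are
plain strings, and distinct strings are assumed never to alias, which a real
implementation would have to prove.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy instruction kinds (hypothetical; far simpler than real IR).
enum class Kind { StoreConst, StoreUnknown, MayWriteMemory, Other };

struct Inst {
  Kind kind;
  std::string addr; // symbolic address, used by the two store kinds
  int value;        // stored constant, used by StoreConst only
};

// Walk upward from the call site and collect constants that are trivially
// still in memory when the call executes.
std::map<std::string, int>
collectStoredConstants(const std::vector<Inst> &block, std::size_t callIndex) {
  assert(callIndex <= block.size());
  std::map<std::string, int> known;
  std::set<std::string> seen; // addresses already written by a later store
  for (std::size_t i = callIndex; i-- > 0;) {
    const Inst &inst = block[i];
    if (inst.kind == Kind::MayWriteMemory)
      break; // anything above this may have been clobbered; give up
    if (inst.kind == Kind::Other)
      continue; // does not touch memory in this toy model
    if (!seen.insert(inst.addr).second)
      continue; // shadowed by a store closer to the call
    if (inst.kind == Kind::StoreConst)
      known[inst.addr] = inst.value;
  }
  return known;
}

// Forward a collected constant to a load in the callee, if one is known.
bool simplifyLoad(const std::map<std::string, int> &known,
                  const std::string &addr, int &out) {
  auto it = known.find(addr);
  if (it == known.end())
    return false;
  out = it->second;
  return true;
}
```

In this sketch the scan is deliberately conservative: it stops at the first
instruction that might write memory in an unknown way, and it treats an address
as unknown once a non-constant store to it has been seen.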

As an example, consider analyzing libstdc++ 4.6's std::vector implementation on
Linux, given the following program:

  #include <vector>

  void f() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
  }

The push_back method is already simple to analyze and gets inlined. However,
within its code is an auxiliary helper routine to actually do the insertion.
This routine simplifies dramatically after inlining because all of its logic is
to handle things that are known trivially -- for each case we can prove the
exact size of the buffer, the location to insert, etc. However, these
invariants are not always visible when analyzing this insertion routine. After
a certain point, the analysis considers the entire insertion routine to be
live, and it has an estimated cost of '430'. With this patch, we're able to
better analyze that call, deleting one large path through the function, and
computing a cost of '290'. The inline threshold at -O3 for this routine is
'275'. The '290' isn't really accurate: this function will become about 2 basic
blocks and 10 instructions when all is said and done, and later it will be
deleted entirely. We still have more work to do here, but this is a good first
step.

Sadly, the code here is pretty heinous. I'd love ideas about how to simplify
it, or generalize it, or make it more principled. It's a very brute-force
approach currently. I've not written comprehensive tests for the functionality
here as I wanted some feedback on the general approach and structure of the
solution first. It does survive a bootstrap and so it should be viable for
folks to experiment with.

Thanks!
-Chandler

http://llvm-reviews.chandlerc.com/D246

Files:
  include/llvm/Analysis/InlineCost.h
  lib/Analysis/InlineCost.cpp
  lib/Transforms/IPO/InlineSimple.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D246.1.patch
Type: text/x-patch
Size: 29894 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121227/b73e4f6b/attachment.bin>