[llvm-commits] PATCH: First step in refactoring the threshold used in InlineCost computation

Thu Mar 15 02:08:03 PDT 2012

Hello,

This is a generic refactoring patch that shouldn't change any
functionality. Just sinking the threshold into the inline cost analysis,
and cleaning up the API of the InlineCost objects so that they can wrap up
both the cost and the threshold.
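
As a rough illustration of what "wrapping up both the cost and the
threshold" could look like (this is not the code in the attached patch, and
the class name, method names and sentinel values are all assumptions made
up for the example):

```cpp
#include <climits>

// Hypothetical sketch only: a cost result that remembers the threshold it
// was computed against. Names and sentinels are illustrative assumptions.
class InlineCostSketch {
  int Cost;      // estimated cost of inlining at this call site
  int Threshold; // budget the cost is compared against

  InlineCostSketch(int C, int T) : Cost(C), Threshold(T) {}

public:
  static InlineCostSketch get(int Cost, int Threshold) {
    return InlineCostSketch(Cost, Threshold);
  }
  // Sentinels for "inline no matter what" and "never inline".
  static InlineCostSketch always() { return InlineCostSketch(INT_MIN, 0); }
  static InlineCostSketch never() { return InlineCostSketch(INT_MAX, 0); }

  bool isAlways() const { return Cost == INT_MIN; }
  bool isNever() const { return Cost == INT_MAX; }
  bool isVariable() const { return !isAlways() && !isNever(); }

  // The client no longer has to carry the threshold around separately.
  bool shouldInline() const { return Cost < Threshold; }
};
```

With something of this shape, a client would only ask
InlineCostSketch::get(Cost, Threshold).shouldInline() rather than comparing
a raw cost against a threshold it has to track on its own.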

Below I've got a bit of background about what I'm working on, in case folks
are curious what's motivating this refactoring, and where I'm headed with
the inline cost stuff. Sorry for the long ramble that follows...

Duncan and I had a long discussion about how to more accurately compute the
inline cost for callsites. The particular problem I'm aiming at is
functions that look something like:

void foo(int size) {
  if (size == 1) { /* something small */ return ...; }
  if (size == 2) { /* something small */ return ...; }
  if (size == 3) { /* something small */ return ...; }
  if (size == 4) { /* something small */ return ...; }
  /* something huge for the general case, involving loops, function calls,
     all kinds of madness */
  return ...;
}

Here, a few unfortunate things happen with the current inline cost system:

1) We compute a single 'weight' for the function when size is a constant,
regardless of what the constant is.
2) We compute the weight by looking at each branch on a comparison of size
w/ a constant, and subtracting the average of the two sides from the total
function cost.
3) We compute this weight even if, for example, the very first basic block
is too large to ever inline.
4) We use ad-hoc folding logic to determine exactly what happens with the
constant because we don't have an *actual* constant.

I think we have a good idea of how to address these issues. It's a bit high
risk, but fortunately all but the last steps seem strict improvements to
the world anyways.

The general idea is to switch to a per-call-site analysis of the inline
cost, computing significantly less per-function ahead of time. Under this
model, we can walk the potentially-live basic blocks in CFG order,
propagating the actual constant arguments of the particular callsite
through the function. When a branch is proven through this propagation to
not be taken, we won't even look at it to compute the cost. The result is
that the computed cost at each callsite will reflect the exact maximum code
paths left after inlining. Even if the function has *wildly* divergent costs
on two different code paths, they will be properly accounted for. This will
both inline more often when the code path that results is short, and less
often when the code path that results is very large.
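
A minimal sketch of the per-call-site walk described above, using a toy CFG
rather than real LLVM IR. Everything here (ToyBlock, makeFoo, callSiteCost,
and the cost numbers) is a hypothetical stand-in chosen for the example; the
only thing it models is a function whose branches all compare the single
argument against a constant, like foo above.

```cpp
#include <optional>
#include <vector>

// Hypothetical toy CFG node; not an LLVM type.
struct ToyBlock {
  int Cost;                     // estimated cost of the block's instructions
  std::optional<int> TestValue; // set if the block ends in "if (arg == *TestValue)"
  int TrueSucc;                 // successor when the test holds, or -1
  int FalseSucc;                // other / unconditional successor, or -1 for return
};

// Toy version of foo(): four cheap special cases, then a huge general case.
// The costs (1, 5, 500) are made-up numbers.
std::vector<ToyBlock> makeFoo() {
  return {
      {1, 1, 1, 2},                  // 0: if (size == 1) goto 1 else goto 2
      {5, std::nullopt, -1, -1},     // 1: something small; return
      {1, 2, 3, 4},                  // 2: if (size == 2) goto 3 else goto 4
      {5, std::nullopt, -1, -1},     // 3: something small; return
      {1, 3, 5, 6},                  // 4: if (size == 3) goto 5 else goto 6
      {5, std::nullopt, -1, -1},     // 5: something small; return
      {1, 4, 7, 8},                  // 6: if (size == 4) goto 7 else goto 8
      {5, std::nullopt, -1, -1},     // 7: something small; return
      {500, std::nullopt, -1, -1},   // 8: something huge; return
  };
}

// Sum the cost of the blocks that can still be reached at one call site.
// If Arg is known, each test folds and only the taken edge is followed.
int callSiteCost(const std::vector<ToyBlock> &F, std::optional<int> Arg) {
  std::vector<bool> Visited(F.size(), false);
  std::vector<int> Worklist{0}; // start at the entry block
  int Cost = 0;
  while (!Worklist.empty()) {
    int I = Worklist.back();
    Worklist.pop_back();
    if (I < 0 || I >= static_cast<int>(F.size()) || Visited[I])
      continue;
    Visited[I] = true;
    const ToyBlock &B = F[I];
    Cost += B.Cost;
    if (B.TestValue && Arg) {
      // Branch proven one way: never look at the dead side.
      Worklist.push_back(*Arg == *B.TestValue ? B.TrueSucc : B.FalseSucc);
    } else {
      // Unknown argument or no test: every successor stays live.
      if (B.TestValue)
        Worklist.push_back(B.TrueSucc);
      Worklist.push_back(B.FalseSucc);
    }
  }
  return Cost;
}
```

In this toy model a call with size == 1 costs 6, a call with size == 7
costs 504, and a call with an unknown size costs 524, so the same callee
gets very different answers at different call sites.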

Now the problem with this approach in general is that it is *expensive*. It
scales very badly as described. We have some good ideas about how to
carefully limit the cost though. The core of the cost limiting is to have
the threshold available *while* computing the cost. The moment we cross the
threshold, we can early exit without looking any farther. This means we'll
only ever walk a range of the function proportional to the range we're
willing to inline and optimize at that callsite anyways. Next up, we can
memoize the results of the analysis per-callsite so we don't ever
re-compute the same cost metric twice. Finally, we can build up helper
tables about the function ahead of time that essentially allow the analysis
to work on a per-basic-block granularity, and per-*folded*-instruction.
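
The cost-limiting ideas above can be sketched on a toy CFG as follows. This
is an assumption-laden illustration, not the planned implementation:
ToyBlock, ToyCallSite, BoundedCost, boundedCallSiteCost and MemoizedAnalysis
are hypothetical names, the per-block Cost field stands in for the
precomputed helper tables, and the cache is keyed on a made-up call site id.

```cpp
#include <map>
#include <optional>
#include <vector>

// Hypothetical toy CFG node; not an LLVM type.
struct ToyBlock {
  int Cost;                     // precomputed per-block cost ("helper table")
  std::optional<int> TestValue; // set if the block ends in "if (arg == *TestValue)"
  int TrueSucc;                 // successor when the test holds, or -1
  int FalseSucc;                // other / unconditional successor, or -1 for return
};

struct BoundedCost {
  int Cost;           // cost accumulated before the walk stopped
  bool OverThreshold; // true if the walk bailed out early
  int BlocksVisited;  // how much of the function was actually examined
};

// Same kind of walk as a plain reachable-cost sum, but it stops as soon as
// the running cost reaches the threshold.
BoundedCost boundedCallSiteCost(const std::vector<ToyBlock> &F,
                                std::optional<int> Arg, int Threshold) {
  std::vector<bool> Visited(F.size(), false);
  std::vector<int> Worklist{0}; // start at the entry block
  int Cost = 0, NumVisited = 0;
  while (!Worklist.empty()) {
    int I = Worklist.back();
    Worklist.pop_back();
    if (I < 0 || I >= static_cast<int>(F.size()) || Visited[I])
      continue;
    Visited[I] = true;
    ++NumVisited;
    const ToyBlock &B = F[I];
    Cost += B.Cost;
    if (Cost >= Threshold)
      return {Cost, true, NumVisited}; // early exit: too expensive already
    if (B.TestValue && Arg) {
      Worklist.push_back(*Arg == *B.TestValue ? B.TrueSucc : B.FalseSucc);
    } else {
      if (B.TestValue)
        Worklist.push_back(B.TrueSucc);
      Worklist.push_back(B.FalseSucc);
    }
  }
  return {Cost, false, NumVisited};
}

// Hypothetical call site description: an id, the callee, and the constant
// argument if one is known.
struct ToyCallSite {
  int Id;
  const std::vector<ToyBlock> *Callee;
  std::optional<int> Arg;
};

// Memoizes the bounded result per call site so it is computed only once.
class MemoizedAnalysis {
  std::map<int, BoundedCost> Cache;
  int Threshold;

public:
  int NumWalks = 0; // number of times the CFG was actually walked

  explicit MemoizedAnalysis(int T) : Threshold(T) {}

  const BoundedCost &get(const ToyCallSite &CS) {
    auto It = Cache.find(CS.Id);
    if (It != Cache.end())
      return It->second;
    ++NumWalks;
    return Cache
        .emplace(CS.Id, boundedCallSiteCost(*CS.Callee, CS.Arg, Threshold))
        .first->second;
  }
};
```

A real cache would presumably also need to be invalidated whenever the
callee or the call site's arguments change; the sketch ignores that.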

Anyways, I'm rather optimistic that we can make the per-callsite analysis
sufficiently fast, and sufficiently well cached that it will be tractable
in terms of compile time, and the benefit to accuracy of cost estimation is
*huge* for code patterns that tend to show up in hot parts of the code
base, such as hybrid generic algorithms. It also has the potential to cut
off some overly eager inlining due to misbehaved bonuses when in fact the
giant slow path is the one which will be selected.

-Chandler
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inline-cost-refactor1.diff
Type: text/x-patch
Size: 16024 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120315/f7fee4de/attachment.bin>