[llvm-dev] [RFC] Enable Partial Inliner by default

Thu Nov 2 14:26:58 PDT 2017

Hello,

I'd like to propose turning on the partial inliner
(-enable-partial-inlining) by default.

We've seen small runtime gains on SPEC2006/2017, as well as small
compile-time gains on lnt when compiling with a second-stage bootstrapped
LLVM.  We also saw gains on our internal workloads.

-------------------------------------
Brief description of Partial Inlining
-------------------------------------
The partial inliner is a pass in opt that runs after the normal inlining
pass.  It looks for branches to a return block in the entry block of a
function and in that block's immediate successors.  If it finds one, it
outlines the rest of the function using the CodeExtractor.  It then
attempts to inline the leftover entry block (and possibly one or more of
its successors) into all of the function's callers.  This effectively peels
the early-return block(s) into the caller, so they can execute without
incurring the overhead of calling the function just to return immediately.
Inlining cost, call overhead, and the branch probabilities of the return
block(s) are taken into account before inlining is done.  If inlining is
not successful, the changes are discarded.

For example:

void foo() {
  bar();
  // rest of the code in foo
}

void bar() {
  if (X)
    return;
  // rest of code (to be outlined)
}

After Partial Inlining:

void foo() {
  if (!X)
    bar.outlined();
  // rest of the code in foo
}

void bar.outlined() {
  // rest of the code in bar
}
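
The before/after shapes above can also be mimicked by hand in plain C to see
where the saving comes from.  This is only a sketch, not output from the
pass: the counters, the drive() helper and the bool parameter standing in
for X are hypothetical, and bar_outlined is spelled with an underscore
because C identifiers cannot contain a '.'.

```c
#include <stdbool.h>

static int calls;      /* number of real function calls made */
static int work_done;  /* number of times the slow path ran */

/* Before: every invocation pays for a call, even when bar returns at once. */
static void bar(bool x) {
    calls++;
    if (x)
        return;
    work_done++;       /* stands in for "rest of code (to be outlined)" */
}

static void foo_before(bool x) {
    bar(x);
}

/* After: the slow path lives in its own function ... */
static void bar_outlined(void) {
    calls++;
    work_done++;
}

/* ... and the early-return test has been copied into the caller. */
static void foo_after(bool x) {
    if (!x)
        bar_outlined();
}

/* Hypothetical driver: runs f n times, with x false on every period-th
 * iteration and true otherwise, after resetting both counters. */
static void drive(void (*f)(bool), int n, int period) {
    calls = 0;
    work_done = 0;
    for (int i = 0; i < n; i++)
        f(i % period != 0);
}
```

Driving both versions with the same inputs does the same amount of slow-path
work, but foo_after only makes a call when the early return is not taken.
That is why the branch probability of the return block matters to the cost
model.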

Here are the numbers on a Power8 PPCLE running Ubuntu 15.04 in ST mode:

----------------------------------------------
Runtime performance (speed)
----------------------------------------------
Workload            Improvement
--------            -----------
SPEC2006(C/C++)     0.06%     (geomean)
SPEC2017(C/C++)     0.10%     (geomean)
----------------------------------------------
Compile time performance for Bootstrapped LLVM
----------------------------------------------
Workload            Improvement
--------            -----------
SPEC2006(C/C++)     0.41%     (cumulative)
SPEC2017(C/C++)     -0.16%    (cumulative)
lnt                 0.61%     (geomean)
----------------------------------------------
Compile time performance
----------------------------------------------
Workload            Increase
--------            --------
SPEC2006(C/C++)     1.31%     (cumulative)
SPEC2017(C/C++)     0.25%     (cumulative)
----------------------------------------------
Code size
----------------------------------------------
Workload            Increase
--------            --------
SPEC2006(C/C++)     3.90%     (geomean)
SPEC2017(C/C++)     1.05%     (geomean)

NOTE1: The code size increase in SPEC2006 is mainly attributable to the
"astar" benchmark, which grew by 86%.  Excluding this outlier, the increase
is a more reasonable 0.58%.
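
As a sanity check on the SPEC2006 code-size figures, the two geomeans and
the astar ratio should satisfy all^n = rest^(n-1) * outlier for n
benchmarks.  The sketch below takes the rounded values 1.0390, 1.0058 and
1.86 at face value and searches for the n that fits best.  The number of
benchmarks measured is not stated in this post, so n is treated as an
unknown, and the function names are hypothetical.

```c
/* Ratio all^n / (rest^(n-1) * outlier); it equals 1.0 when the overall
 * geomean, the geomean without the outlier, and the outlier itself agree
 * for n benchmarks. */
static double consistency_ratio(double all, double rest, double outlier,
                                int n) {
    double lhs = 1.0;
    double rhs = outlier;
    for (int i = 0; i < n; i++)
        lhs *= all;
    for (int i = 0; i < n - 1; i++)
        rhs *= rest;
    return lhs / rhs;
}

/* Returns the n in [2, max_n] whose consistency ratio is closest to 1.0. */
static int best_fit_count(double all, double rest, double outlier,
                          int max_n) {
    int best = 2;
    double best_err = -1.0;
    for (int n = 2; n <= max_n; n++) {
        double r = consistency_ratio(all, rest, outlier, n);
        double err = r > 1.0 ? r - 1.0 : 1.0 - r;
        if (best_err < 0.0 || err < best_err) {
            best_err = err;
            best = n;
        }
    }
    return best;
}
```

With these inputs the best fit is n = 19, which matches the number of C/C++
benchmarks in SPEC CPU2006, assuming all of them were run.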

NOTE2: There is a patch up for review on Phabricator to enhance the partial
inliner when profiling information is present
(https://reviews.llvm.org/D38190).

Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077      C2-707/8200/Markham
Email: gyiu at ca.ibm.com