[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
Andrey Bokhanko via llvm-dev
llvm-dev at lists.llvm.org
Sat Jul 22 15:23:23 PDT 2017
Hi River,
Very impressive! -- thanks for working on this.
A few questions, if you don't mind.
First, on results (from goo.gl/5k6wsP). Some of them are quite surprising.
In theory, "top improvements" should be quite similar in all three
approaches ("Early&Late Outlining", "Late Outlining" and "Machine
Outliner"), with E&LO capturing most of the cases. Yet, they are very
different:
Test Suite, top improvements:
E&LO:
- enc-3des: 66.31%
- StatementReordering-dbl: 51.45%
- Symbolics-dbl: 51.42%
- Recurrences-dbl: 51.38%
- Packing-dbl: 51.33%
LO:
- enc-3des: 50.7%
- ecbdes: 46.27%
- security-rjindael: 45.13%
- ControlFlow-flt: 25.79%
- ControlFlow-dbl: 25.74%
MO:
- ecbdes: 28.22%
- Expansion-flt: 22.56%
- Recurrences-flt: 22.19%
- StatementReordering-flt: 22.15%
- Searching-flt: 21.96%
SPEC, top improvements:
E&LO:
- bzip2: 9.15%
- gcc: 4.03%
- sphinx3: 3.8%
- h264ref: 3.24%
- perlbench: 3%
LO:
- bzip2: 7.27%
- sphinx3: 3.65%
- namd: 3.08%
- gcc: 3.06%
- h264ref: 3.05%
MO:
- namd: 7.8%
- bzip2: 7.27%
- libquantum: 2.99%
- h264ref: 2%
Do you understand why?
I'm especially interested in cases where MO managed to find redundancies
while E&LO didn't. For example, 2.99% on libquantum (or is it simply
below the "top 5 results" for E&LO?) -- did you investigate this?
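
The "below top 5" possibility can be partly checked from the numbers quoted in this message alone. A minimal Python sketch, assuming the lists really are top-N rankings of the same metric; the dictionary and function names are mine (hypothetical), and benchmark names are lower-cased for comparison:

```python
# SPEC top improvements as quoted in this message (benchmark -> % reduction).
elo = {"bzip2": 9.15, "gcc": 4.03, "sphinx3": 3.8, "h264ref": 3.24, "perlbench": 3.0}
mo = {"namd": 7.8, "bzip2": 7.27, "libquantum": 2.99, "h264ref": 2.0}

def classify(mo_top, elo_top):
    """For benchmarks listed for MO but not for E&LO, decide whether the
    quoted numbers alone show MO ahead (hypothetical helper)."""
    # Smallest value that still made the E&LO list; an E&LO result above
    # this would have been listed (assumption: the list is a true ranking).
    cutoff = min(elo_top.values())
    verdicts = {}
    for name in sorted(set(mo_top) - set(elo_top)):
        verdicts[name] = "MO ahead" if mo_top[name] > cutoff else "inconclusive"
    return verdicts

verdicts = classify(mo, elo)
print(verdicts)
```

Under that assumption, libquantum is inconclusive (2.99% sits just under the lowest listed E&LO value), while namd looks like a real case where MO found more than E&LO.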
Also, it would be nice to specify the full list of options used for SPEC
(I assume SPEC CPU2006?), similar to how results are reported on spec.org.
And a few questions on the RFC:
On Fri, Jul 21, 2017 at 12:47 AM, River Riddle via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> * Debug Info:
>
> Debug information is preserved for the calls to functions which have been
> outlined but all debug info from the original outlined portions is removed,
> making them harder to debug.
>
Just to check that I understand correctly: you remove *all* debug info in
outlined functions, essentially making them undebuggable -- correct? Did
you consider copying debug info from one of the outlined fragments instead
-- at least line numbers?
> The execution time results are to be expected given that the outliner,
> without profile data, will extract from whatever region it deems
> profitable. Extracting from the hot path can lead to a noticeable
> performance regression on any platform, which can be somewhat avoided by
> providing profile data during outlining.
>
Some of the regressions are quite severe. It would be interesting to
implement what you stated above and measure again -- both code size
reductions and performance degradations.
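
To make the suggestion concrete, here is a minimal Python sketch of profile-guided candidate filtering. Everything in it is hypothetical: the `Candidate` record, the `filter_cold` helper, and the 1% hotness threshold are assumptions for illustration only and are not taken from the RFC or from any LLVM API.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str         # label for the repeated region (illustrative)
    bytes_saved: int  # estimated code size benefit of outlining it
    exec_count: int   # profile execution count of the enclosing block

def filter_cold(candidates, total_count, hot_fraction=0.01):
    """Keep profitable candidates whose block runs less often than the
    hotness threshold (a fraction of the total profile count)."""
    threshold = total_count * hot_fraction
    return [c for c in candidates
            if c.bytes_saved > 0 and c.exec_count < threshold]

candidates = [
    Candidate("init_tables", 120, 1),
    Candidate("inner_loop_body", 200, 900_000),
    Candidate("error_path", 64, 0),
]
kept = filter_cold(candidates, total_count=1_000_000)
print([c.name for c in kept])
```

With these made-up counts the hot loop body is skipped and only the two cold regions remain, which is the trade-off worth measuring: some code size reduction is given up in exchange for keeping call overhead off the hot path.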
> * LTO:
>
> - LTO doesn’t have a code size pipeline, but %reductions over LTO are
> comparable to non LTO.
>
LTO is known to affect code size significantly (for example, by removing
redundant functions), so I'm frankly quite surprised that the results are
the same...
Yours,
Andrey
===
Compiler Architect
NXP