[LLVMdev] Postponing more passes in LTO

Mon Dec 15 11:27:18 PST 2014

I have done some preliminary investigation into postponing some of the
passes to see what the resulting performance impact would be. This is a
fairly crude attempt at moving passes around to see if there is any
potential benefit. I have attached the patch I used to do the tests, in case
anyone is interested. 

Briefly, the patch adds two alternative flows, selected with the flag
-lto-new or -lto-new2. With -lto-new, the vectorization passes are moved
from the end of populateModulePassManager() to midway through
addLTOOptimizationPasses(). With -lto-new2, essentially the entire
populateModulePassManager() function is deferred until link time.
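
As a rough illustration of the three schedules, here is a toy model in Python. It is not taken from the patch: the pass names are placeholders, and the position of the deferred passes within the link step for -lto-new2 is an assumption made only for the sketch.

```python
# Toy model of where groups of passes run. All names are placeholders,
# not real LLVM pass names; the real pipelines contain many more passes.
PER_FILE = ["per-file-scalar-opts", "per-file-loop-opts"]
VECTORIZE = ["vectorization"]
LTO_EARLY = ["lto-first-half"]
LTO_LATE = ["lto-second-half"]


def schedule(flow):
    """Return (compile_time_passes, link_time_passes) for a given flow."""
    if flow == "lto":
        # Existing flow: vectorization runs per file, at the end.
        return PER_FILE + VECTORIZE, LTO_EARLY + LTO_LATE
    if flow == "lto-new":
        # Vectorization moved to midway through the link-time passes.
        return PER_FILE, LTO_EARLY + VECTORIZE + LTO_LATE
    if flow == "lto-new2":
        # Essentially everything deferred to link time. Running the
        # deferred passes before the LTO passes is an assumption here.
        return [], PER_FILE + VECTORIZE + LTO_EARLY + LTO_LATE
    raise ValueError(f"unknown flow: {flow}")


for flow in ("lto", "lto-new", "lto-new2"):
    compile_time, link_time = schedule(flow)
    print(f"{flow}: compile={compile_time} link={link_time}")
```

In every flow the same set of pass groups runs; only the stage at which each group runs changes.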

I ran spec2000/2006 on an ARM platform (Nexus 4), comparing 4 configurations
(O3, O3 LTO, O3 LTO new, O3 LTO new 2). I have attached a PDF presenting the
results. The first 4 columns give the spec result (ratio) for each of the 4
configurations. The second set of columns gives each result divided by the
maximum result across the 4 configurations for that benchmark. I used this
second set to see how well or poorly a particular configuration did in
comparison to the others.
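
The normalization in the second set of columns can be sketched as follows. This is a minimal illustration only: the function name is hypothetical and the numbers are made up for the example, not taken from the attached PDF.

```python
def normalize_to_best(ratios):
    """Divide each configuration's ratio by the best ratio for the benchmark."""
    best = max(ratios.values())
    return {config: value / best for config, value in ratios.items()}


# Made-up ratios for one hypothetical benchmark (not real results).
example = {"O3": 8.0, "O3 LTO": 10.0, "O3 LTO new": 9.0, "O3 LTO new 2": 5.0}
normalized = normalize_to_best(example)

for config, score in normalized.items():
    print(f"{config:>13}: {score:.2f}")
```

The best configuration for a benchmark scores 1.0, and every other configuration scores the fraction of that best result it achieved.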

In general, there appears to be some benefit to be gained in a couple of the
benchmarks (spec2000/art, spec2006/milc) by postponing vectorization. 

I just wanted to present this information to the community to see if there
is interest in pursuing the idea of postponing passes.

Daniel

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Daniel Stewart
Sent: Wednesday, September 17, 2014 9:46 AM
To: llvmdev at cs.uiuc.edu
Subject: [LLVMdev] Postponing more passes in LTO

Looking at the existing flow of passes for LTO, it appears that almost all
passes are run on a per-file basis, before the call to the gold linker. I'm
looking to get people's feedback on whether there would be an advantage to
waiting to run a number of these passes until the linking stage. For
example, I believe I saw a post a little while back about postponing
vectorization until the linking stage. It seems to me that there could be an
advantage to postponing (some) passes until the linking stage, where the
entire graph is available. In general, what do people think about the idea
of a different flow of LTO where more passes are postponed until the linking
stage? 

Daniel Stewart

--

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation

-------------- next part --------------
A non-text attachment was scrubbed...
Name: newflow.patch
Type: application/octet-stream
Size: 6713 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141215/354f7b70/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Community LLVM on Nexus4.pdf
Type: application/pdf
Size: 123260 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141215/354f7b70/attachment.pdf>