[LLVMdev] Controlling the LTO optimization level

Wed Mar 18 16:27:30 PDT 2015

Hi all,

I wanted to start a thread to discuss ways to control the optimization
level when using LTO. We have found that there are use cases for the LTO
mechanism beyond whole-program optimization, in which full optimization
is not always needed or desired. We started that discussion over in
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266560.html
and I thought I'd summarize the problem and possible solutions here:

Problem
-------

As currently implemented, control flow integrity (CFI) in Clang relies on
a so-called bit set lowering pass to implement its checks efficiently. The
current implementation of the bit set lowering pass requires whole-program
visibility. The full details of why are described in the design document at:
http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html

We currently achieve whole-program visibility using LTO. The trouble with LTO
is that it comes with a significant compile time cost: on large programs
such as Chrome, compiling with link-time optimization can be over 7x slower
than compiling without it (over 3 hours has been measured).

So I would like there to be a way for users to choose whether to apply
optimizations at link time, and how much optimization to apply.

Achieving this requires a design for how users should specify the level of
optimization to apply, as well as a design for changes to the clang driver
and the various LTO plugins so that the plugin knows whether optimizations
are required.

Solutions
---------

1) Controlled at compile time

Strawman proposal for command line syntax:

-flto-level=X means optimize at level X. At link time, the LTO plugin will
take the maximum of all -flto-level flags and optimize at that level.

-flto-level is inferred from other flags if not specified:

-flto implies -flto-level=2.
If -flto is not specified, -O >= 1 implies -flto-level=1.
Otherwise, default to -flto-level=0.
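
To make the strawman rules above concrete, here is a minimal Python sketch of
the inference and of the "take the maximum" step at link time. It only models
the proposed semantics; the function names and the way arguments are
represented are hypothetical and do not come from any patch.

```python
from typing import Iterable, Optional


def infer_lto_level(flto: bool, opt_level: int,
                    flto_level: Optional[int] = None) -> int:
    """Model the strawman rules for one compile invocation (hypothetical helper)."""
    if flto_level is not None:
        return flto_level  # an explicit -flto-level=X is used as given
    if flto:
        return 2           # -flto implies -flto-level=2
    if opt_level >= 1:
        return 1           # no -flto, but -O >= 1 implies -flto-level=1
    return 0               # otherwise default to -flto-level=0


def link_time_level(levels: Iterable[int]) -> int:
    """The LTO plugin takes the maximum level over all inputs.

    Assumption: with no inputs at all, fall back to level 0.
    """
    return max(levels, default=0)


if __name__ == "__main__":
    # Three hypothetical compile invocations feeding one link step.
    per_object = [
        infer_lto_level(flto=True, opt_level=2),    # -flto
        infer_lto_level(flto=False, opt_level=1),   # -O1 only
        infer_lto_level(flto=False, opt_level=0),   # no relevant flags
    ]
    print(link_time_level(per_object))
```

One consequence of taking the maximum is that a single object compiled with
-flto raises the whole link to level 2.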

This is probably easier to implement in a supported way. We can pass the
LTO level to the linker via module flags as shown in the patches attached to
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266778.html

2) Controlled at link time

-flto-level has the same semantics as in option 1, except that it is
passed at link time instead.

This is to a certain extent possible to implement with libLTO by passing
-mllvm flags to the linker, or with gold by passing -plugin-opt flags.

According to Duncan, passing flags to libLTO this way is unsupported. If we
did want to accept flags at link time, but absolutely don't want to pass them
to the linker that way, I suppose we could have the clang driver synthesize
a module containing the module flags we want.

Optimization Levels
-------------------

We need to decide what the various optimization levels mean. The thing that
works best for the CFI use case is:

-flto-level=2 means what -flto currently means.
-flto-level=1 means "run only the globaldce and simplifycfg passes".
-flto-level=0 means "run no passes".

This may not be the correct thing to do in every situation where we only
want a few passes to run at link time. We may want to make -flto-level a
cc1-level flag until we've had more experience and found more use cases.
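
For illustration, the level-to-passes mapping proposed for the CFI use case
could be written down as a small table. This is a Python sketch only: the
names LTO_LEVEL_PASSES and passes_for_level are hypothetical, and level 2 is
represented by None because "what -flto currently means" is whatever pipeline
the LTO plugin already runs, which is not spelled out here.

```python
from typing import Dict, List, Optional

# Hypothetical table mirroring the proposal. None stands for "the existing
# full LTO pipeline", which this sketch does not attempt to enumerate.
LTO_LEVEL_PASSES: Dict[int, Optional[List[str]]] = {
    0: [],                            # run no passes
    1: ["globaldce", "simplifycfg"],  # run only these two passes
    2: None,                          # what -flto currently means
}


def passes_for_level(level: int) -> Optional[List[str]]:
    """Return the pass names for a level, or None for the full pipeline."""
    if level not in LTO_LEVEL_PASSES:
        raise ValueError(f"unknown -flto-level value: {level}")
    return LTO_LEVEL_PASSES[level]


if __name__ == "__main__":
    for level in sorted(LTO_LEVEL_PASSES):
        print(level, passes_for_level(level))
```

Keeping the mapping in one table like this would make it easy to revise if
other use cases turn out to need a different set of passes at level 1.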

Thanks,
-- 
Peter