[llvm-dev] Proposal for O1/Og Optimization and Code Generation Pipeline
Eric Christopher via llvm-dev
llvm-dev at lists.llvm.org
Thu Mar 28 19:09:19 PDT 2019
Hi All,
I’ve been thinking about both O1 and Og optimization levels and have a
proposal for an improved O1 that I think overlaps in functionality
with our desires for Og. The design goal is to rewrite the O1
optimization and code generation pipeline to include the set of
optimizations that minimizes build and test time while retaining our
ability to debug.
This isn’t meant to minimize efforts around optimized debugging or to
negate O0 builds, but rather to provide a compromise mode that
encompasses some of the benefits of both: in effect, a “build mode for
everyday development”.
This proposal is a first approximation of the direction. I’ll be
exploring different options and combinations, but I think this is a
good place to start for discussion. Unless there are serious
objections to the general direction I’d like to get started so we can
explore and look at the code as it comes through review.
Optimization and Code Generation Pipeline
The optimization passes chosen fall into a few main categories:
redundancy elimination and basic optimization/abstraction elimination.
The idea is that these are going to be the optimizations that a
programmer would expect to happen without affecting debugging. This
means not eliminating redundant calls or non-redundant loads as those
could fail in different ways and locations while executing. These
optimizations will also reduce the overall amount of code going to the
code generator, helping both linker input size and code generation
speed.
Dead code elimination
- Dead code elimination (ADCE, BDCE)
- Dead store elimination
- Parts of CFG Simplification
  - Removing branches and dead code paths, but not commoning
    or speculation
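As a rough illustration of this category, here is a minimal sketch of mark-and-sweep style dead instruction removal over a toy SSA-like IR. The `Inst` tuple and its `side_effect` flag are assumptions made up for this example; they are not LLVM data structures, and the real ADCE/BDCE passes are considerably more involved.

```python
from typing import NamedTuple


class Inst(NamedTuple):
    dest: str          # name defined by this instruction ("" if none)
    op: str            # opcode, kept only for readability
    args: tuple        # names of the operands
    side_effect: bool  # True for stores, calls, returns, ...


def eliminate_dead_code(insts):
    """Keep side-effecting instructions and everything they depend on.

    Assumes each name is defined at most once (SSA-like input).
    """
    defs = {inst.dest: idx for idx, inst in enumerate(insts) if inst.dest}
    live = set()
    # Roots: anything with an observable effect must stay.
    worklist = [idx for idx, inst in enumerate(insts) if inst.side_effect]
    while worklist:
        idx = worklist.pop()
        if idx in live:
            continue
        live.add(idx)
        # Everything a live instruction reads is live as well.
        for arg in insts[idx].args:
            if arg in defs:
                worklist.append(defs[arg])
    return [inst for idx, inst in enumerate(insts) if idx in live]


program = [
    Inst("a", "const", (), False),
    Inst("b", "const", (), False),
    Inst("c", "add", ("a", "b"), False),  # feeds the return: live
    Inst("d", "mul", ("a", "a"), False),  # never used: dead
    Inst("", "ret", ("c",), True),
]
optimized = eliminate_dead_code(program)
```

Only the unused `mul` is dropped here; the instructions feeding the return stay in their original order.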
Basic Scalar Optimizations
- Constant propagation including SCCP and IPCP
- Constant merging
- Instruction Combining
- Inlining: always_inline and normal inlining passes
- Memory to register promotion
- CSE of “unobservable” operations
- Reassociation of expressions
- Global optimizations - try to fold globals to constants
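To make the “unobservable” distinction concrete, here is a minimal sketch of common subexpression elimination that only merges pure arithmetic and conservatively leaves every call and load alone. The `PURE_OPS` set and the `(dest, op, args)` tuple format are assumptions for illustration, not a statement of which LLVM instructions the proposal would treat as unobservable.

```python
# Assumed set of operations with no observable effect (hypothetical).
PURE_OPS = {"add", "sub", "mul", "and", "or", "xor"}


def cse_unobservable(insts):
    """insts: list of (dest, op, args) tuples, each dest defined once."""
    available = {}  # (op, args) -> dest that already computes it
    replaced = {}   # eliminated dest -> surviving dest
    out = []
    for dest, op, args in insts:
        # Rewrite operands that referred to an eliminated value.
        args = tuple(replaced.get(a, a) for a in args)
        key = (op, args)
        if op in PURE_OPS and key in available:
            replaced[dest] = available[key]
            continue  # drop the duplicate pure computation
        if op in PURE_OPS:
            available[key] = dest
        out.append((dest, op, args))
    return out


block = [
    ("t1", "add", ("x", "y")),
    ("t2", "add", ("x", "y")),    # same pure computation as t1
    ("c1", "call", ("f", "t1")),
    ("c2", "call", ("f", "t2")),  # looks redundant, but calls are kept
    ("l1", "load", ("p",)),
    ("l2", "load", ("p",)),       # loads are kept in this sketch too
]
result = cse_unobservable(block)
```

The second `add` is folded into the first, while both calls and both loads survive, so a failure inside either call would still happen at the location a programmer expects.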
Loop Optimizations
Loop optimizations have some problems around debuggability and
observability, but a suggested set of passes would include
optimizations that remove abstractions and not ones that necessarily
optimize for performance.
- Induction Variable Simplification
- LICM but not promotion
- Trivial Unswitching
- Loop rotation
- Full loop unrolling
- Loop deletion
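One way to read “LICM but not promotion” is sketched below: pure computations whose operands are all loop-invariant are hoisted to a preheader, while loads and stores stay inside the loop. The `HOISTABLE_OPS` set and the tuple-based loop body are assumptions for this toy example and do not reflect how LLVM's LICM pass is implemented.

```python
# Assumed set of side-effect-free operations that may be hoisted.
HOISTABLE_OPS = {"add", "sub", "mul"}


def hoist_invariants(loop_body, defined_outside):
    """Split a straight-line loop body into (preheader, remaining body).

    loop_body: list of (dest, op, args) tuples, definitions before uses.
    defined_outside: names whose values do not change inside the loop.
    """
    invariant = set(defined_outside)
    preheader, body = [], []
    for dest, op, args in loop_body:
        if op in HOISTABLE_OPS and all(a in invariant for a in args):
            preheader.append((dest, op, args))
            invariant.add(dest)  # later instructions may build on it
        else:
            body.append((dest, op, args))  # memory ops are never moved
    return preheader, body


loop = [
    ("k", "mul", ("n", "m")),   # invariant: n and m come from outside
    ("k2", "add", ("k", "n")),  # invariant once k has been hoisted
    ("v", "load", ("p",)),      # memory access: stays in the loop
    ("s", "add", ("v", "i")),   # depends on the induction variable i
    ("", "store", ("p", "s")),  # store: stays in the loop
]
pre, body = hoist_invariants(loop, {"n", "m", "p"})
```

Because the load and store through `p` are left in place, the memory location is updated on every iteration and remains inspectable in a debugger.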
Pass Structure
Overall pass ordering will look similar to the existing pass layout in
LLVM, with passes added or removed for O1, rather than a new pass
ordering. The motivation here is to make the overall proposal easier
to understand upstream initially while also preserving the existing
synergies between passes in the pipeline.
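The “add or remove, but keep the order” idea can be sketched as a small helper that derives one pass list from another. The base list and the choice of what to drop or add below are placeholders for illustration only; they are neither the existing LLVM pipeline nor the proposed O1 contents.

```python
def derive_pipeline(base, drop=frozenset(), add_after=None):
    """Derive a pipeline from `base`, preserving its relative order.

    drop: pass names to leave out.
    add_after: mapping from a base pass name to passes inserted after it.
    """
    add_after = add_after or {}
    result = []
    for name in base:
        if name not in drop:
            result.append(name)
        result.extend(add_after.get(name, []))
    return result


# Placeholder ordering, not the real pipeline.
base = ["mem2reg", "instcombine", "simplifycfg", "licm",
        "loop-vectorize", "adce"]
o1 = derive_pipeline(
    base,
    drop={"loop-vectorize"},
    add_after={"licm": ["loop-deletion"]},
)
```

Since the result is always a filtered and extended copy of `base`, passes that survive keep running in the same relative order as before.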
Instruction selection
We will use the fast instruction selector (where it exists) for three reasons:
- Significantly faster code generation than LLVM’s DAG-based
instruction selection
- Better debuggability than SelectionDAG - fewer instructions moved around
- Fast instruction selection has been optimized somewhat and
shouldn’t be an outrageous penalty on most architectures
Register allocation
The fast register allocator should be used for compilation speed.
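The code generation choices above amount to a small per-target decision, sketched here. The function, the configuration keys, and the target names are all hypothetical; in particular, the set of targets that have a fast instruction selector is passed in as a parameter rather than asserted.

```python
def codegen_config(target, targets_with_fast_isel):
    """Pick code generation components for the proposed O1 (sketch)."""
    # Use the fast selector only where one exists for this target.
    isel = "fast" if target in targets_with_fast_isel else "selection-dag"
    # The fast register allocator is used regardless of target.
    return {"isel": isel, "regalloc": "fast"}


# Hypothetical target names, for illustration only.
has_fast_isel = {"target-a", "target-b"}
cfg_a = codegen_config("target-a", has_fast_isel)
cfg_c = codegen_config("target-c", has_fast_isel)
```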
Thoughts?
Thanks!
-eric