[LLVMdev] [GSoC] Optimizing for size

Tue Apr 5 09:29:07 PDT 2011

Hi all,

I'm interested in adding code size optimizations as a GSoC project. This is
necessary when compiling for embedded devices, where LLVM should optimize for
size even at the expense of speed. I'm still working on my proposal, but I'd
like some advice on the technical parts and overall project plan.

First, I would add a way to determine which parts of the code should be optimized
for size. This is currently done with an OptimizeForSize attribute on
functions, which is added by clang with -Os. I would add a more flexible method
that determines whether a given BB should be optimized for size, depending on
profiling information (if available) and -Os and other options. I could put
this in ProfileInfoT, since it would interact closely with profiling
information.

The next step is to make optimization passes use this information. This means
updating passes that currently check the OptimizeForSize attribute to use the
new information instead. I would also make JumpThreading, InlineCost, and
possibly other passes aware of size optimization.

After working on existing passes, I would add a new MergeBlocks pass. This is
an IPO pass that would combine equivalent basic blocks, extracting them into
new functions. Research has shown that this can decrease code size by 5-6%. The
new pass will be based on CodeExtractor and MergeFunctions; it will create a
hash table of every basic block, based on the number of instructions of
different types, and then perform a detailed comparison on blocks with the same
hash. I would first write an inefficient preliminary version that only finds
obvious cases. This can be used to get an impression of the technique's
effectiveness. I would then work on making it more efficient and able to merge
blocks in more difficult cases.

Tentative schedule:
2-3 weeks: Add -Os to llc and opt. Use profiling information or -Os to
           determine which BBs should be optimized for size.
3-4 weeks: Make existing passes aware of size optimization.
2-3 weeks: Preliminary implementation of block merging.
2-3 weeks: Improve block merging algorithm speed.
1-4 weeks: Make block merging handle more difficult cases.

Any thoughts?

Thanks,
Sean Bartell
(wtachi in #llvm)