[PATCH] D12199: Add framework for iterative compilation to llvm

Zoran Jovanovic via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 20 10:37:22 PDT 2015


zoran.jovanovic added a comment.

This is follow-up work on the iterative compilation framework for clang/llvm.
The initial discussion, including an introduction to the approach, took place at:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/thread.html#68784

Previous revisions of this patch (along with the suggestion to create a new revision) can be found at:

http://reviews.llvm.org/D4723

Here is a short explanation of how the decision tree is formed.

First, some passes need to be modified to call the getDecision function.
If this function returns false, the pass should behave exactly as the unmodified code would.
If it returns true, the alternative path is taken.

The getDecision function always returns false in the first iteration,
so the compiler works exactly as if there were no iterative framework.
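The contract described above can be sketched as follows. This is only an illustration: the shouldHoist function, its Cost and Threshold parameters, and the DecisionFn type are hypothetical and are not taken from the patch, and the real getDecision signature may differ.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the framework's decision oracle.
using DecisionFn = std::function<bool()>;

// A toy decision inside a pass. When the oracle returns false, the
// existing heuristic is used unchanged; when it returns true, the
// opposite (alternative) choice is made.
bool shouldHoist(int Cost, int Threshold, const DecisionFn &getDecision) {
  bool Default = Cost < Threshold; // existing heuristic
  if (getDecision())
    return !Default;               // alternative path
  return Default;                  // unmodified behaviour
}
```

With an oracle that always returns false, as in the first iteration, shouldHoist gives the same answer as the plain heuristic.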

Let us assume that the getDecision function is called four times in the first iteration of iterative compilation.
After the end of the first iteration, the decision tree will look like this:

  o
  \
    o
      \
        o
          \
            o
              \
                o

In the second iteration, exactly one decision is changed. The compiler works as in the first iteration until it reaches that particular decision, and at that point it takes the alternative path.
From then on, getDecision returns false until the iteration is finished. Let us assume that we took the alternative at the third decision and that there are three more calls
to getDecision. Note that different compilation paths may make different numbers of getDecision calls.

  o
  \
    o
      \
        o
      /   \
     o      o
       \      \
         o      o
           \
             o

Every new iteration adds one new branch to the decision tree.
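A minimal sketch of how such a tree could grow is given below. The Node and DecisionTree types and the startIteration method are hypothetical names chosen for this illustration, not the classes from the patch. Each iteration replays a prefix of earlier decisions and then answers false. In this sketch every getDecision call adds one edge, so node counts follow that convention.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// One node per reached state; Child[0] is the default (false) outcome,
// Child[1] the alternative (true) outcome.
struct Node {
  std::unique_ptr<Node> Child[2];
};

class DecisionTree {
public:
  // Begin an iteration that replays Prefix and answers false afterwards.
  void startIteration(std::vector<bool> Prefix) {
    Replay = std::move(Prefix);
    Path.clear();
    Cur = &Root;
  }

  // Returns the decision for the current call and extends the tree
  // if this outcome has not been seen before.
  bool getDecision() {
    bool D = false;
    if (Path.size() < Replay.size())
      D = Replay[Path.size()];
    std::unique_ptr<Node> &Next = Cur->Child[D ? 1 : 0];
    if (!Next) {
      Next = std::make_unique<Node>();
      ++NumNodes;
    }
    Cur = Next.get();
    Path.push_back(D);
    return D;
  }

  const std::vector<bool> &path() const { return Path; }
  std::size_t size() const { return NumNodes; }

private:
  Node Root;
  Node *Cur = &Root;
  std::vector<bool> Replay, Path;
  std::size_t NumNodes = 1; // the root
};
```

Running a first iteration with an empty prefix and four calls, then a second iteration with the prefix false, false, true, adds exactly one new branch that starts at the third decision.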

At the end of each iteration, the fitness of the generated code is evaluated. In the last iteration, the compiler takes the path with the best fitness.
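Selecting the path for the last iteration could look roughly like this. The IterationResult type is hypothetical, and the sketch assumes that fitness is code size, so a lower value is better. The actual fitness function in the patch may be different.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct IterationResult {
  std::vector<bool> Path; // decisions taken, from root to leaf
  unsigned CodeSize;      // assumed fitness: lower is better
};

// Returns the decision path of the iteration with the best fitness.
// Results must not be empty.
const std::vector<bool> &bestPath(const std::vector<IterationResult> &Results) {
  assert(!Results.empty());
  auto Best = std::min_element(
      Results.begin(), Results.end(),
      [](const IterationResult &A, const IterationResult &B) {
        return A.CodeSize < B.CodeSize;
      });
  return Best->Path;
}
```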

To help select the node where the alternative decision will be taken, the getDecision function takes one parameter.
This parameter is interpreted as a priority, and the node with the highest priority is selected for the next alternative decision.
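As an illustration of priority-based selection, the sketch below picks the decision point with the highest priority among those where the alternative has not been tried yet. The DecisionPoint type, the AlternativeTried flag and the tie-breaking rule (earliest point wins) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DecisionPoint {
  int Priority;          // value passed to getDecision at this point
  bool AlternativeTried; // true once an iteration has branched here
};

// Returns the index of the highest-priority point that has not been
// branched yet, or -1 if every point has been tried.
int selectNextBranch(const std::vector<DecisionPoint> &Points) {
  int Best = -1;
  for (std::size_t I = 0; I < Points.size(); ++I) {
    if (Points[I].AlternativeTried)
      continue;
    if (Best < 0 || Points[I].Priority > Points[Best].Priority)
      Best = static_cast<int>(I);
  }
  return Best;
}
```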

Machine learning approach

The resulting decision tree can be used to train a binary classifier that supplements the existing heuristics. The nodes where the decision tree branched serve as training examples for the classifier. We could replace the existing heuristics with the trained classifier and potentially get better code even without the iterative approach.
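One possible way to collect training examples from the branching nodes is sketched below. Everything here is hypothetical: the TreeNode layout, the use of the priority as the only feature, and the labelling rule (the label is true when the alternative subtree reaches a better, that is lower, fitness than the default subtree). No classifier is trained in this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <memory>
#include <vector>

struct TreeNode {
  int Priority = 0; // assumed feature: value passed to getDecision
  // Fitness is set on leaves only; lower is assumed to be better.
  double Fitness = std::numeric_limits<double>::infinity();
  std::unique_ptr<TreeNode> Child[2]; // [0] default, [1] alternative
};

struct Example {
  int Priority; // feature
  bool Label;   // true if taking the alternative paid off
};

// Best (lowest) fitness found anywhere in the subtree rooted at N.
double bestFitness(const TreeNode *N) {
  if (!N)
    return std::numeric_limits<double>::infinity();
  double Best = N->Fitness;
  for (const auto &C : N->Child)
    Best = std::min(Best, bestFitness(C.get()));
  return Best;
}

// Emits one example for every node where both outcomes were explored.
void collectExamples(const TreeNode *N, std::vector<Example> &Out) {
  if (!N)
    return;
  if (N->Child[0] && N->Child[1]) {
    bool AltBetter =
        bestFitness(N->Child[1].get()) < bestFitness(N->Child[0].get());
    Out.push_back({N->Priority, AltBetter});
  }
  for (const auto &C : N->Child)
    collectExamples(C.get(), Out);
}
```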

This may be a good approach for a JIT compiler, but it requires adding the machine learning support, which is planned for the future.

N-ary decisions are not planned for now because every n-ary decision can be replaced with n-1 binary decisions. For example, to choose a number from zero to three we can ask three yes/no questions:

  (number is zero or greater than zero?)
    /                            \
  0             (number is one or greater than one?)
                  /                      \
                 1               (number is 2 or 3?)
                                    /          \
                                   2            3
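The chain of yes/no questions above can be written as a small loop. The chooseValue helper is hypothetical, and mapping a false answer to "stop at the current value" is an assumption, chosen so that the all-false first iteration yields zero.

```cpp
#include <cassert>
#include <functional>

// Picks a value in [0, N-1] by asking at most N-1 binary questions.
// A false answer stops at the current value; a true answer moves on.
int chooseValue(int N, const std::function<bool()> &getDecision) {
  for (int V = 0; V < N - 1; ++V)
    if (!getDecision())
      return V;
  return N - 1;
}
```

For N = 4 the answers true, true, false select 2, and three true answers select 3.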

If some decisions are made irrelevant by later decisions, this only makes the compilation process somewhat less efficient; they stay in the decision tree because nodes are never deleted. The nodes represent decision points in the history, and the path from the root of the tree to a leaf has enough information to replay a compilation iteration exactly.

This patch includes:

Code refactoring.

Rebase
This version of the patch is rebased onto current trunk.

Decision points in the pre-codegen and codegen phases
This version introduces support for decision points in both the
pre-codegen and codegen phases.

This feature led to a lot of changes in the code because an
ImmutablePass is destroyed between the pre-codegen and codegen
phases. The issue is resolved with temporary JSON files which pass
information between those two phases. Also, we
had to merge results from these phases.
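To illustrate the idea of handing decisions from one phase to the next through a JSON document, here is a minimal round-trip sketch. The format (a flat JSON array of booleans) and the toJson and fromJson helpers are assumptions made for this example; the file layout used by the patch is not described here and is likely richer. The sketch works on strings, and writing them to a temporary file is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Serializes a decision path as a JSON array such as [false,true].
std::string toJson(const std::vector<bool> &Decisions) {
  std::string Out = "[";
  for (std::size_t I = 0; I < Decisions.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += Decisions[I] ? "true" : "false";
  }
  return Out + "]";
}

// Reads back the array produced by toJson. This is a deliberately
// simple scanner for that one shape, not a general JSON parser.
std::vector<bool> fromJson(const std::string &Text) {
  std::vector<bool> Decisions;
  for (std::size_t Pos = 0; Pos < Text.size(); ++Pos) {
    if (Text.compare(Pos, 4, "true") == 0) {
      Decisions.push_back(true);
      Pos += 3; // skip the rest of the token
    } else if (Text.compare(Pos, 5, "false") == 0) {
      Decisions.push_back(false);
      Pos += 4;
    }
  }
  return Decisions;
}
```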

We find this behaviour of ImmutablePass, and the boundary between
the pre-codegen and codegen phases, a bit strange. There is
room for improvement that would simplify this patch and the
implementation of similar features in llvm.

New optimization point - RegAllocGreedy
Decision points in the codegen phase enabled us to implement a new
optimization point where iterative compilation can take an alternative
path. In this version of the patch, iterative compilation can take
alternative paths in register allocation, in addition to LICM and the
Inliner, which were already supported in the old version.

Added unit tests for DecissionTree, FileUtils and ModuleDecisionTreeProxies.

Bugfixes
During development we fixed some bugs. Some of them were introduced by
the new changes and some already existed in the old version.

Testing procedure and current status
We are targeting the MIPS architecture, and the current testing procedure
looks like this:

create CSiBE test configuration
preprocess files with gcc
run all tests with -Oz -target mipsel-unknown-linux -fiterative-comp=n -S
assemble output with MIPS gcc toolchain
measure results with size

Our results show improvements of up to 8.78% in code size, which is
promising. We are working on achieving such results in more cases and
on improving them further.

Our results also show that making alternative decisions in only one of
LICM, Inliner or RegAlloc (each taken separately) can lead to
code size improvements of up to 6%, 5% and 8%, respectively. These
results support our assumption that these optimization points are good
places to take an alternative compilation path.

Significant improvements are seen with as few as 10 iterations, but
100 iterations lead to even better results. As expected, the improvement
does not grow linearly with the iteration count.

RFC
We would like to see this feature (iterative compilation) merged into
the LLVM/Clang core, and we would like to hear what we should change
and improve to make this possible in the future.

We would also like to hear whether you find this approach
interesting, and we welcome any suggestions on where to steer research
and experiments. Are there other passes that have potential for taking
an alternative path?

Currently, there are a few issues that we are working on, among which the
most interesting are:

The register allocator has higher priority than the other optimization
points, so we will change the implementation so that all
optimization points get a chance to take the alternative direction.

Our decision whether to take an alternative path is based on
a fitness function which can be improved. We will be working on
that, too.

Please feel free to experiment with this patch and to contact us for
any help or information.


http://reviews.llvm.org/D12199




