[LLVMdev] GSoC 2011: Superoptimization for LLVM IR

Thu Apr 7 15:03:20 PDT 2011

Hello all, thanks for the feedback!

It sounds like you are planning to follow the approach of Joshi, Nelson and
> Randall ("Denali: A Goal-directed Superoptimizer") in that you don't intend
> to exhaustively enumerate all possible code sequences, and see if they are
> the same as the original only better; but instead start from the original
> code
> sequence and repeatedly apply transformations that change code to
> equivalent
> code, looking to see if you can eventually get something better than the
> original.  Is that right?
>
>
Exactly. If we can use axioms/transformations/identities which don't modify
the code behavior, but just tries to interpret instruction semantics in a
different way, so as to find another (and hopefully simpler) instructions to
do the same job, the role of the SAT prover is no longer necessary, since
the search space is constrained to expressions which are semantically
equivalent.

Example:
Transformation:  (ADD  Operand_A, Contant 0)  <- equivalent to-> Operand_A

Using this transformation, the search algorithm now can simplify "add A with
0" to "A" or, what may look initially not so useful, transform any operand A
into an add with A plus 0. Later, this is important to prove we can use an
add instruction to load any operand. Well, I think you got the idea of using
a set of axioms... this, in practice, is important to prove an arbitrary
sequence A is equivalent to SequenceB, and, if B is shorter than A, use B
instead.  Speaking of implementation, there will be a big list of
transformations and a "for" loop trying each transformation and recursively
applying more transformations, until we can show a code sequence is
equivalent to another. This is a short overview of the algorithm and how I
worked in this area. If you prefer another approach described formally in
another paper, I can study it and use another algorithm.

Note that this system is a simple theorem prover, since it applies several
equivalence rules to an expression to try to transform it to the answer. The
difference is that we don't have a "answer" to prove. Any faster sequence
will suffice as "answer". Regarding the performance "shift vs. multiply",
the instructions will have weights. Since we are working at the IR level, we
can't assume neither is good. That's why superoptimization is best applied
as peephole optimization of a backend prior to code emission - when we know
more about the architecture, we can take more appropriate decisions and,
with superoptimization, use obscure target machine instructions to compute
apparently unrelated expressions. If we use this at the IR level, some
suggested optimizations may not necessarily lead to better code after the
backend pass (lowering/transformation to selectionDAG). That's the downside
of using it at the IR level. On the other side, many other simple
substitutions may be learned by the algorithm and improve the
instructionsimplify.

Thanks,

-Rafael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110407/ce89f157/attachment.html>