[LLVMdev] GSoC 2011: Superoptimization for LLVM IR

Thu Apr 7 17:27:46 PDT 2011

IMO super optimizer would yield less benefits on LLVM compared to
other compilers.

If you check the patch of the instcombine pass, you'll find out people
keep dragging "correct" optimization out, not because the optimization
violates the semantic of LLVM IR, but it will generate wrong code
sequences when lowering to machine code.

An example:

%3 = fcmp %1, %2
%6 = fcmp %4, %5
%7 = and %3, %6
%8 = and %7, %foo

Sometimes you'll be screw if you want to play reassociate %7 and %8. I
don't see a easy way of catching them in theorem provers.

Haohui

On Thu, Apr 7, 2011 at 5:03 PM, Rafael Auler <rafaelauler at gmail.com> wrote:
> Hello all, thanks for the feedback!
>>
>> It sounds like you are planning to follow the approach of Joshi, Nelson
>> and
>> Randall ("Denali: A Goal-directed Superoptimizer") in that you don't
>> intend
>> to exhaustively enumerate all possible code sequences, and see if they are
>> the same as the original only better; but instead start from the original
>> code
>> sequence and repeatedly apply transformations that change code to
>> equivalent
>> code, looking to see if you can eventually get something better than the
>> original.  Is that right?
>>
>
> Exactly. If we can use axioms/transformations/identities which don't modify
> the code behavior, but just tries to interpret instruction semantics in a
> different way, so as to find another (and hopefully simpler) instructions to
> do the same job, the role of the SAT prover is no longer necessary, since
> the search space is constrained to expressions which are semantically
> equivalent.
> Example:
> Transformation:  (ADD  Operand_A, Contant 0)  <- equivalent to-> Operand_A
> Using this transformation, the search algorithm now can simplify "add A with
> 0" to "A" or, what may look initially not so useful, transform any operand A
> into an add with A plus 0. Later, this is important to prove we can use an
> add instruction to load any operand. Well, I think you got the idea of using
> a set of axioms... this, in practice, is important to prove an arbitrary
> sequence A is equivalent to SequenceB, and, if B is shorter than A, use B
> instead.  Speaking of implementation, there will be a big list of
> transformations and a "for" loop trying each transformation and recursively
> applying more transformations, until we can show a code sequence is
> equivalent to another. This is a short overview of the algorithm and how I
> worked in this area. If you prefer another approach described formally in
> another paper, I can study it and use another algorithm.
> Note that this system is a simple theorem prover, since it applies several
> equivalence rules to an expression to try to transform it to the answer. The
> difference is that we don't have a "answer" to prove. Any faster sequence
> will suffice as "answer". Regarding the performance "shift vs. multiply",
> the instructions will have weights. Since we are working at the IR level, we
> can't assume neither is good. That's why superoptimization is best applied
> as peephole optimization of a backend prior to code emission - when we know
> more about the architecture, we can take more appropriate decisions and,
> with superoptimization, use obscure target machine instructions to compute
> apparently unrelated expressions. If we use this at the IR level, some
> suggested optimizations may not necessarily lead to better code after the
> backend pass (lowering/transformation to selectionDAG). That's the downside
> of using it at the IR level. On the other side, many other simple
> substitutions may be learned by the algorithm and improve the
> instructionsimplify.
> Thanks,
> -Rafael
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>