[LLVMdev] GSoC 2011: Superoptimization for LLVM IR

Thu Apr 7 23:34:43 PDT 2011

Hi Haohui,

> IMO super optimizer would yield less benefits on LLVM compared to
> other compilers.
>
> If you check the patch of the instcombine pass, you'll find out people
> keep dragging "correct" optimization out, not because the optimization
> violates the semantic of LLVM IR, but it will generate wrong code
> sequences when lowering to machine code.

it sounds like you have had some bad experiences.  In my opinion it is
completely correct to disable instcombine optimizations that expose a
code generator bug, but only until the code generators are fixed.  It
should not be an excuse to never fix the code generators.  The usual
strategy is to open a bug report about the codegen problem so that it
is not forgotten.

> An example:
>
> %3 = fcmp %1, %2
> %6 = fcmp %4, %5
> %7 = and %3, %6
> %8 = and %7, %foo
>
> Sometimes you'll be screw if you want to play reassociate %7 and %8. I
> don't see a easy way of catching them in theorem provers.

I've already implemented a lot of simplifications found by the existing
super optimizer, and none of them were reverted, so I think your concern
is exaggerated.

Ciao, Duncan.

>
> Haohui
>
> On Thu, Apr 7, 2011 at 5:03 PM, Rafael Auler<rafaelauler at gmail.com>  wrote:
>> Hello all, thanks for the feedback!
>>>
>>> It sounds like you are planning to follow the approach of Joshi, Nelson
>>> and
>>> Randall ("Denali: A Goal-directed Superoptimizer") in that you don't
>>> intend
>>> to exhaustively enumerate all possible code sequences, and see if they are
>>> the same as the original only better; but instead start from the original
>>> code
>>> sequence and repeatedly apply transformations that change code to
>>> equivalent
>>> code, looking to see if you can eventually get something better than the
>>> original.  Is that right?
>>>
>>
>> Exactly. If we can use axioms/transformations/identities which don't modify
>> the code behavior, but just tries to interpret instruction semantics in a
>> different way, so as to find another (and hopefully simpler) instructions to
>> do the same job, the role of the SAT prover is no longer necessary, since
>> the search space is constrained to expressions which are semantically
>> equivalent.
>> Example:
>> Transformation:  (ADD  Operand_A, Contant 0)<- equivalent to->  Operand_A
>> Using this transformation, the search algorithm now can simplify "add A with
>> 0" to "A" or, what may look initially not so useful, transform any operand A
>> into an add with A plus 0. Later, this is important to prove we can use an
>> add instruction to load any operand. Well, I think you got the idea of using
>> a set of axioms... this, in practice, is important to prove an arbitrary
>> sequence A is equivalent to SequenceB, and, if B is shorter than A, use B
>> instead.  Speaking of implementation, there will be a big list of
>> transformations and a "for" loop trying each transformation and recursively
>> applying more transformations, until we can show a code sequence is
>> equivalent to another. This is a short overview of the algorithm and how I
>> worked in this area. If you prefer another approach described formally in
>> another paper, I can study it and use another algorithm.
>> Note that this system is a simple theorem prover, since it applies several
>> equivalence rules to an expression to try to transform it to the answer. The
>> difference is that we don't have a "answer" to prove. Any faster sequence
>> will suffice as "answer". Regarding the performance "shift vs. multiply",
>> the instructions will have weights. Since we are working at the IR level, we
>> can't assume neither is good. That's why superoptimization is best applied
>> as peephole optimization of a backend prior to code emission - when we know
>> more about the architecture, we can take more appropriate decisions and,
>> with superoptimization, use obscure target machine instructions to compute
>> apparently unrelated expressions. If we use this at the IR level, some
>> suggested optimizations may not necessarily lead to better code after the
>> backend pass (lowering/transformation to selectionDAG). That's the downside
>> of using it at the IR level. On the other side, many other simple
>> substitutions may be learned by the algorithm and improve the
>> instructionsimplify.
>> Thanks,
>> -Rafael
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev