<div class="gmail_quote">Hello all, thanks for the feedback!<div><br><div><div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It sounds like you are planning to follow the approach of Joshi, Nelson and<br>


Randall ("Denali: A Goal-directed Superoptimizer") in that you don't intend<br>

to exhaustively enumerate all possible code sequences, and see if they are<br>

the same as the original only better; but instead start from the original code<br>

sequence and repeatedly apply transformations that change code to equivalent<br>

code, looking to see if you can eventually get something better than the<br>

original.  Is that right?<br>

<br></blockquote><div><br></div></div><div>Exactly. If we can use axioms/transformations/identities which don't modify the code behavior, but just tries to interpret instruction semantics in a different way, so as to find another (and hopefully simpler) instructions to do the same job, the role of the SAT prover is no longer necessary, since the search space is constrained to expressions which are semantically equivalent.</div>


<div><br></div><div>Example:</div><div>Transformation:  (ADD  Operand_A, Contant 0)  <- equivalent to-> Operand_A</div><div><br></div><div>Using this transformation, the search algorithm now can simplify "add A with 0" to "A" or, what may look initially not so useful, transform any operand A into an add with A plus 0. Later, this is important to prove we can use an add instruction to load any operand. Well, I think you got the idea of using a set of axioms... this, in practice, is important to prove an arbitrary sequence A is equivalent to SequenceB, and, if B is shorter than A, use B instead.  Speaking of implementation, there will be a big list of transformations and a "for" loop trying each transformation and recursively applying more transformations, until we can show a code sequence is equivalent to another. This is a short overview of the algorithm and how I worked in this area. If you prefer another approach described formally in another paper, I can study it and use another algorithm.</div>


<div><br></div><div>Note that this system is a simple theorem prover, since it applies several equivalence rules to an expression to try to transform it to the answer. The difference is that we don't have a "answer" to prove. Any faster sequence will suffice as "answer". Regarding the performance "shift vs. multiply", the instructions will have weights. Since we are working at the IR level, we can't assume neither is good. That's why superoptimization is best applied as peephole optimization of a backend prior to code emission - when we know more about the architecture, we can take more appropriate decisions and, with superoptimization, use obscure target machine instructions to compute apparently unrelated expressions. If we use this at the IR level, some suggested optimizations may not necessarily lead to better code after the backend pass (lowering/transformation to selectionDAG). That's the downside of using it at the IR level. On the other side, many other simple substitutions may be learned by the algorithm and improve the instructionsimplify.</div>


<div><br></div><div>Thanks,</div><div><br></div><font color="#888888"><div>-Rafael</div></font></div></div></div>

</div><br>