[LLVMdev] Theoretical grounds of IR.

Mike Stump mrs at apple.com
Fri Aug 15 10:11:13 PDT 2008


On Aug 15, 2008, at 8:34 AM, Mahadevan R wrote:
> Given a specific target, and a set of rules that define what is  
> optimal code for that target,

One doesn't  need the notion of what is optimal in order to achieve  
this.

> can it be proven that the LLVM IR provides the necessary and  
> sufficient abstractions from which optimal code can be generated?

By the phrase optimal code `can' be generated, I'll assume you mean,  
you can add an oracle pass that `just knows' what optimal code is to  
llvm.

Sure, as long as the IR includes a .byte escape, the proof is  
trivial.  llvm includes such a construct, so trivially, it can be used  
to formulate optimal code.  I'd call this the degenerate case, and I  
realize you're probably not interested in this case, but, there it  
is.  Beyond that, in the general case, I'd expect the answer is no.   
If you want to limit the question to a specific machine, with certain  
limitations, one can engineer the answer to be yes.  From a theoretic  
perspective, I'd call that cheating.  But, if your willing to limit  
things, one just needs to show a 1 to 1 mapping of constructs in the  
IR to instructions and show that every instruction has such a mapping.

Now, you ask, what would such limitations looks like.  Turn the CS  
page back 20 years and check out self modifying code and laying out  
code to encode meaning into the addresses used by that code for  
examples of those things that most modern compilers take no advantage  
of.   llvm cannot in general encode those sorts of details in its IR  
and reason about them, and I suspect a proof showing the existence of  
such a machine requiring such things for optimal code would be trivial  
enough.

> If I "raise" optimized assembly language code into LLVM IR, is it
> possible (thoeretically, provably) to generate equal, or more, optimal
> assembly code from it?

Trivially, yes.  You generate .byte encodings for the entire thing,  
and upon output, you'd get the same answer back, so it would have  
equal or more optimal assembly code.  Proving you can get more optimal  
code out of it, in the general case, is trivially impossible, just  
consider the case when the optimal code is given as input.



More information about the llvm-dev mailing list