[LLVMdev] MSIL

Mon Dec 17 11:30:52 PST 2007

Hi Žiga,

On Dec 17, 2007, at 13:09, Žiga Osolin <ziga.osolin at gmail.com> wrote:

> Hi everyone!
>
> I am working on a .NET-based project (written in C#). While coding, we
> have noticed that the C# compiler does a poor job of optimizing code.
> The compiler performs only a few optimisations. A vital optimisation,
> inlining, is missed. The JIT-er has rules not to inline methods that
> take structs as parameters (this is really stupid!) and not to inline
> methods longer than 20 bytes (another bizarre limitation). There is no
> way to change these "settings".
>
> A performance test showed that simple operator overloads on structures
> work up to *5 times* slower than if they were coded manually (manual
> inlining). This is certainly not acceptable for me.
>
> I am a really big fan of LLVM. I guess I could use LLVM as a
> post-compile step (actually an install-time step). LLVM could perform
> optimisations on my code in the following manner:
> - translate code from MSIL
> - optimize code (inlining, probably other cool optimisations)
> - translate back to MSIL
>
> Is this possible with LLVM? If not, will it ever be possible (is
> someone working on that)?

You probably won't see something of practical use on this front in the  
near future, unfortunately. The reason is that the "VM" in LLVM is  
considerably more low-level than the JVM and .NET VM, so operations  
and metadata that are essential to being a good citizen in these  
environments are unrecoverably erased in the .NET->LLVM translation.  
Some important considerations:

• LLVM collapses isomorphic types. Example: Any two ValueTypes with  
two int fields would be unified.
• LLVM does not have a high-level concept of virtual method dispatch.
This would have to be remedied somehow (target-specific intrinsic?)
before any useful code could be compiled. For example:
Console.WriteLine("Hello world");
• All functions in LLVM are "free functions". In .NET, all functions  
must be methods of a class.
• LLVM does not have a concept of named field access. Any types used  
with LLVM code would require explicit layout.
• LLVM uses pointer arithmetic extensively. All LLVM code would need  
to be marked as "unsafe".
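
To make the points above a bit more concrete, here is a rough C++ sketch
(plain C++, not LLVM IR, and not taken from any real front end) of what is
left of a program once names, methods and virtual calls have been lowered
away. Every name in it (Point, Size, Shape, ShapeVTable, load_second_field,
call_area and so on) is made up for illustration, and the vtable layout is
only one plausible lowering, not something LLVM or .NET prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Two nominally distinct value types with the same shape. A purely
// structural type system keeps only "two ints" for both of them.
struct Point { int x; int y; };
struct Size  { int width; int height; };

static_assert(!std::is_same<Point, Size>::value, "distinct by name");
static_assert(sizeof(Point) == sizeof(Size), "same size");
static_assert(offsetof(Point, y) == offsetof(Size, height),
              "same field offsets");

// Named field access turns into address arithmetic: "p.y" becomes
// "load an int at base + offset". The field name does not survive,
// and this is exactly the kind of code .NET would call unsafe.
inline int load_second_field(const void* base) {
    const char* bytes = static_cast<const char*>(base);
    return *reinterpret_cast<const int*>(bytes + offsetof(Point, y));
}

// Virtual dispatch written out by hand (hypothetical layout): each
// object carries a pointer to a table of function pointers.
struct Shape;
struct ShapeVTable { int (*area)(const Shape* self); };
struct Shape { const ShapeVTable* vtable; int w; int h; };

// "Methods" are free functions that take the receiver explicitly.
inline int rect_area(const Shape* self) { return self->w * self->h; }
inline int tri_area(const Shape* self)  { return self->w * self->h / 2; }

const ShapeVTable kRectVTable = { rect_area };
const ShapeVTable kTriVTable  = { tri_area };

// What a call like "shape.Area()" boils down to: load the table
// pointer, load the slot, make an indirect call.
inline int call_area(const Shape* s) { return s->vtable->area(s); }
```

Calling load_second_field on a Point or on a Size works equally well, and
call_area picks rect_area or tri_area depending on which table the object
points at. Nothing in the lowered form says which class a function
belonged to or what a field was called, and that is the information a
translation back to MSIL would need.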

It would take some serious work in LLVM to adapt to your use case, and
the benefits are debatable--the user with a serious interest in
performance would likely be better served by migrating his code to C++.

That said, anything is possible and patches are welcome!

-- Gordon