[LLVMdev] whole program optimization examples?

Sat Oct 11 17:15:30 PDT 2014

On 10/10/2014 06:24 PM, Hayden Livingston wrote:
> Hello,
>
> I was wondering if there is an example list somewhere of whole program 
> optimizations done by LLVM based compilers?
>
> I'm only familiar with method-level optimizations, and I'm being told 
> wpo can deliver many great speedups.
>
> My language is currently statically typed, JIT based, and uses the JVM, 
> and I want to move it over to LLVM so that I can have options where it 
> can be ahead of time compiled as well.
Depending on your use case (and frankly, your budget), you might want to 
consider Azul Zing's ReadyNow features: 
http://www.azulsystems.com/solutions/zing/readynow

This isn't true ahead of time compilation, but it would be a way to get 
most of the benefits of classic ahead of time compilation running on a 
standards compliant JVM.

(Keep in mind, I work for Azul.  I may be slightly biased here.)
>
> I'm hearing bad things about LLVM's JIT capabilities -- specifically 
> that writing your own GC is going to be a pain.
Out of curiosity, where did you hear this?

We are actively working on improving the state of the world here. I'd 
suggest you take a look at the infrastructure patches currently up for 
review here: http://reviews.llvm.org/D5683

These will hopefully land within a week or two.  At that point, the "gc 
infrastructure" part should be functional.  You'd have to pick a GC 
(LLVM does not provide one), but your frontend could emit barriers and 
statepoints (GC-parseable call sites) and everything should work.  (Well, 
modulo bugs!  Which I want to know about so we can fix them.)
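
To make "emit barriers" a bit more concrete, here is a minimal sketch in 
C of what a store with a write barrier might look like after lowering. 
Everything in it is an assumption for illustration: the card-marking 
scheme, the toy heap, and the names store_with_barrier and card_is_dirty 
are hypothetical, and the barrier your collector actually needs depends 
entirely on the GC you pick.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy heap of pointer-sized slots; sizes are arbitrary for this sketch. */
#define HEAP_WORDS 1024
#define CARD_SHIFT 6 /* one card covers 64 heap slots */

static void *heap[HEAP_WORDS];
static uint8_t card_table[HEAP_WORDS >> CARD_SHIFT];

/* Hypothetical write barrier: perform the store, then mark the card that
   covers the slot as dirty so a collector could rescan it later. */
void store_with_barrier(size_t index, void *value) {
    heap[index] = value;
    card_table[index >> CARD_SHIFT] = 1;
}

/* Reports whether the card covering the given slot has been marked. */
int card_is_dirty(size_t index) {
    return card_table[index >> CARD_SHIFT] != 0;
}
```

A frontend would emit the equivalent of store_with_barrier for every 
reference store into the heap; the collector side, which consumes and 
clears the card table, is not shown.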

There are a couple of options out there for pluggable GC libraries. The 
best known is Boehm's conservative GC, but there are others.

Once that's in, we're planning on landing all of the late safepoint 
insertion logic we've been working on.  This will enable full 
optimization of code for garbage collected languages - provided you meet 
a few requirements on the input IR.  You can read about it here:
http://www.philipreames.com/Blog/tag/late-safepoint-placement/

And find the (slightly out of date) code here:
https://github.com/AzulSystems/llvm-late-safepoint-placement
>
> Anyways, sort of diverged there, but still looking for WPO examples!
I'm curious to hear others' takes here as well.  A few things that jump 
out at me: cross function escape analysis, alias analysis (in support of 
things like LICM), and cross function constant propagation.  Not all of 
these work out of the box, but with work (sometimes on your side, 
sometimes an LLVM patch), interesting results can be had.
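
As a rough illustration of two of these, here is a small C sketch.  The 
function names are made up for the example, and whether a given 
optimizer actually performs either transformation depends on the 
pipeline and on how much of the program it can see.  Both functions sit 
in one file for brevity; in a real program the helper would typically 
live in another module, which is where whole program visibility starts 
to matter.

```c
#include <stddef.h>

/* Hypothetical helper: 'scale' looks like a runtime value, but the only
   caller in this program passes the constant 2. */
static int scale_value(int x, int scale) {
    if (scale == 0)
        return 0; /* unreachable once scale is known to be 2 */
    return x * scale;
}

/* Cross function constant propagation: once the optimizer sees every
   caller, it may fold scale == 2 into scale_value, drop the dead branch,
   and inline what is left into this loop. */
int sum_scaled(const int *data, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += scale_value(data[i], 2);
    return total;
}

/* Alias analysis in support of LICM: the load of *limit can be hoisted
   out of the loop only if the optimizer can show that no store to out[i]
   modifies it. */
void clamp_all(int *out, size_t n, const int *limit) {
    for (size_t i = 0; i < n; ++i)
        if (out[i] > *limit)
            out[i] = *limit;
}
```

Calling sum_scaled on {1, 2, 3, 4} returns 20 whether or not the 
optimization fires; the difference is only in the generated code.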

Fair warning: while getting an LLVM-based JIT up and running at peak 
performance is a worthwhile endeavor (IMHO), it's also a fair amount of 
work.  Getting something functional is relatively straightforward, but 
there's a lot of non-trivial tuning of your generated IR to really 
exploit the power of the optimizers well.  We're talking person-years 
of work here.  Most of this is in the performance tuning phase, and 
depending on your point of comparison, it may be an easier or harder 
problem.  Essentially, the closer to C performance your current runtime 
is, the harder you'll have to work.  Getting 1/10 of C performance with 
an untuned LLVM-based JIT is pretty easy; the closer you get to C (or 
JVM) performance the harder it gets.

(Disclaimer: This is me speaking off the top of my head.  Take 
everything I just said with a grain of salt.)

Philip