[LLVMdev] HLVM performance and shadow stack overheads

Mon Mar 30 22:35:36 PDT 2009

The (new) HLVM project is continuing to improve and I have graphed and 
analysed some performance-related data. Beating OCaml on numerical 
performance using LLVM turned out to be quite easy on x86:

http://flyingfrogblog.blogspot.com/2009/03/performance-ocaml-vs-hlvm-beta-04.html

This was achieved using a single optimization pass in HLVM (unrolling) and
none of LLVM's own IR optimization passes. So the performance is essentially
due to LLVM's excellent x86 code generation and sane IR generation by HLVM
itself.

Also, many people have criticized LLVM's support for garbage collectors and 
were quick to dismiss the simple shadow stack approach that I have used with 
HLVM. So I thought it would be interesting to quantify the overheads 
involved:

http://flyingfrogblog.blogspot.com/2009/03/current-shadow-stack-overheads-in-hlvm.html

These results show that even a completely naive shadow stack and GC 
implementation like the one currently in HLVM has quite reasonable 
performance. In particular, suitable tweaking allows HLVM to come well within 
2x the performance of OCaml on the list-based 10-queens benchmark. This is 
really remarkable given that OCaml is one of the most highly optimized 
single-threaded run-times in existence.
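
For readers unfamiliar with the technique, here is a minimal sketch of a
shadow stack in C. It is an illustration under stated assumptions, not HLVM's
actual code: the Obj layout, the fixed capacity, and the names shadow_push,
shadow_restore, mark_from_roots and demo are all hypothetical. The idea is
that the mutator records every live reference in an explicit side array, so
the collector can find its roots by walking that array instead of decoding
machine stack frames.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap object: a mark bit plus one outgoing reference. */
typedef struct Obj {
    int marked;
    struct Obj *next;
} Obj;

/* Assumed fixed capacity; a real runtime would grow this on demand. */
#define SHADOW_STACK_MAX 1024

/* The shadow stack: an explicit array of roots kept beside the machine stack. */
static Obj *shadow_stack[SHADOW_STACK_MAX];
static size_t shadow_top = 0;

/* Record a live reference. This store is the per-reference overhead. */
void shadow_push(Obj *o) {
    assert(shadow_top < SHADOW_STACK_MAX);
    shadow_stack[shadow_top++] = o;
}

/* On function exit, drop every root pushed since the depth was saved. */
void shadow_restore(size_t saved_top) {
    assert(saved_top <= shadow_top);
    shadow_top = saved_top;
}

/* Mark phase: walk the shadow stack and mark everything reachable from it.
   Returns the number of objects newly marked. */
size_t mark_from_roots(void) {
    size_t n = 0;
    for (size_t i = 0; i < shadow_top; ++i) {
        for (Obj *o = shadow_stack[i]; o != NULL && !o->marked; o = o->next) {
            o->marked = 1;
            ++n;
        }
    }
    return n;
}

/* Example mutator: one root (a) that reaches b and c, plus one object
   that is never pushed and so is never marked. */
size_t demo(void) {
    Obj c = {0, NULL};
    Obj b = {0, &c};
    Obj a = {0, &b};
    Obj unreachable = {0, NULL};

    size_t saved = shadow_top;   /* function entry: remember the depth */
    shadow_push(&a);             /* a is live across the collection    */
    size_t marked = mark_from_roots();
    assert(unreachable.marked == 0);
    shadow_restore(saved);       /* function exit: pop our roots       */
    return marked;
}
```

In this sketch the overhead measured above corresponds to the extra stores in
shadow_push and the save and restore of shadow_top around each function body,
which a collector with precise stack maps would avoid.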

In the future, I intend to focus on optimizations that relieve GC stress 
rather than on optimizing the GC itself. I also intend to add support for 
parallelism which, although simple in design, should make multicores far more 
useful for ML programmers.

Many thanks,
-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e