[LLVMdev] Garbage collection

Fri Feb 27 11:08:50 PST 2009

On Feb 26, 2009, at 12:25, Chris Lattner wrote:

> On Feb 26, 2009, at 12:02 AM, Talin wrote:
>
>> With the increasing number of LLVM-based VMs and other projects, I  
>> suspect that the desire for more comprehensive garbage collection  
>> support in LLVM is only going to increase.
>
> What you see in LLVM right now is really only the second step of the  
> planned GC evolution.  The first step was very minimal, but useful  
> for bridging to other existing collectors.  The second step was  
> Gordon's (significant!) extensions to the system which allowed him  
> to tie in the OCaml collector and bring some more sanity to codegen.

I agree; this would be a great contribution, and it would make LLVM
much more accessible to developers of both new and existing languages.

> While people object to adding high level features to LLVM, high  
> level and language-specific features are *great* in llvm as long as  
> they are cleanly separable.  I would *love* to see a composable  
> collection of different GC subroutines with clean interfaces built  
> on LLVM "assembly language" GC stuff.

Absolutely.

It is definitely valuable that the existing infrastructure doesn't
bolt LLVM to a particular runtime. With only a few days of work, PyPy
was able to try out the LLVM GC intrinsics and static stack maps, and
saw a big performance boost on their LLVM back-end. (Their GCC
back-end still outperformed LLVM, but by a much smaller margin.) But
this in no way prevents us from providing GC building blocks for
projects that are not tied to an existing runtime and GC.

> As far as I know, there is nothing that prevents this from happening  
> today, we just need leadership in the area to drive it.  To avoid  
> the "ivory tower" problem, I'd strongly recommend starting with a  
> simple GC and language and get the whole thing working top to  
> bottom.  From there, the various pieces can be generalized out etc.   
> This ensures that there is always a *problem being solved* and  
> something that works and is testable.

I strongly agree with this as well.

> ps. Code generation for the GC intrinsics can be improved  
> significantly.  We can add new intrinsics that don't pin things to  
> the stack, update optimizations, and do many other things if people  
> started using the GC stuff seriously.

I've already commented on this elsewhere in the thread. Promoting GC
roots from stack slots into SSA variables would allow much more
freedom for the middle- and back-end optimizations, and I think it is
clearly the next logical step.

— Gordon