[LLVMdev] Garbage collection

Jon Harrop jon at ffconsultancy.com
Fri Feb 27 16:58:19 PST 2009


On Friday 27 February 2009 18:42:13 Gordon Henriksen wrote:
> I agree this could be better. I think it would be prudent of you,
> being aware of this problem, to structure your compiler so as to limit
> the number of pieces of code which would be effected when you switch
> to a copying collector.

I think that would make my VM a lot more complicated for no clear practical 
gain.

> I order to address it thoroughly, the LLVM GC infrastructure needs to
> track register roots so that it doesn't need to conservatively reload
> from the stack so frequently.

F# suffers because null references are typeless on .NET but you want to use 
typed null references to represent the empty list and the None option type 
constructor and many other things. The F# team cannot fix their VM so they 
were forced to settle on an inefficient implementation of lists in a language 
where lists are ubiquitous and an implementation of the option type that is 
efficient but does not pretty print correctly due to insufficient type 
information.

I can and did fix my VM by representing reference types as an aggregate 
containing a pointer to shared data (the type) and a pointer to individual 
data (the value). Thus you can have a typed null pointer by having a pointer 
to a valid type and a null pointer for the individual data.

However, that means my roots are now aggregate values. So your proposed GC 
infrastructure for LLVM would need to be able to mark aggregate values as 
roots as well as individual values. Moreover, even if we went through all 
this work of improving LLVM to support this GC infrastructure and altering my 
VM to traverse the system stack instead of my current shadow stack, I am not 
even convinced it would be a significant improvement over what I already 
have.

> This would likely entail a change to the 
> IR (either a 'GC' address space as Chris suggests, a new intrinsic to
> 'tag' a value as a GC pointer, or even a change to the Type
> hierarchy)

Can you elaborate on a changing of the type hierarchy?

> --another reason to isolate GC-handling code in your compiler. 

I just cannot see how the GC-handling code can be isolated in such a way that 
the result is better than just having separate HLVMs for separate design 
decisions.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e



More information about the llvm-dev mailing list