[LLVMdev] Garbage collection
Talin
viridia at gmail.com
Thu Feb 26 00:02:00 PST 2009
One of the more interesting subjects of conversation at the 2008
developer day was related to garbage collection. With the increasing
number of LLVM-based VMs and other projects, I suspect that the desire
for more comprehensive garbage collection support in LLVM is only going
to increase. (I am now involved in two different open-source projects
both of which will eventually have a strong requirement for a
high-performance LLVM-compatible collector.)
The last time I looked, the LLVM support for GC is fairly austere,
consisting of (a) the bare minimum set of low-level intrinsics needed to
integrate a collector with a runtime environment, and (b) a couple of
example of collectors, neither of which AFAIK take full advantage of the
low-level intrinsics to get optimal performance.
The IR-level intrinsics themselves don't much help you *write* a GC, so
much as to integrate one with LLVM. What is provided is essentially a
mechanism for walking the stack, and a means to insert read/write
barriers into the generated code, which together form a tiny fraction of
what it would take to design a working collector.
Part of the reason why there isn't more direct support for GC is the
theory that there is no such thing as a one-size-fits-all collector. The
argument goes that a really efficient collector design requires detailed
knowledge of the object model of the language being compiled. Some
languages use explicit type information embedded in the object, which
has a specific offset and a specific format that is likely to be
different for different languages. Other languages have the benefit of
compile-time knowledge of the type of an entire object graph, and thus
do not require the overhead of a per-object type field.
On the other hand, it is possible to make a counter-argument to this
theory that goes like this: The Java VM has been used to implement a
large number of front-end languages efficiently, without requiring a
special garbage collector for each language. Furthermore, I've observed
that the Java VM, more than any other runtime environment, tends to be
used as a platform for academic research into GC algorithms, without
requiring changes to the set of languages which are hosted by the VM. In
other words, both the language and the GC algorithm can be varied
independently, from which we can conclude that perhaps the choice of
language and GC aren't as tightly coupled as we might think.
It might also be argued that many of the languages that people are
wanting to build with LLVM have Java-like object semantics anyway, at
least at the level of GC. (Although finalization is an issue, since this
is an area in which language semantics vary widely.)
It also seems to me that even radically different collector designs
could utilize some common building blocks for heap management, work
queuing, and so on. It might make sense to create a library of such
components as an auxilliary project to LLVM. For example, while there
are a large number of collector algorithms, there are a fairly small
number of designs for write barriers - so the library could provide
several implementations of such. Similarly, there are only so many ways
to "stop the world" (for concurrent collectors that require this), and
this could be provided as well.
Of course, there is always a danger when creating libraries of the
"ivory tower syndrome", putting a lot of effort into components that
don't actually get used. This is why it would be even better to create a
standard, high performance collector for LLVM that actually uses these
methods.
As I see it, such a collector would include two tracing strategies, one
based on per-object type information, and one based on static type
knowledge. Having both types of tracing in a single collector would not,
I think, cause a serious performance degradation, especially as object
models such as Java require both types anyway.
Specific object model details such as the offset of the type info within
an object can be dealt with by implementing the tracing functions in
LLVM IR, so that all of the various inefficiencies caused by
parameterizing the object layout get optimized away. It should be
relatively easy for a frontend to generate the IR for a tracing function
for each concrete class, while the collector library supplies the code
that is called by the tracer for each pointer location in memory.
Tracing functions aren't that complicated anyway, it's the interaction
between the tracer and the mutator is where things get complex, and that
is largely independent of things like object memory layouts.
In other words, I think that it should be possible to create a fairly
comprehensive and efficient collector implementation in which 80% of the
implementation lives in a library, and the language front end generates
the remaining 20%. Moreover, it should be possible to design this
library in a way that makes it possible for people to replace
significant parts of the collector, so that it can be used as a basis
for developing a diverse selection of collection strategies. And because
all of the parts that are specific to the object model are generated by
the front-end, the collector should work for a large number of languages
with different object semantics.
The hardest part will be issues involving concurrency and threading. It
has been my observation that support for concurrency is the one thing
that you can't build up incrementally - that is, with most aspects of a
GC you can start with a simple implementation and gradually add
improvements (multiple generations and so on), but once you add
concurrency you pretty much have to start over. Unfortunately, I am not
a huge concurrent programming whiz, which is part of the reason why I
haven't simply gone and implemented the collector myself. However, this
does suggest to me that if we want to support concurrency (which I
believe we will) then it has to be supported from the very beginning.
As I might have mentioned before on this list, I have written a few
modest components that I plan to use as the basis for a collector which
I plan to name 'Scarcity'. One of these is a heap manager which replaces
malloc/free and which has fairly efficient functions for things like
"free all blocks for which this callback function returns true" which
can be used to implement mark and sweep.
-- Talin
More information about the llvm-dev
mailing list