[LLVMdev] GC heap implementation
Talin
viridia at gmail.com
Mon Feb 11 22:51:47 PST 2008
I'd like to share some code with anyone who is interested.
A while back I started working on the problem of implementing a garbage
collector using the LLVM primitives. While I managed to accomplish a
fair bit in this direction, I eventually realized that it was
distracting me from my main goal, which is implementing a compiler for a
strongly-typed language of my own design. In other words, I was writing
a GC implementation because I needed one, not necessarily because I
wanted to write one. I decided to push the GC project to the back burner
and focus on my main goal, on the theory that while that was happening
the GC problem might be worked on by someone else.
Although my GC code was initially intended to support my runtime, I've
lately been using it in the compiler. I realized that in my case, I
couldn't keep on using llvm::BumpPtrAllocator because I couldn't make
guarantees about the lifetime of AST nodes in my compiler, and if I used
a malloc-based allocator I would have to do something like reference
counting.
Instead, I took one component of my GC - the general heap implementation
- and built a trivial mark & sweep collector on top of it, which seems
to be working very well for my purpose.
This GC heap was to be the foundation for my collector. It's loosely
inspired by the popular dlmalloc implementation (although it doesn't
directly take any code from it), but it also supports a number of
features useful for implementing a collector, such as the ability to
efficiently walk the heap and free blocks based on a predicate callback.
It also reserves space in the 8-byte allocation header for an opaque
type tag pointer (or debug string if you don't need tags) as well as
room for 2 mark bits. It supports aligned allocations, which is useful
for doing work with SIMD instructions. It supports compile-time
configurable guard bytes, small block bins, and other things you would
expect in a modern allocator. The heap is thread-safe, and uses only
pthread primitives for locking. It allows the heap to be non-contiguous,
so if you have other libraries that are calling regular libc malloc you
can still coexist with them. The code is written in fairly-well
commented ANSI C and includes a number of unit tests.
In any case, this code is currently lying fallow, so if anyone is
interested in working with it I will be glad to donate it.
-- Talin
More information about the llvm-dev
mailing list