[LLVMdev] Interfacing llvm with a precise, relocating GC
Philip Reames
listmail at philipreames.com
Mon Oct 28 11:03:35 PDT 2013
On 10/26/13 7:40 AM, Filip Pizlo wrote:
> You can implement a copying GC (what the kids these days call
> relocating) without accurate roots.
I use "relocating" to brush over the distinction between "copying" and
"compacting" collectors. For the purposes of our discussions, the two
are interchangeable though.
> Why aren't you just using the well-known Bartlett style of GC, which
> allows for relocation even in the presence of conservative roots (or
> accurate roots that don't allow relocation)?
You've brought this up a few times, so I want to go ahead and address
your point in detail.
Background for others following this discussion: A Bartlett style
garbage collector conservatively scans the stack and promotes* any
object reachable from a possible heap pointer in place. Objects
directly reachable from the stack are not relocated. Objects reachable
from these objects are treated normally by the relocating collector and
may be moved. To support in place promotion, a small amount of extra
heap metadata is generally required.
* "Promote" is used to indicate the object is conceptually moved into
the "to space". This is not the same promotion used when discussing
generational collectors.
I agree that Bartlett collectors have significant advantages with
regards to compatibility with legacy code and compilers. A Bartlett
collector does not require derived pointer tracking inside the
compiler. A Bartlett collector simplifies interfacing with code in
other languages (i.e. C/C++) since it has far fewer assumptions about
stack layout.
Unfortunately, Bartlett collectors also have a number of serious
disadvantages. To name a few:
- False Retention - Due to conservative stack scanning, objects which
are not live may be falsely retained. (i.e. false roots or dead roots)
- External Fragmentation - Promoted-in-place objects can be anywhere in
the heap. This introduces fragmentation and complicates both allocation
and collection.
- Internal Fragmentation - Promoted-in-place objects cause the retention
of the entire "page" containing them. Unless using a free-list style
allocator, all of this space is inaccessible.
- Garbage Drag - Since falsely retained data can point to other heap
objects, an unknown fraction of the heap may be falsely kept live. This
is particularly problematic for programs with extreme heap usage.
Consider a program which maintains a large B-tree in memory with
updates. Or a program manipulating large graphs. In either case, a
single stray pointer can force the retention of large portions of the
heap graph. (In practice, this does happen.)
I do acknowledge that each of these is individually surmountable.
However, doing so requires significant work beyond the basic design
described in Bartlett's initial work. Given the complexity of
engineering a production grade garbage collector, this additional
complexity is highly undesirable.
Taking a step back from the technical merits, I have one other
overriding reason for not pursuing a Bartlett style approach: we already
have a fully precise relocating collector. We are not interested in
abandoning this. From our side, the only question is whether LLVM is
suitable for use with our existing collector.
I'm happy to keep debating the merits of various approaches with you
(this is fun!), but from a practical perspective, using an alternate
approach to garbage collection is simply a non-starter for us.
Yours,
Philip
More information about the llvm-dev
mailing list