[LLVMdev] Interfacing llvm with a precise, relocating GC

Mon Oct 28 14:52:35 PDT 2013

On Oct 28, 2013, at 11:03 AM, Philip Reames <listmail at philipreames.com> wrote:

> On 10/26/13 7:40 AM, Filip Pizlo wrote:
>> You can implement a copying GC (what the kids these days call relocating) without accurate roots.
> I use "relocating" to brush over the distinction between "copying" and "compacting" collectors.  For the purposes of our discussions, the two are interchangeable though.
>> Why aren't you just using the well-known Bartlett style of GC, which allows for relocation even in the presence of conservative roots (or accurate roots that don't allow relocation)?
> You've brought this up a few times, so I want to go ahead and address your point in detail.
> 
> Background for others following this discussion: A Bartlett style garbage collector conservatively scans the stack and promotes* any object reachable from a possible heap pointer in place.  Objects directly reachable from the stack are not relocated.  Objects reachable from these objects are treated normally by the relocating collector and may be moved.   To support in place promotion, a small amount of extra heap metadata is generally required.
> 
> * "Promote" is used to indicate the object is conceptually moved into the "to space".  This is not the same promotion used when discussing generational collectors.
> 
> I agree that Bartlett collectors have significant advantages with regards to compatibility with legacy code and compilers.  A Bartlett collector does not require derived pointer tracking inside the compiler.  A Bartlett collector simplifies interfacing with code in other languages (i.e. C/C++) since it has far fewer assumptions about stack layout.
> 
> Unfortunately, Bartlett collectors also have a number of serious disadvantages.  To name a few:
> - False Retention - Due to conservative stack scanning, objects which are not live may be falsely retained.  (i.e. false roots or dead roots)

I don't know of evidence that this is a problem in practice.  Bartlett collectors have been used (and are being used) in shipping products.  The good ole' IBM production VM used it, and we use it in WebKit.  Some other - lesser known, less widely used - projects have also used it, like the Modula 3 runtime.

> - External Fragmentation - Promoted-in-place objects can be anywhere in the heap.  This introduces fragmentation and complicates both allocation and collection.

I only know of one example of this being a problem: if you're trying to build a 32-bit VM that is intended to be used with 4GB heaps (i.e. you are maxing out your virtual address space) and the mutator intends to use most of the available memory for a single huge array.

In a 64-bit (err, 48-bit) address space that is common these days, this ain't gonna happen.

> - Internal Fragmentation - Promoted-in-place objects cause the retention of the entire "page" containing them.  Unless using a free-list style allocator, all of this space is inaccessible.

I don't know of evidence that this is a problem in practice.  The heap is much huger than the stack(s) - often 3 orders of magnitude.  Bartlett collectors usually run with small-ish "pages" - 4KB or not much bigger.  In order to waste a significant amount of heap space due to internal fragmentation, you'd have to find yourself in a situation where each slot on your stacks points to a distinct heap page.  The probability of such a thing happening is probably on the same order of magnitude as the probability of a cosmic ray frying your RAM.  Anyway, the empirical evidence appears to suggest that false retention is negligible.

And yes, I have often considered implementing free-listing for those pages.  But right now I view this as a solution looking for a problem.

> - Garbage Drag - Since falsely retained data can point to other heap objects, an unknown fraction of the heap may be falsely kept live. This is particularly problematic for programs with extreme heap usage.  Consider a program which maintains a large B-tree in memory with updates.  Or a program manipulating large graphs.  In either case, a single stray pointer can force the retention of large portions of the heap graph.  (In practice, this does happen.)

I don't know of evidence that this ever happens.

> 
> I do acknowledge that each of these is individually surmountable. However, doing so requires significant work beyond the basic design described in Bartlett's initial work.  Given the complexity of engineering a production grade garbage collector, this additional complexity is highly undesirable.

Are you suggesting that others who have shipped Bartlett-based collectors were in fact failing to ship a production-grade collector, or that they actually shipped something more sophisticated than a Bartlett collector?

WebKit doesn't attempt to get around the disadvantages that you listed, because these disadvantages are purely theoretical.  We track memory use bugs all the time and we haven't found any such bug that can be traced to conservative roots.

I don't know to what extent IBM's production VM used Bartlett "to the letter", or if they made some changes to it.

You've brought up some of theoretical disadvantages of Bartlett collectors and you refer to them as being "serious".  I am curious what empirical evidence you are using to support the claim that these disadvantages are anything but philosophical?  Remember, GCs are not a theoretician's game.  This is an empirical science and production collectors tend to be based on techniques that arose through experiments.

> 
> Taking a step back from the technical merits, I have one other overriding reason for not pursuing a Bartlett style approach: we already have a fully precise relocating collector.  We are not interested in abandoning this.  From our side, the only question is whether LLVM is suitable for use with our existing collector.

I can appreciate this and I do not mean to imply that accurate GC shouldn't be considered at all.  It's an interesting compiler feature.

But I want to make sure that such a discussion is grounded in reality: adding accurate GC to a compiler is hard, and you don't need it to do copying.  You can build a production-quality garbage collector without it.

> 
> I'm happy to keep debating the merits of various approaches with you (this is fun!), but from a practical perspective, using an alternate approach to garbage collection is simply a non-starter for us.
> 
> Yours,
> Philip