[LLVMdev] Improving Garbage Collection

Thu Jul 7 14:22:19 PDT 2011

If we want to enumerate the refs of a call stack, we call a runtime function, prtYoungestActivation, and that gives us back a stack iterator for the sequence of activations.  The iterator contains the value for esp and pointers to the memory locations where ebp, eip, edi, esi, and ebx were last saved to the stack.  (For architectures other than 32-bit x86, the register-specific contents of the activation would of course be defined differently.)  In the eip case, this is saved for every activation as part of the call sequence and so provides you with the ability to determine which call site you are at.  The other registers may have been saved by the current frame or a subsequent frame.  The stack iterator also contains a virtual frame number that basically allows an unwinder to see logical activations instead of physical activations that may combine multiple logical activations via inlining.

That all makes sense.

I would say that your framework is at a somewhat higher level than I would expect LLVM to support - by that, I mean that ideally it should be possible to use the primitives that LLVM supplies to build exactly what you have described. My own use case has slightly different requirements, and as a result I could not adopt your framework as it is - but I see no reason why I should not be able to use the same LLVM primitives to build what I need.

The main difference has to do with the handling of disjoint types in my language. For example, if I declare a variable using the following syntax:
[TAA]
I'm not so sure we couldn't support your case.  If a disjoint type might contain a root then you could define the possible pointer in the struct as a generalized ref having a user-defined type and extraneous data.  That user-defined generalized ref type for you would indicate a disjoint union and the extraneous data could indicate the address of the discriminator in the union.  Whether the data was enregistered or not wouldn't matter in that case.  The GC root callback is passed the user-defined type and extraneous data you declare the field with.

Interesting. Let me think about that. It would indeed be nice if that worked. How would you handle a case where the union member was (String *)[3] instead of just String *?:

[TAA] So it is either an array of ints or an array of strings?  A generalized ref is just another type so you can declare of arrays of them.  Your discriminator could even be a bitmask that indicates which members of the array were ints and which were strings assuming you wanted to mix-and-match rather than all ints or all strings.  Either way it works.
You can then call another runtime function, prtEnumerateRootsOfActivation, to ask for the roots associated with the given stack iterator's current activation to be enumerated.  To this function, you pass a PrtRseInfo struct that contains a callback that gets invoked once per live ref found and a opaque environment variable passed (unexamined) onto the callback function.  To go to the next activation on the call stack, you then call the runtime's prtNextActivation.  There are routines to determine when you've got to the bottom of the call stack as well.  In addition, there are routines for performing other (non-GC) operations the stack iterator's activation, such as looking up data associated with the SPAN construct (taken from C--) or exception handlers.

In my own case, I don't have an iterator for roots within a single activation, rather I have a static data structure that describes how to find all of the roots within the call frame. One of my design goals is to minimize the number of function calls required to trace a pointer, so the different parts of my code accept whole stack maps as arguments instead of single pointers.

However, the LLVM approach to generating stack maps is flexible enough to handle either case - your GCStrategy plugin gets a list of all stack roots for each safe point, and you can transform that into whatever data structures you need.
[TAA]
They seem interchangeable...if you have a data structure you can just as easily write an iterator.  I'm curious if you see this as performance critical.  In my experience, GC times tend to be completely dominated by walking the live objects and not root set enumeration.

Well, I've always been a premature optimizer :) However, I use the same data structure type to describe both stack roots and heap objects - essentially I treat the stack frame as though it were another object (except that the offsets are usually negative).

Oh, one other minor misunderstanding I wanted to clear up: the "llvm.gcroot" intrinsic, as it exists today, is used to mark *values*, not types, as being roots. One of the things I am proposing is to make it so that the marking of roots is type-based rather than value-based. This would, among other benefits, allow function parameters to be roots.

[TAA] I would be in complete agreement here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110707/e8381ac5/attachment.html>