[LLVMdev] Suggestions for improvements to the GC docs.

Gordon Henriksen gordonhenriksen at me.com
Thu Mar 5 18:05:50 PST 2009


Hi Talin,

Thanks for the feedback. My comments inline.

On 2009-03-05, at 16:01, Talin wrote:

> I'm re-re-reading the "Accurate Garbage Collection with LLVM", and  
> I'm realizing that there are some parts of this document I find  
> confusing.
>
> 1) I think that the term 'stack map' should be defined more  
> precisely. For example, in one place it says "LLVM automatically  
> computes a stack map", and elsewhere it says "The compiler plugin is  
> responsible for generating code which conforms to the binary  
> interface defined by library, most essentially the stack map". At  
> first glance, this seems contradictory - who is generating the stack  
> map, LLVM, or the compiler plugin? The problem is that the words  
> "stack map" are being used to refer to two different things, one  
> which is computed by LLVM, the other generated by the compiler plugin.

There is a conflation of terms there, yes. I've tried to clarify, but  
they are merely different representations of the same information.

> Also, a definition of what a stack map is, and what it contains,  
> would be helpful.

I've added more detail to the 'Computing stack maps' section.
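
As a rough illustration only of the kind of information involved (this is not LLVM's actual representation; every struct, field, and function name below is made up for the sketch), a stack map can be pictured as a per-function table giving the frame offsets of the root slots plus the code addresses of the safe points:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
typedef struct RootSlot {
    int32_t frame_offset;   /* byte offset of the root within the frame */
    const void *meta;       /* opaque per-root metadata, may be NULL */
} RootSlot;

typedef struct FunctionMap {
    const RootSlot *roots;
    size_t num_roots;
    const uintptr_t *safe_points;   /* code addresses where GC may run */
    size_t num_safe_points;
} FunctionMap;

/* Find the map whose safe points include the given return address. */
static const FunctionMap *find_map(const FunctionMap *maps, size_t num_maps,
                                   uintptr_t return_addr) {
    for (size_t i = 0; i < num_maps; i++)
        for (size_t j = 0; j < maps[i].num_safe_points; j++)
            if (maps[i].safe_points[j] == return_addr)
                return &maps[i];
    return NULL;
}

/* Write the address of each root slot in one frame into out[];
 * returns how many were written (at most cap). */
static size_t collect_roots(const FunctionMap *map, unsigned char *frame_base,
                            void ***out, size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < map->num_roots && n < cap; i++)
        out[n++] = (void **)(frame_base + map->roots[i].frame_offset);
    return n;
}
```

A collector's stack crawler would look up each frame's return address with something like find_map and then hand the collected slots to its marking code.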

> 2) It says LLVM does not address "discovery or registration of stack  
> frame descriptors." I'm not sure what is meant by a "stack frame  
> descriptor". Is this the same as a stack map?

It is. I've expunged the term "descriptor" from the document for  
consistency.

> 3) The shadow-stack collector is presented as "an easy way to get  
> started". However, it doesn't give many hints as to what the next  
> logical evolutionary step would be - what method would you normally  
> use for crawling the various stack frames if you didn't want to pay  
> the performance penalty of maintaining the shadow stack? Something  
> like libunwind perhaps?

I've added some links to the plugin section at the conclusion here;  
hopefully that scratches the itch. I don't think LLVM should attempt  
to prescribe a GC growth path, nor should this document attempt to be  
a substitute for domain knowledge.

> 4) A more general question: In the barrier intrinsics, is there any  
> constraint on the values that the object pointer and the derived  
> pointer can take?

There is no constraint on the relationship between the object and  
derived pointer parameters except as the plugin requires. I've made  
this explicit. There is currently no benefit to using gcread/gcwrite  
over emitting the barrier code directly up front.
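
A barrier emitted directly by the front end might look roughly like the C sketch below. It assumes a card-marking collector, which is only one of many possible designs; the toy heap, the card table, and the name write_barrier are all hypothetical. The parameters mirror the (value, object, derived) shape of llvm.gcwrite, and the object pointer is deliberately unused to show that nothing constrains its relationship to the derived pointer unless the collector says so.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy heap of pointer-sized slots, split into cards. */
enum { HEAP_SLOTS = 1024, SLOTS_PER_CARD = 64 };

static void *heap[HEAP_SLOTS];
static unsigned char card_table[HEAP_SLOTS / SLOTS_PER_CARD];

/* Store value into *derived and mark the card containing that slot.
 * object is accepted but ignored: this particular barrier only cares
 * about where the written slot lives. */
static void write_barrier(void *value, void *object, void **derived) {
    (void)object;
    assert(derived >= heap && derived < heap + HEAP_SLOTS);
    *derived = value;
    card_table[(size_t)(derived - heap) / SLOTS_PER_CARD] = 1;
}
```

A barrier that needs the object header (for example, to test a generation bit) would use the object parameter instead, and could then impose its own rules on how the two pointers relate.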

> In particular, I am thinking about the case where you have an  
> object, such as an ArrayList, in which there are two separate non- 
> contiguous allocations: A fixed-length "header" part, and a variable- 
> length "tail" part. Assume that the header part uses type tags, but  
> the tail part is just a raw data buffer. From a GC perspective, it  
> may be simplest to treat these as a single object, such that only  
> the head part gets added to the work queue, and both are traced by  
> the same tracer function. This implies, however, that when calling  
> gcwrite(), the 'derived' pointer might be in a different memory  
> block than the 'object' pointer, and may even be located at a  
> negative offset from it.

Java and .NET treat the list header as an object which contains a  
reference to a (fixed-length) array object, i.e., two separate objects.  
On the principle of least surprise, I would recommend sticking to this  
model if you're creating building blocks.
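
The two-object model can be sketched in C as follows. This is a minimal illustration under assumed conventions (a tag word at the start of every object, malloc-backed allocation, and a visitor-style tracer); none of the type or function names come from LLVM, Java, or .NET.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { TAG_LIST, TAG_ARRAY } Tag;

/* Every heap object starts with a type tag (assumed convention). */
typedef struct Object { Tag tag; } Object;

/* The "tail": a fixed-length array that is an object in its own right. */
typedef struct Array {
    Object base;
    size_t length;
    Object *elems[];        /* flexible array member */
} Array;

/* The "header": holds an ordinary reference to the array object. */
typedef struct List {
    Object base;
    size_t count;
    Object *storage;        /* always points at a TAG_ARRAY object */
} List;

static Array *array_new(size_t length) {
    Array *a = calloc(1, sizeof *a + length * sizeof a->elems[0]);
    if (!a) return NULL;
    a->base.tag = TAG_ARRAY;
    a->length = length;
    return a;
}

static List *list_new(size_t capacity) {
    List *l = calloc(1, sizeof *l);
    Array *a = array_new(capacity);
    if (!l || !a) { free(l); free(a); return NULL; }
    l->base.tag = TAG_LIST;
    l->storage = &a->base;
    return l;
}

typedef void (*Visitor)(Object **slot, void *ctx);

/* Report the outgoing reference slots of a single object. */
static void trace(Object *obj, Visitor visit, void *ctx) {
    switch (obj->tag) {
    case TAG_LIST: {
        List *l = (List *)obj;
        if (l->storage) visit(&l->storage, ctx);
        break;
    }
    case TAG_ARRAY: {
        Array *a = (Array *)obj;
        for (size_t i = 0; i < a->length; i++)
            if (a->elems[i]) visit(&a->elems[i], ctx);
        break;
    }
    }
}

/* Example visitor: counts the slots it is shown. */
static void count_visit(Object **slot, void *ctx) {
    (void)slot;
    ++*(size_t *)ctx;
}
```

With this layout a store into the element buffer names the array as its object, so the derived pointer always falls inside the object it is paired with, and each of the two objects is queued and traced on its own.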

— Gordon


