[LLVMdev] getting closer!

Tue Apr 22 12:20:42 PDT 2008

On Apr 21, 2008, at 6:23 PM, Gordon Henriksen wrote:

> On Apr 21, 2008, at 20:09, Terence Parr wrote:
>
>> Ok, I *might* be getting this from the assembly code.  ...  From
>> that, it will push/pop in functions?  If so, that's easy enough. :)
>
> Yup! Sounds like you've got it.

Yup, what I was missing, and what somebody should add to the doc, is
that "shadow-stack" adds a preamble/postamble snippet to each function
that must bind with

StackEntry *llvm_gc_root_chain;

wherever you choose to define it.  I put it in my GC.c file.

Further, that shadow-stack snippet generation assumes the following  
structures for tracking roots:

typedef struct FrameMap FrameMap;
struct FrameMap {
   int32_t NumRoots; // Number of roots in stack frame.
   int32_t NumMeta;  // Number of metadata descriptors. May be < NumRoots.
   void *Meta[];     // May be absent for roots without metadata.
};

typedef struct StackEntry StackEntry;
struct StackEntry {
   StackEntry *Next;       // Caller's stack entry.
   const FrameMap *Map;    // Pointer to constant FrameMap.
   void *Roots[];          // Stack roots (in-place array).
};

The doc says the compiler and runtime must agree, but not what the
structs are.  Seems like those few lines above would make everything
clear.  I don't have write access to svn, but I plan on a big chapter
full of ANTLR -> LLVM examples in my DSL problem solving book.
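
A minimal sketch of how a collector in GC.c might walk the chain using the two structs above. The names visit_gc_roots, RootVisitor, and count_live_root are made up for illustration. It also assumes that the roots carrying metadata occupy the first NumMeta slots of Roots[], which is how I read the FrameMap comments; check that against the LLVM GC documentation before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct FrameMap FrameMap;
struct FrameMap {
    int32_t NumRoots;  /* Number of roots in stack frame. */
    int32_t NumMeta;   /* Number of metadata descriptors. */
    void *Meta[];      /* One entry per root that has metadata. */
};

typedef struct StackEntry StackEntry;
struct StackEntry {
    StackEntry *Next;     /* Caller's stack entry. */
    const FrameMap *Map;  /* Pointer to constant FrameMap. */
    void *Roots[];        /* Stack roots (in-place array). */
};

/* The symbol the generated per-function snippets bind to. */
StackEntry *llvm_gc_root_chain;

/* Hypothetical callback: receives the root slot, its metadata
   (NULL when the root has none), and a caller-supplied context. */
typedef void (*RootVisitor)(void **root, void *meta, void *ctx);

/* Walks every frame on the shadow stack and visits each root once. */
void visit_gc_roots(RootVisitor visit, void *ctx) {
    for (StackEntry *e = llvm_gc_root_chain; e != NULL; e = e->Next) {
        const FrameMap *map = e->Map;
        assert(map->NumMeta <= map->NumRoots);
        int32_t i = 0;
        /* Assumed layout: roots [0, NumMeta) have a metadata entry. */
        for (; i < map->NumMeta; ++i)
            visit(&e->Roots[i], map->Meta[i], ctx);
        /* Roots [NumMeta, NumRoots) have no metadata. */
        for (; i < map->NumRoots; ++i)
            visit(&e->Roots[i], NULL, ctx);
    }
}

/* Example visitor: counts roots that currently hold a non-null pointer. */
void count_live_root(void **root, void *meta, void *ctx) {
    (void)meta;
    if (*root != NULL)
        ++*(int *)ctx;
}
```

A mark phase would pass its own visitor in place of count_live_root; the visitor gets the address of the slot so that a moving collector could also update the root in place.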

>> What I was/am missing is the explicit link between types and
>> variables in a GC.c file and the generated machine code.  If I can
>> get that last explicit link, I'm off to the races.
>
> You mean, how do you know what sort of object you're tracing?

I assumed that I needed to generate my object maps or at least a list
of pointers for each object type.  Related to that, I have two
important questions:

1. How do I know the offset (due to alignment/padding by LLVM) of a
pointer within an object that uses a {...} struct type?  The GEP
instruction gets an address, of course, but how does my C collector
compute these offsets?  Do I need to make a metadata struct and fill
it with GEP instructions?  I guess that makes sense.

2. How do I know how big a struct is?  I have my gc_allocate() method
but I have no idea how big the struct will be; I see no sizeof.
Alignment makes it unclear how big something is; it's >= the total
size of its elements, like i32, but how much bigger than a packed
struct is it?  I.e.,

%struct.A = type { i32, [10 x i32]* }

define void @foo() gc "shadow-stack" {
     %s = alloca %struct.A ; this knows how big struct.A is
     %a = call i32* @llvm_gc_allocate(i32 11) ; this does not know. is it 11 or more?
     ret void
}
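
To make the two questions concrete, here is a hedged sketch of the kind of per-type descriptor the C collector would need: the padded size of the struct plus the byte offset of every pointer field. TypeInfo, A_info, and pointer_field are hypothetical names. The sketch mirrors %struct.A as a C struct and uses sizeof/offsetof, which only gives the right numbers if the C compiler and LLVM lay the struct out identically on the target; that is an assumption. On the LLVM side, I believe the same values can be computed as constants with a getelementptr off a null pointer followed by ptrtoint, but verify that before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* C mirror of %struct.A = { i32, [10 x i32]* }.
   Assumption: same layout as LLVM produces for this target. */
typedef struct A {
    int32_t x;
    int32_t (*arr)[10];
} A;

/* Hypothetical per-type descriptor the collector could consult. */
typedef struct TypeInfo {
    size_t size;            /* Bytes to allocate, padding included. */
    size_t num_pointers;    /* Number of pointer fields to trace. */
    const size_t *offsets;  /* Byte offset of each pointer field. */
} TypeInfo;

static const size_t A_pointer_offsets[] = { offsetof(A, arr) };

const TypeInfo A_info = {
    sizeof(A),
    sizeof A_pointer_offsets / sizeof A_pointer_offsets[0],
    A_pointer_offsets
};

/* Returns the address of the i-th pointer field inside obj. */
void **pointer_field(void *obj, const TypeInfo *info, size_t i) {
    return (void **)((char *)obj + info->offsets[i]);
}
```

With a descriptor like this, the allocation call would pass A_info.size (or a pointer to A_info) instead of a hand-counted 11.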

> You've
> got 3 options here…
>
> • If you have a type tree (as in Java or .NET), you can assume that
> every root starts with a pointer to object metadata, which should
> naturally include GC tracing information.

That's what I plan on.

> • If you have a type forest (as in C or C++) with optional vtables,
> then no such assumption is possible, and you can include type layout
> information in the %metadata parameter to @llvm.gcroot. The FrameMap
> type includes this data.

Ok, so I pass it an arbitrary struct pointer and it just gives it back  
later for me to peruse, right?

>
> • You can tag values, as in lisp or many functional languages. (e.g.,
> integer values have the low bit set, pointers do not.) All fields in a
> block must be of a uniform size, and you'll still need to know how
> many words in a block.

Good to know.
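
For reference, a minimal sketch of the tagging scheme described in the last bullet, using the convention given there (integers have the low bit set, pointers do not). Value, Block, box_int, and the other names are hypothetical. It assumes pointers are at least 2-byte aligned and that right-shifting a negative signed integer is an arithmetic shift, which is implementation-defined in C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One machine word that holds either a tagged integer or a pointer. */
typedef uintptr_t Value;

static inline bool is_int(Value v)     { return (v & 1u) != 0; }
static inline bool is_pointer(Value v) { return (v & 1u) == 0; }

/* Shift the integer left one bit and set the low bit as the tag. */
static inline Value box_int(intptr_t n) {
    return ((uintptr_t)n << 1) | 1u;
}

/* Assumes arithmetic right shift for negative values. */
static inline intptr_t unbox_int(Value v) {
    return (intptr_t)v >> 1;
}

/* Hypothetical block layout: a word count followed by uniform fields. */
typedef struct Block {
    size_t num_words;
    Value words[];
} Block;

/* Counts the non-null pointer fields a collector would have to trace. */
size_t count_pointer_fields(const Block *b) {
    size_t n = 0;
    for (size_t i = 0; i < b->num_words; ++i)
        if (is_pointer(b->words[i]) && b->words[i] != 0)
            ++n;
    return n;
}
```

The collector needs no per-type layout here, only num_words, because every field can be classified by looking at its low bit.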

> This decision is completely agnostic to the decision to use the shadow
> stack, or something more efficient.

Yup, makes sense.
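
For the type-tree option, a rough sketch of what "every root starts with a pointer to object metadata" could look like in the collector. ObjectMeta, ObjectHeader, Node, and mark_object are hypothetical names, and the mark flag stored in the header is just one possible design, not something LLVM prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-type metadata with GC tracing information. */
typedef struct ObjectMeta {
    size_t size;                    /* Object size in bytes. */
    size_t num_pointers;            /* Number of pointer fields. */
    const size_t *pointer_offsets;  /* Byte offset of each pointer field. */
} ObjectMeta;

/* Every heap object begins with this header. */
typedef struct ObjectHeader {
    const ObjectMeta *meta;  /* First word: pointer to object metadata. */
    int marked;              /* Mark bit for a mark/sweep collector. */
} ObjectHeader;

/* Marks obj and everything reachable from it. Recursive for brevity;
   a real collector would likely use an explicit work list. */
void mark_object(ObjectHeader *obj) {
    if (obj == NULL || obj->marked)
        return;
    obj->marked = 1;
    const ObjectMeta *m = obj->meta;
    for (size_t i = 0; i < m->num_pointers; ++i) {
        void *child;
        /* Copy the field out rather than casting, to avoid aliasing issues. */
        memcpy(&child, (char *)obj + m->pointer_offsets[i], sizeof child);
        mark_object((ObjectHeader *)child);
    }
}

/* Example object type: a linked-list node with one pointer field. */
typedef struct Node {
    ObjectHeader header;
    struct Node *next;
    int value;
} Node;

static const size_t Node_offsets[] = { offsetof(Node, next) };
const ObjectMeta Node_meta = { sizeof(Node), 1, Node_offsets };
```

A root visitor for the shadow stack would then just call mark_object on each non-null root, ignoring the %metadata argument entirely.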

Sorry for the long questions...gotta figure this out.

Ter