[LLVMdev] ML types in LLVM

Wesley W. Terpstra wesley at terpstra.ca
Sat Jun 13 03:54:06 PDT 2009

Good afternoon!

I'm trying to write an LLVM codegen for a Standard ML compiler
(MLton). So far things seem to match up quite nicely, but I have hit
two sticking points. I'm hoping LLVM experts might know how to handle
these two cases better.

1: In ML we have some types that are actually one of several possible
types. Expressed in C this might be thought of as a union. The codegen
only ever accesses these 'union types' via pointer. Before actually
indexing into the type, it always casts from the 'union pointer type'
to a specific pointer type.

As a concrete example, I have two types, %a and %b. I want to express
a third type, %c, that is either %a* or %b*. Later I'll cast the %c to
either %a* or %b*.

Currently I just represent %c as i8*. I assume that this can have
consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
like that. Is there any way to better represent the type %c to LLVM?
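
For readers who think in C, here is a minimal sketch of the pattern described above. The struct layouts, the names `c_ptr`, `read_tag`, and `value_of`, and the leading tag field are all assumptions made for illustration; the post does not say how the codegen knows which variant it holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two ML object layouts %a and %b. */
struct a { int tag; int value; };
struct b { int tag; double value; };

/* %c: an untyped pointer that is really a struct a* or a struct b*. */
typedef void *c_ptr;

/* Assumption for this sketch only: a leading tag identifies the variant. */
enum { TAG_A = 0, TAG_B = 1 };

static int read_tag(c_ptr p) {
    /* Both variants begin with an int, so a pointer to either struct
       may be converted to a pointer to that first member. */
    return *(int *)p;
}

static double value_of(c_ptr p) {
    /* Cast to the specific pointer type before indexing into the object. */
    if (read_tag(p) == TAG_A) {
        return (double)((struct a *)p)->value;
    }
    assert(read_tag(p) == TAG_B);
    return ((struct b *)p)->value;
}
```

The `void *` here plays the role that i8* plays in the current codegen: it carries no information about the pointee until the cast.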

2: In the ML heap we have objects that are garbage collected. Objects
are preceded by a header that describes the object to the garbage
collector. However, pointers to the objects point past the header and
at the actual object. Sometimes, however, the program itself accesses
the header, for example to determine the length of an array (the
length is in the header). I output every type like this:

%opt_33 = { i32, %opt_45*, float }

I could also create another type which includes the header, something like:

%opt_33_with_header = { i32, %opt_33 }

Is there any way to express that a pointer is actually a pointer to an
interior element of a type? Something like %opt_33_in_heap =
%opt_33_with_header:1 ?

Currently, when I want to read the header of an %opt_33, I cast the
pointer to an i32* and then use getelementptr with an index of -1. Is
there a better way?
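
In C terms, the "pointer to an interior element" idea can be sketched with `offsetof`. This is only an illustration of the intent, not a claim about what LLVM supports; the field names and the helper `header_of` are hypothetical, and the structs merely mirror %opt_33 and %opt_33_with_header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload, mirroring %opt_33 = { i32, %opt_45*, float }. */
struct opt_33 { int32_t field0; void *field1; float field2; };

/* Mirrors %opt_33_with_header = { i32, %opt_33 }. */
struct opt_33_with_header { int32_t header; struct opt_33 object; };

/* Given a pointer to the interior object, recover the enclosing
   header-plus-object struct and return the address of its header. */
static int32_t *header_of(struct opt_33 *obj) {
    char *base = (char *)obj - offsetof(struct opt_33_with_header, object);
    return &((struct opt_33_with_header *)base)->header;
}
```

One design note: a C compiler may insert padding between `header` and `object`, in which case stepping back by one `int32_t` from the object pointer would not land on the header. The `offsetof` form accounts for whatever layout the enclosing struct actually has.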