[LLVMdev] ML types in LLVM

Sat Jun 13 17:14:19 PDT 2009

On Sat, Jun 13, 2009 at 9:44 PM, John McCall <rjmccall at apple.com> wrote:
> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
>> Currently I just represent %c as i8*. I assume that this can have
>> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
>> like that. Is there any way to better represent the type %c to LLVM?
>
> I assume this is for tagged sums.

Yes.

> Logically, what you want is a distinct LLVM type for every ML union type
> and each of its constructors.  Unfortunately, LLVM does structural
> uniquing of types, so that won't work.

Is there absolutely no way to generate a new type? Not even an 'opaque' one?

> What you can do is abuse address
> spaces, giving every distinct type its own address space and casting
> back and forth between address spaces as necessary.

The manual indicates that only addresses in space 0 can have GC
intrinsics used on them. Also I get the impression that this would be
a pretty unsafe idea. ;)

>> Is there any way to express that a pointer is actually a pointer to an
>> interior element of a type? Something like %opt_33_in_heap =
>> %opt_33_with_header:1 ?
>
> Something like an ungetelementptr?  No, sorry.  That would be a
> pretty nice extension, though obviously unsound, of course.

Well, ungetelementptr could be nice, but I was hoping for something
even better: a way to refer to the whole object type (including the
header) even though my pointer doesn't point to the start of the
object. I.e., this is a pointer to 8 bytes past the start of an object of type X.

That way for normal access I punch down to the object part of the type
and do my business. For access to the header, I just punch into that
part of the type (which happens to involve a negative offset from the
address). However, it seems that LLVM pointers always have to point to
the start of an object.
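
For illustration only, here is roughly what I mean, sketched in C rather than
LLVM IR. The struct layout and the names obj_with_header and whole_object are
made up for this example; the real header and payload types would differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 4-byte header followed by the payload that
 * ordinary object pointers refer to. */
struct obj_with_header {
    uint32_t header;      /* GC header word */
    int32_t  payload[2];  /* the part normal accesses touch */
};

/* Recover the enclosing object from a pointer to its payload.
 * This is the "negative offset" step: walk back from the interior
 * pointer by the payload's offset within the whole object. */
static struct obj_with_header *whole_object(int32_t *payload)
{
    assert(payload != NULL);
    return (struct obj_with_header *)
        ((char *)payload - offsetof(struct obj_with_header, payload));
}
```

Normal accesses use the payload pointer directly; only header accesses go
through whole_object(p)->header.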

> Personally, I would create a struct type (hereafter "HeaderType") for the
> entire GC header;  when you want to access a header field, just cast the
> base pointer to i8*, subtract the allocation size of HeaderType, cast the
> result to HeaderType*, and getelementptr from there.

That's what I'm doing right now; the HeaderType happens to be i32. ;)
I am assuming that casting in and out of i8* will cost me in terms of
the optimizations LLVM can apply?

Also, I couldn't find a no-op instruction in LLVM. In some places it
would be convenient to say: '%x = %y'. For the moment I'm doing a
bitcast from the type back to itself, which is rather awkward.