[LLVMdev] ML types in LLVM

Sat Jun 13 19:32:43 PDT 2009

Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall<rjmccall at apple.com> wrote:
>> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
>> Currently I just represent %c as i8*. I assume that this can have
>> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
>> like that. Is there any way to better represent the type %c to LLVM?
>>
>> I assume this is for tagged sums.
> 
> Yes.
> 
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors.  Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
> 
> Is there absolutely no way to generate a new type? Not even an 'opaque' one?

Each time you say "opaque" in a .ll (or call OpaqueType::get in the C++ 
API) you get yourself a new distinct opaque type.

It's not clear to me why opaque didn't work for you in the first place.
One thing to remember is that, because of the above, if you take an
opaque* and pass it to another function that takes an opaque*, you'll
get a type mismatch: you said opaque twice, so you have two distinct
types. Use "%c = type opaque" in the global space, then %c* to refer to
the same opaque type in multiple places. The other reason it might not
have worked is that you might've tried to dereference your opaque*,
thereby producing a value of type 'opaque', which isn't allowed.
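
A minimal sketch of the named-opaque approach, in the typed-pointer .ll
syntax of this era (current LLVM uses untyped 'ptr' and will not accept
it as written). The function names @make_c, @use_c and @pass_along are
hypothetical, chosen only for illustration:

```llvm
; One named opaque type, declared once in the global space.
%c = type opaque

; Both declarations mention %c*, so they agree on the same type.
declare %c* @make_c()
declare void @use_c(%c*)

define void @pass_along() {
entry:
  ; The pointer is only passed around, never dereferenced.
  %p = call %c* @make_c()
  call void @use_c(%c* %p)
  ret void
}
```

Writing "opaque*" directly in each signature instead would create a
fresh type at every mention, which is the mismatch described above.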

>>  What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
> 
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them. Also I get the impression that this would be
> a pretty unsafe idea. ;)
> 
>> Is there any way to express that a pointer is actually a pointer to an
>> interior element of a type? Something like %opt_33_in_heap =
>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr?  No, sorry.  That would be a
>> pretty nice extension, though obviously unsound, of course.
> 
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.
> 
> That way for normal access I punch down to the object part of the type
> and do my business. For access to the header, I just punch into that
> part of the type (which happens to involve a negative offset from the
> address). However, it seems that LLVM pointers always have to point to
> the start of an object.
> 
>> Personally, I would create a struct type (hereafter "HeaderType") for the
>> entire GC header;  when you want to access a header field, just cast the
>> base pointer to i8*, subtract the allocation size of HeaderType, cast the
>> result to HeaderType*, and getelementptr from there.
> 
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?
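
The header access discussed above could be sketched roughly as follows,
again in the typed-pointer syntax of this era. The types %header and
%obj, the function @read_header, and the hard-coded offset of -4 (the
assumed allocation size of an i32 header) are all illustrative
assumptions, not taken from the thread:

```llvm
; Assumed layouts: a one-word header placed directly before the object.
%header = type { i32 }
%obj = type { i32, i32 }

define i32 @read_header(%obj* %p) {
entry:
  ; Cast the object pointer to i8* so the offset is counted in bytes.
  %raw = bitcast %obj* %p to i8*
  ; Step back over the header (4 bytes assumed here).
  %hraw = getelementptr i8* %raw, i32 -4
  ; Reinterpret the address as a header and load its first field.
  %h = bitcast i8* %hraw to %header*
  %f = getelementptr %header* %h, i32 0, i32 0
  %v = load i32* %f
  ret i32 %v
}
```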
> 
> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

There is none; using a bitcast is the workaround. LLVM's optimizers will
clean it up.
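
For illustration, the bitcast-as-copy idiom looks like this in the
typed-pointer syntax of this era; @copy_ptr is a hypothetical name:

```llvm
define i32* @copy_ptr(i32* %y) {
entry:
  ; Stands in for '%x = %y': a bitcast from a type to itself.
  %x = bitcast i32* %y to i32*
  ret i32* %x
}
```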

Nick