[LLVMdev] ML types in LLVM

Sun Jun 14 12:33:59 PDT 2009

On Jun 13, 2009, at 5:14 PM, Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall <rjmccall at apple.com>
> wrote:
>> Logically, what you want is a distinct LLVM type for every ML union
>> type and each of its constructors.  Unfortunately, LLVM does
>> structural uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an
> 'opaque' one?

As mentioned, you can generate new opaque types, but obviously that
won't work for, say, distinguishing between separate constructors that
are structured identically.  If you're not planning to write any
LLVM-level language-specific optimizations, that probably doesn't
matter at all.  On the other hand, you were talking about alias
analysis, which generally involves writing a pass to inject
language-specific information.

>>  What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
>
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them.

More casts!  Although I'm curious why this limitation is in effect at
all; probably a consequence of some other overloaded use of address
spaces.

> Also I get the impression that this would be a pretty unsafe idea. ;)

Not particularly less safe than all the other unsafe casts you're
planning to use.

>>> Is there any way to express that a pointer is actually a pointer to
>>> an interior element of a type? Something like %opt_33_in_heap =
>>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr?  No, sorry.  That would be a
>> pretty nice extension, though obviously unsound, of course.
>
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.

Okay.  You are right, there is no way to express this in the type
system, and that is very unlikely to change.
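
For comparison only, here is how the same situation looks in C, where
the usual answer is also arithmetic rather than types.  This is a rough
sketch, not anything LLVM provides: the Object layout and the
object_from_fields helper are made up for illustration.  Note that the
parameter type (uint32_t *) says nothing about pointing into an Object;
that knowledge lives only in the offset computation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a header word followed by the fields that the
 * rest of the program actually holds a pointer to. */
typedef struct {
    uint32_t header;
    uint32_t fields[2];
} Object;

/* Recover the enclosing Object from a pointer to its 'fields' member.
 * The caller must guarantee the pointer really came from an Object;
 * nothing in the type system checks this. */
static Object *object_from_fields(uint32_t *fields) {
    assert(fields != NULL);
    return (Object *)((char *)fields - offsetof(Object, fields));
}
```

The cast through char * is what makes the byte-offset subtraction
well defined; the result is only meaningful if the interior pointer was
derived from a real Object.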

>> Personally, I would create a struct type (hereafter "HeaderType")
>> for the entire GC header;  when you want to access a header field,
>> just cast the base pointer to i8*, subtract the allocation size of
>> HeaderType, cast the result to HeaderType*, and getelementptr from
>> there.
>
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?

It would only really affect a type-based alias analysis, and there's no
cookie-cutter version of that;  you would need to write your own AA
pass, which could then easily recognize the pattern of accessing the
header.
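
To make the access pattern concrete, here it is sketched in C rather
than LLVM IR, so treat it as an analogy.  It assumes the header is a
single 32-bit word (as you said yours is) stored immediately before the
address the object pointer refers to; alloc_object, header_of, and
free_object are hypothetical names, not part of any runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical GC header: one 32-bit word, per the thread. */
typedef struct {
    uint32_t tag;
} HeaderType;

/* Allocate header plus payload and return a pointer to the payload,
 * i.e. the address just past the header.  With a 4-byte header the
 * payload is only 4-byte aligned; a real runtime would pad as needed. */
static void *alloc_object(uint32_t tag, size_t payload_size) {
    char *raw = malloc(sizeof(HeaderType) + payload_size);
    assert(raw != NULL);
    ((HeaderType *)raw)->tag = tag;
    return raw + sizeof(HeaderType);
}

/* The recipe from the quoted text: cast the base pointer to a byte
 * pointer, subtract the size of HeaderType, cast to HeaderType*. */
static HeaderType *header_of(void *payload) {
    char *bytes = (char *)payload;
    return (HeaderType *)(bytes - sizeof(HeaderType));
}

/* Release the whole allocation, which starts at the header. */
static void free_object(void *payload) {
    free(header_of(payload));
}
```

A custom alias-analysis pass would be looking for the IR equivalent of
header_of: a byte-pointer cast, a constant negative offset, and a cast
to the header type.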

> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

The bitcast is a decent workaround, but the real question is why you
need a no-op at all;  if you're doing it to provide a hook for
optimizer information, a call is probably a better idea.

John.