[llvm-commits] RFC: initial union syntax support

Wed May 20 22:55:51 PDT 2009

Hi Nick,

>>> Then it would seem I misunderstood the purpose of unions. I thought 
>>> the problem was that it was impossible to declare a type which would 
>>> be "as large as the largest of any of these" without having accurate 
>>> TargetData. The union type was supposed to do that and nothing more.
>>
>> I sent an example earlier showing that you can do this already without
>> union types.
> 
> Close. Your trick does perform a ptrtoint which requires knowing what
> int size is large enough. Fortunately in your case it's indexing off of
> null so it's very unlikely that it won't fit in 16 bits or less, but
> it's still not as good as a first-class union.

If the union is bigger than accessible memory, then you are not going to
be able to allocate one anyway.  Conclusion: doing GEP of null can be
assumed to not result in pointer overflow.  Thus the only problem is
ptrtoint.  You can do ptrtoint to i64 on all platforms, which solves
the problem since you can't alloca an amount that doesn't fit in i64
anyway.  That said, this whole technique is pretty ugly.
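
For comparison, here is a rough C++ analogue of what the size computation
yields.  It is a sketch only, not LLVM IR: the helper name max_size_of and
the Pair struct are made up for illustration, sizeof stands in for the GEP
off null, and widening to a 64-bit integer stands in for the ptrtoint to i64.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: size in bytes of the largest type in the list,
// widened to 64 bits (the analogue of ptrtoint ... to i64).
template <typename First, typename... Rest>
constexpr std::uint64_t max_size_of() {
    return std::max({static_cast<std::uint64_t>(sizeof(First)),
                     static_cast<std::uint64_t>(sizeof(Rest))...});
}

// Illustrative aggregate: two 32-bit integers, no padding.
struct Pair { std::int32_t a; std::int32_t b; };

// The "union" of char, i32 and Pair is as large as its largest member.
static_assert(max_size_of<char, std::int32_t, Pair>() == sizeof(Pair),
              "largest member determines the size");
```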

> I've been thinking about the original suggestion and the reasons I 
> objected to it. It seems that the original suggestion was to think about a 
> union as a structure where the offset into each element is zero instead 
> of being contiguous to each other. That makes the original proposal make 
> a whole lot more sense to me than it did originally.
> 
> Despite Chris' message to the contrary, I still think u{i32, i32} 
> shouldn't be allowed (rather, it should be folded to u{i32} by the 
> getter). We could provide an accessor that returns the element number 
> for a given Type* and the only drawback is that it means doing an extra 
> lookup through a small list. Allowing GEP makes sense, and unions should 
> certainly be first class aggregates.
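
To make the folding idea in the quoted paragraph concrete, here is a toy C++
sketch.  Everything in it is an assumption for illustration: the class
UnionType, its get and elementIndex members, and the use of strings in place
of Type* are not taken from any patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a type handle; real code would use Type*.
using TypeName = std::string;

class UnionType {
    std::vector<TypeName> elems_;
    explicit UnionType(std::vector<TypeName> e) : elems_(std::move(e)) {}

public:
    // Hypothetical getter: folds repeated element types, so that
    // u{i32, i32} becomes u{i32}.  First occurrences keep their order.
    static UnionType get(const std::vector<TypeName>& elems) {
        std::vector<TypeName> unique;
        for (const TypeName& t : elems)
            if (std::find(unique.begin(), unique.end(), t) == unique.end())
                unique.push_back(t);
        return UnionType(std::move(unique));
    }

    std::size_t numElements() const { return elems_.size(); }

    // Element number for a given type: a linear lookup through a small
    // list.  Returns -1 if the type is not a member of the union.
    int elementIndex(const TypeName& t) const {
        auto it = std::find(elems_.begin(), elems_.end(), t);
        return it == elems_.end() ? -1 : static_cast<int>(it - elems_.begin());
    }
};
```

The lookup is the "extra lookup through a small list" mentioned above; its
cost grows with the number of distinct element types.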

Another possibility is to not introduce new union types, but instead
to enhance the alloca instruction to take a list of types: it would
then allocate enough memory for all of the types in the list.  The
return type could be that of the first type in the list.
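
A rough C++ sketch of what such a multi-type alloca would amount to follows.
The name MultiAlloca and its members are hypothetical, and taking the maximum
alignment as well as the maximum size is an assumption on my part; the
proposal above only talks about size.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for "alloca T0, T1, ...": storage that is large
// enough (and, by assumption, aligned strictly enough) for every listed
// type, handed back as a pointer to the first type in the list.
template <typename First, typename... Rest>
struct MultiAlloca {
    static constexpr std::size_t size =
        std::max({sizeof(First), sizeof(Rest)...});
    static constexpr std::size_t align =
        std::max({alignof(First), alignof(Rest)...});

    alignas(align) unsigned char storage[size];

    // The "return value": a pointer typed as the first type in the list.
    First* first;

    MultiAlloca() : first(::new (static_cast<void*>(storage)) First()) {}

    // The stored pointer refers into this object, so copying is disabled.
    MultiAlloca(const MultiAlloca&) = delete;
    MultiAlloca& operator=(const MultiAlloca&) = delete;
};
```

Accessing the memory as one of the other types would then need a pointer
cast, much like a bitcast of the alloca result.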

Ciao,

Duncan.