[llvm-commits] RFC: initial union syntax support

Wed May 20 23:42:25 PDT 2009

Duncan Sands wrote:
> Hi Nick,
> 
>>>> Then it would seem I misunderstood the purpose of unions. I thought 
>>>> the problem was that it was impossible to declare a type which would 
>>>> be "as large as the largest of any of these" without having accurate 
>>>> TargetData. The union type was supposed to do that and nothing more.
>>>
>>> I sent an example earlier showing that you can do this already without
>>> union types.
>>
>> Close. Your trick does perform a ptrtoint which requires knowing what
>> int size is large enough. Fortunately in your case it's indexing off of
>> null so it's very unlikely that it won't fit in 16 bits or less, but
>> it's still not as good as a first-class union.
> 
> if the union is bigger than accessible memory, then you are not going to
> be able to allocate one anyway.  Conclusion: doing GEP of null can be
> assumed to not result in pointer overflow.  Thus the only problem is
> ptrtoint.  You can do ptrtoint to i64 on all platforms, which solves
> the problem since you can't alloca an amount that doesn't fit in i64
> anyway.  That said, this whole technique is pretty ugly.

Sure.
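
For reference, the "as large as the largest of any of these" requirement can be sketched in C++, where the compiler already knows the target layout. This is only an illustration of the sizing idea, not the GEP-of-null trick itself; Small, Large and largest_size are hypothetical names, and the result is held in a 64-bit integer on the assumption (from the argument above) that any allocatable size fits in i64.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical member types standing in for the elements of a union.
struct Small { std::int8_t tag; };
struct Large { double values[4]; };

// Size of the largest type in the list, computed at compile time.
// The result is widened to 64 bits, mirroring "ptrtoint to i64".
// Requires at least one type in the pack.
template <typename... Ts>
constexpr std::uint64_t largest_size() {
    return std::max({static_cast<std::uint64_t>(sizeof(Ts))...});
}

// Raw storage big enough for either member (alignment is ignored here).
unsigned char storage[largest_size<Small, Large>()];
```

A real union type would also have to pick a suitable alignment, which this sketch leaves out.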

>> I've been thinking about the original suggestion and the reasons I 
>> objected to it. It seems that the original suggestion was to think about 
>> a union as a structure where the offset into each element is zero 
>> instead of being contiguous to each other. That makes the original 
>> proposal make a whole lot more sense to me than it did originally.
>>
>> Despite Chris' message to the contrary, I still think u{i32, i32} 
>> shouldn't be allowed (rather, it should be folded to u{i32} by the 
>> getter). We could provide an accessor that returns the element number 
>> for a given Type* and the only drawback is that it means doing an 
>> extra lookup through a small list. Allowing GEP makes sense, and 
>> unions should certainly be first class aggregates.
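
A minimal sketch of the folding getter and the element-number accessor described above, assuming element types can be compared for equality. UnionSketch, get and indexOf are hypothetical names used for illustration only; types are represented as plain strings, and none of this is the actual LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a union type. Element types are named by
// strings such as "i32" purely for illustration.
class UnionSketch {
public:
    // Getter that folds duplicates, so u{i32, i32} becomes u{i32}.
    static UnionSketch get(const std::vector<std::string>& elements) {
        UnionSketch u;
        for (const std::string& e : elements)
            if (u.indexOf(e) < 0)
                u.elements_.push_back(e);
        return u;
    }

    // Element number for a given type, or -1 if the type is not a member.
    // This is the "extra lookup through a small list": a linear scan.
    int indexOf(const std::string& type) const {
        for (std::size_t i = 0; i < elements_.size(); ++i)
            if (elements_[i] == type)
                return static_cast<int>(i);
        return -1;
    }

    std::size_t size() const { return elements_.size(); }

private:
    std::vector<std::string> elements_;
};
```

Because every element sits at offset zero, the element number only selects which type to view the storage as; it does not change the address.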
> 
> another possibility is to not introduce new union types, but instead
> to enhance the alloca instruction to take a list of types : it would
> then allocate enough memory for all of the types in the list.  The
> return type could be that of the first type in the list.

That's only good for stack variables. It doesn't work for globals.

Nick