[LLVMdev] [PATCH] - Union types, attempt 2

Mon Jan 18 13:11:58 PST 2010

> Using a union here (as opposed to using bitcast) solves a number of  
> problems:
>
> 1) The size of the struct is automatically calculated by taking the  
> largest field of the union. Without unions, your frontend would have  
> to calculate the size of each possible field, as well as their  
> alignment, and use that to figure the maximum structure size. If  
> your frontend is target-agnostic, you may not even know how to
> calculate the correct struct size.
>
> 2) The struct is small enough to be returned as a first-class SSA  
> value, and with a union you can use it directly. Since bitcast only  
> works on pointers, in order to use it you would have to alloca some  
> temporary memory to hold the function result, store the result into  
> it, then use a combination of GEP and bitcast to get a
> correctly-typed pointer to the second field, and finally load the value. With
> a union, you can simply extract the second field without ever having  
> to muck about with pointers and allocas.
>
> 3) The union provides an additional layer of type safety, since you  
> can only extract types which are declared in the union, and not any  
> arbitrary type that you could get with a bitcast. (Although I  
> consider this a relatively minor point since type safety isn't a  
> major concern in IR.)
>
> 4) It's possible that some future version of the optimizer could use  
> the additional type information provided by the union, which the
> bitcast does not carry. Perhaps an optimizer which knows that all of the
> union members are numbers and not pointers could make some  
> additional assumptions...
>
> 5) Something I forgot to mention - by allowing GEP and extractvalue  
> to work with unions, we can handle unions nested inside structs and
> vice versa with a single GEP instruction. This is my main argument  
> against having special instructions for dealing with unions.
>
> For example, in the case of { i1, union { float, i32 } }* we can use  
> a GEP with indices [0, 1, 0] to get access to the float field in a  
> single GEP instruction.
>
> So just as GEP allows chaining together operations on structs,  
> pointers and arrays, we can also chain them together with operations  
> on unions. This can be quite powerful I think.
>

Yes, this is all very compelling to me.  Beyond all this, we don't  
support bitcast of aggregate values.

-Chris
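
The size and nesting arguments in points 1 and 5 of the quoted message can be
illustrated outside of LLVM. What follows is only a rough sketch using Python's
ctypes, not the API from the patch. The names FloatOrInt, Tagged and
manual_union_size are hypothetical, c_bool stands in for i1, and the layout
shown is whatever the host C ABI produces rather than anything LLVM computes.

```python
import ctypes


class FloatOrInt(ctypes.Union):
    # Hypothetical stand-in for union { float, i32 }.
    _fields_ = [("f", ctypes.c_float), ("i", ctypes.c_int32)]


class Tagged(ctypes.Structure):
    # Hypothetical stand-in for { i1, union { float, i32 } }.
    # c_bool is an assumption; it is the closest ctypes analogue of i1.
    _fields_ = [("tag", ctypes.c_bool), ("payload", FloatOrInt)]


def manual_union_size(member_types):
    """Compute (size, alignment) by hand, as a frontend without union
    support would have to: largest member, rounded up to the strictest
    member alignment."""
    size = max(ctypes.sizeof(t) for t in member_types)
    align = max(ctypes.alignment(t) for t in member_types)
    padded = (size + align - 1) // align * align
    return padded, align


# Point 1: the union type already carries the size and alignment that
# would otherwise be computed per target by hand.
by_hand = manual_union_size([ctypes.c_float, ctypes.c_int32])
by_type = (ctypes.sizeof(FloatOrInt), ctypes.alignment(FloatOrInt))
print(by_hand == by_type)

# Point 5: one access path walks struct field 1, then union member 0,
# analogous to the GEP indices [0, 1, 0] in the example above.
value = Tagged()
value.tag = True
value.payload.f = 1.5

# Every union member starts at offset 0, so reading the other member
# reinterprets the same bits without any pointer casts.
print(hex(value.payload.i))
```

The by-hand calculation only agrees with the union type here because both run
against the same host ABI. A target-agnostic frontend would not have that
information available, which is the situation point 1 describes.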