[LLVMdev] [PATCH] - Union types, attempt 2
Chris Lattner
clattner at apple.com
Mon Jan 18 13:11:58 PST 2010
> Using a union here (as opposed to using bitcast) solves a number of
> problems:
>
> 1) The size of the struct is automatically calculated by taking the
> largest field of the union. Without unions, your frontend would have
> to calculate the size of each possible field, as well as their
> alignment, and use that to figure out the maximum structure size. If
> your frontend is target-agnostic, you may not even know how to
> calculate the correct struct size.
>
> 2) The struct is small enough to be returned as a first-class SSA
> value, and with a union you can use it directly. Since bitcast only
> works on pointers, in order to use it you would have to alloca some
> temporary memory to hold the function result, store the result into
> it, then use a combination of GEP and bitcast to get a correctly-
> typed pointer to the second field, and finally load the value. With
> a union, you can simply extract the second field without ever having
> to muck about with pointers and allocas.
>
> 3) The union provides an additional layer of type safety, since you
> can only extract types which are declared in the union, and not any
> arbitrary type that you could get with a bitcast. (Although I
> consider this a relatively minor point since type safety isn't a
> major concern in IR.)
>
> 4) It's possible that some future version of the optimizer could use
> the additional type information provided by the union which the
> bitcast does not. Perhaps an optimizer which knows that all of the
> union members are numbers and not pointers could make some
> additional assumptions...
>
> 5) Something I forgot to mention - by allowing GEP and extractvalue
> to work with unions, we can handle unions nested inside structs and
> vice versa with a single GEP instruction. This is my main argument
> against having special instructions for dealing with unions.
>
> For example, in the case of { i1, union { float, i32 } }* we can use
> a GEP with indices [0, 1, 0] to get access to the float field in a
> single GEP instruction.
>
> So just as GEP allows chaining together operations on structs,
> pointers and arrays, we can also chain them together with operations
> on unions. This can be quite powerful, I think.
>
Yes, this is all very compelling to me. Beyond all this, we don't
support bitcast of aggregate values.
-Chris
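
As a rough illustration of quoted points 1, 2 and 5, here is a minimal
sketch in C rather than LLVM IR. It assumes a C11 compiler and a typical
target where a union is padded only up to its strictest member alignment.
The names payload, tagged, payload_size_by_hand, float_field_offset and
make_float are hypothetical and do not come from the patch. The C types
only approximate the IR type { i1, union { float, i32 } }.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of the IR union { float, i32 }. */
union payload {
    float   f;
    int32_t i;
};

/* Hypothetical C analogue of the IR struct { i1, union { float, i32 } }. */
struct tagged {
    bool          tag;
    union payload value;
};

/* With a union type, the compiler guarantees room for the largest member. */
static_assert(sizeof(union payload) >= sizeof(int32_t),
              "union must hold its largest member");

static size_t max_size(size_t a, size_t b) { return a > b ? a : b; }

/* Point 1: what a frontend would have to compute by hand without unions:
 * the largest member size, rounded up to the strictest member alignment.
 * Assumption: the target adds no padding beyond that. */
size_t payload_size_by_hand(void) {
    size_t size  = max_size(sizeof(float), sizeof(int32_t));
    size_t align = max_size(alignof(float), alignof(int32_t));
    return (size + align - 1) / align * align;
}

/* Point 5: one chained address computation reaches the float nested inside
 * the union inside the struct, much like a GEP with indices [0, 1, 0]. */
size_t float_field_offset(void) {
    return offsetof(struct tagged, value) + offsetof(union payload, f);
}

/* Point 2: a small aggregate is returned by value and the caller reads the
 * field directly, with no temporary storage or pointer casts of its own. */
struct tagged make_float(float x) {
    struct tagged t;
    t.tag = true;
    t.value.f = x;
    return t;
}
```

A caller can write make_float(1.5f).value.f to read the field straight
from the returned value, which is the C-level counterpart of extracting
the second field with extractvalue.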