[LLVMdev] [PATCH] - Union types, attempt 2

Talin viridia at gmail.com
Fri Jan 15 15:19:40 PST 2010

On Fri, Jan 15, 2010 at 3:13 PM, Talin <viridia at gmail.com> wrote:

> On Fri, Jan 15, 2010 at 11:02 AM, Dan Gohman <gohman at apple.com> wrote:
>> On Jan 13, 2010, at 12:11 PM, Talin wrote:
>> >
>> > It depends on whether unions can be passed around as SSA values.
>> > I can think of situations where you would want to.
>>
>> I'm skeptical that you *really* want to (i.e. that you wouldn't
>> be better off just writing helper functions in your front-end
>> which do the addressing and load/store and then moving on).
>> But, I'm not really interested in getting in the way here.
>
> Let me give you a use case then:
> Say I have a function which returns either a floating-point number or an
> error code (like divide by zero or something). The way that I would
> represent this return result is:
>    { i1, union { float, i32 } }
> In other words, what we have is a small struct that contains a one-bit
> discriminator field, followed by a union of float and i32. The discriminator
> field tells us what type is stored in the union - 0 = float, 1 = i32, so
> this is a typical 'tagged' union. (We can also have untagged or "C-style"
> unions, as long as the programmer has some other means of knowing what type
> is stored in the union.)
> Using a union here (as opposed to using bitcast) solves a number of
> problems:
> 1) The size of the struct is automatically calculated by taking the largest
> field of the union. Without unions, your frontend would have to calculate
> the size of each possible field, as well as their alignment, and use that to
> figure the maximum structure size. If your front-end is target-agnostic, you
> may not even know how to calculate the correct struct size.
> 2) The struct is small enough to be returned as a first-class SSA value,
> and with a union you can use it directly. Since bitcast only works on
> pointers, in order to use it you would have to alloca some temporary memory
> to hold the function result, store the result into it, then use a
> combination of GEP and bitcast to get a correctly-typed pointer to the
> second field, and finally load the value. With a union, you can simply
> extract the second field without ever having to muck about with pointers and
> allocas.
> 3) The union provides an additional layer of type safety, since you can
> only extract types which are declared in the union, and not any arbitrary
> type that you could get with a bitcast. (Although I consider this a
> relatively minor point since type safety isn't a major concern in IR.)
> 4) It's possible that some future version of the optimizer could use the
> additional type information provided by the union which the bitcast does
> not. Perhaps an optimizer which knows that all of the union members are
> numbers and not pointers could make some additional assumptions...

5) Something I forgot to mention - by allowing GEP and extractvalue to work
with unions, we can handle unions nested inside structs and vice versa with
a single GEP instruction. This is my main argument against having special
instructions for dealing with unions.

For example, given a pointer of type { i1, union { float, i32 } }*, a GEP
with indices [0, 1, 0] yields the address of the float field in a single
instruction.
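
To make the index path concrete, here is a rough sketch in Python using
ctypes. It is only an analogy, not LLVM IR: c_bool stands in for i1, and the
names Payload, Result and gep are made up for illustration. The helper walks
the indices the way a GEP would, summing byte offsets as it goes.

```python
import ctypes

class Payload(ctypes.Union):
    # Stand-in for union { float, i32 }
    _fields_ = [("f", ctypes.c_float), ("i", ctypes.c_int32)]

class Result(ctypes.Structure):
    # Stand-in for { i1, union { float, i32 } }; c_bool approximates i1
    _fields_ = [("is_error", ctypes.c_bool), ("payload", Payload)]

def gep(root, indices):
    """Hypothetical helper: resolve a GEP-style index path to (byte offset, type)."""
    first, *rest = indices
    # The first index steps over whole elements of the pointed-to type.
    offset = first * ctypes.sizeof(root)
    ty = root
    # Each remaining index selects a struct field or a union member.
    for index in rest:
        name, field_ty = ty._fields_[index]
        offset += getattr(ty, name).offset
        ty = field_ty
    return offset, ty

r = Result(False, Payload(f=1.5))
offset, ty = gep(Result, [0, 1, 0])
# Read the float member through the computed address.
value = ty.from_address(ctypes.addressof(r) + offset).value
```

The struct step and the union step are handled by the same loop, which is
the point: no special case is needed for the union.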

So just as GEP allows chaining together operations on structs, pointers and
arrays, we can also chain them together with operations on unions. This can
be quite powerful, I think.
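
The tagged-union use case from points 1 and 2 can be sketched the same way.
Again this is Python with ctypes as an analogy rather than IR, and
checked_div, describe and DIVIDE_BY_ZERO are hypothetical names. The union's
size comes from its largest member without the caller computing it, and the
result is passed back by value.

```python
import ctypes

class Payload(ctypes.Union):
    # Stand-in for union { float, i32 }
    _fields_ = [("f", ctypes.c_float), ("i", ctypes.c_int32)]

class Result(ctypes.Structure):
    # Stand-in for { i1, union { float, i32 } }: 0 = float, 1 = i32
    _fields_ = [("is_error", ctypes.c_bool), ("payload", Payload)]

DIVIDE_BY_ZERO = 1  # hypothetical error code, not taken from the thread

def checked_div(a, b):
    """Return either a float quotient or an integer error code, by value."""
    if b == 0.0:
        return Result(True, Payload(i=DIVIDE_BY_ZERO))
    return Result(False, Payload(f=a / b))

def describe(result):
    """Check the discriminator first, then read the matching union member."""
    if result.is_error:
        return f"error {result.payload.i}"
    return f"value {result.payload.f}"

# The layout is derived from the member types, not computed by hand.
union_size = ctypes.sizeof(Payload)
```

The caller never takes an address or allocates a temporary; it just reads
the field selected by the discriminator.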

-- Talin