On Fri, Jan 15, 2010 at 3:13 PM, Talin <span dir="ltr"><<a href="mailto:viridia@gmail.com">viridia@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div><div></div><div class="h5">On Fri, Jan 15, 2010 at 11:02 AM, Dan Gohman <span dir="ltr"><<a href="mailto:gohman@apple.com" target="_blank">gohman@apple.com</a>></span> wrote:<br></div></div><div class="gmail_quote">
<div><div></div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><br>
On Jan 13, 2010, at 12:11 PM, Talin wrote:<br>
><br>
</div><div>> It depends on whether unions can be passed around as SSA values. I can think of situations where you would want to.<br>
<br>
</div>I'm skeptical that you *really* want to (i.e. that you wouldn't<br>
be better off just writing helper functions in your front-end<br>
which do the addressing and load/store and then moving on).<br>
But, I'm not really interested in getting in the way here.<br>
<font color="#888888"><br></font></blockquote></div></div><div>Let me give you a use case then:</div><div><br></div><div>Say I have a function which returns either a floating-point number or an error code (like divide by zero or something). The way that I would represent this return result is:</div>
<div><br></div><div> { i1, union { float, i32 } }</div><div><br></div><div>In other words, what we have is a small struct that contains a one-bit discriminator field, followed by a union of float and i32. The discriminator field tells us what type is stored in the union - 0 = float, 1 = i32, so this is a typical 'tagged' union. (We can also have untagged or "C-style" unions, as long as the programmer has some other means of knowing what type is stored in the union.)</div>
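<div><br></div><div>For readers who think in C, a rough analogue of this layout might look like the sketch below. It is only an illustration: the names DivResult, checked_div and ERR_DIVIDE_BY_ZERO are made up here, and C's bool stands in for the i1 discriminator.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error code, chosen for this sketch only. */
enum { ERR_DIVIDE_BY_ZERO = 1 };

/* C analogue of { i1, union { float, i32 } }. */
typedef struct {
    bool is_error;              /* discriminator: false = float, true = i32 */
    union {
        float   value;          /* valid when is_error is false */
        int32_t error_code;     /* valid when is_error is true  */
    } u;
} DivResult;

/* Returns either the quotient or an error code, as one small by-value struct. */
DivResult checked_div(float num, float den) {
    DivResult r;
    if (den == 0.0f) {
        r.is_error = true;
        r.u.error_code = ERR_DIVIDE_BY_ZERO;
    } else {
        r.is_error = false;
        r.u.value = num / den;
    }
    return r;
}

/* Reads the float member; only legal when the tag says a float is stored. */
float result_value(DivResult r) {
    assert(!r.is_error);
    return r.u.value;
}
```

<div>The caller checks is_error first and then reads whichever member the tag says is live, which is the same discipline the IR-level tagged union relies on.</div>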
<div><br></div><div>Using a union here (as opposed to using bitcast) solves a number of problems:</div><div><br></div><div>1) The size of the struct is automatically calculated by taking the largest field of the union. Without unions, your front-end would have to calculate the size of each possible field, as well as its alignment, and use that to figure out the maximum structure size. If your front-end is target-agnostic, you may not even know how to calculate the correct struct size.</div>
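<div><br></div><div>To make the bookkeeping in point 1 concrete, here is a small C sketch of what a front-end would have to compute by hand, next to what a real union type gives you for free. The Sample union and both function names are hypothetical, and the hand-written formula (largest member size rounded up to the strictest member alignment) is an assumption that matches common ABIs rather than something the C standard promises.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union, chosen so that size and alignment both matter. */
typedef union {
    char    bytes[5];
    int32_t number;
} Sample;

static size_t max_size(size_t a, size_t b) { return a > b ? a : b; }

/* What a front-end without union support would compute by hand:
   the largest member size, rounded up to the strictest member alignment. */
size_t manual_union_size(void) {
    size_t size  = max_size(sizeof(char[5]), sizeof(int32_t));
    size_t align = max_size(_Alignof(char[5]), _Alignof(int32_t));
    /* Sanity check: alignments are non-zero powers of two. */
    assert(align > 0 && (align & (align - 1)) == 0);
    return (size + align - 1) / align * align;
}

/* With a real union type, the compiler does the same work for us. */
size_t compiler_union_size(void) {
    return sizeof(Sample);
}
```

<div>The hand-written version needs the target's sizes and alignments, which is exactly the information a target-agnostic front-end does not have.</div>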
<div><br></div><div>2) The struct is small enough to be returned as a first-class SSA value, and with a union you can use it directly. Since bitcast does not work on aggregate values, in order to use it you would have to alloca some temporary memory to hold the function result, store the result into it, then use a combination of GEP and bitcast to get a correctly-typed pointer to the second field, and finally load the value. With a union, you can simply extract the second field without ever having to muck about with pointers and allocas.</div>
<div><br></div><div>3) The union provides an additional layer of type safety, since you can only extract types which are declared in the union, and not any arbitrary type that you could get with a bitcast. (Although I consider this a relatively minor point since type safety isn't a major concern in IR.)</div>
<div><br></div><div>4) It's possible that some future version of the optimizer could use the additional type information that the union provides and the bitcast does not. Perhaps an optimizer which knows that all of the union members are numbers and not pointers could make some additional assumptions...</div>
<div><br></div></div></blockquote><div>5) Something I forgot to mention - by allowing GEP and extractvalue to work with unions, we can handle unions nested inside structs and vice versa with a single GEP instruction. This is my main argument against having special instructions for dealing with unions.</div>
<div><br></div><div>For example, in the case of { i1, union { float, i32 } }* we can use a GEP with indices [0, 1, 0] to get access to the float field in a single GEP instruction.</div><div><br></div><div>So just as GEP allows chaining together operations on structs, pointers and arrays, we can also chain them together with operations on unions. I think this can be quite powerful.</div>
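<div><br></div><div>As a loose C analogy for the [0, 1, 0] index chain (a sketch only; Payload, DivResult and both helper names are invented for illustration), a single member-access expression reaches through the struct and the union in one step, while the second helper spells out the same address computation piece by piece:</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef union {
    float   value;
    int32_t error_code;
} Payload;

typedef struct {
    bool    is_error;
    Payload u;
} DivResult;

/* One address computation, analogous to GEP indices [0, 1, 0]:
   0 = the struct being pointed at,
   1 = its second field (the union),
   0 = the union's first member (the float). */
float *float_field(DivResult *r) {
    assert(r != NULL);
    return &r->u.value;
}

/* The same address built in separate steps from byte offsets. */
float *float_field_by_steps(DivResult *r) {
    assert(r != NULL);
    char *base      = (char *)r;
    char *union_ptr = base + offsetof(DivResult, u);
    return (float *)(union_ptr + offsetof(Payload, value));
}
```

<div>Both helpers return the same pointer; the first form is the one that corresponds to chaining the indices in a single instruction.</div>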
<div> </div></div>-- <br>-- Talin<br>