[LLVMdev] Union Type

Chris Lattner sabre at nondot.org
Fri Dec 19 17:17:02 PST 2003


On Fri, 19 Dec 2003, Reid Spencer wrote:

> As a side effect of bug 178 (Stacker not handling 64-bit pointers on
> Solaris), I got thinking about a union type for LLVM.   Is there any
> good reason that LLVM shouldn't support unions? This is essentially a
> structure that has its members all at the same address rather than at
> sequential addresses. I know there are various issues with unions
> (alignment, etc.) but wouldn't it make sense to provide a union type
> that deals with all those issues in a platform independent way?

This is intentionally not part of the LLVM type-system, because it is
redundant.  If you compile a C program that uses a union, for example, the
C front-end will turn it into a type (often a structure) that contains
only one of the element types (usually the largest one, perhaps modified
to have the correct alignment).

To access the other "parts" of the union, LLVM casts are inserted to
convert it to the appropriate type.
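
As a rough illustration of that lowering, here is a minimal C sketch.  It
assumes the source-level union is `union { int; char * }` (the one from
the example quoted below) and that the pointer is the larger member.  The
names foo_storage, foo_set_int, foo_get_int, foo_set_ptr and foo_get_ptr
are hypothetical, and memcpy stands in for the LLVM cast so that the C
stays well-defined; it is not what the front-end literally emits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lowering of `union { int i; char *p; }`: the storage type
 * holds only one member, chosen here as the pointer on the assumption that
 * it is at least as large and as strictly aligned as int. */
struct foo_storage {
    char *largest;
};

/* Catch targets where the size assumption does not hold. */
_Static_assert(sizeof(int) <= sizeof(char *), "int must fit in the storage");

/* Access the "int part" by reinterpreting the start of the storage. */
static void foo_set_int(struct foo_storage *s, int v) {
    assert(s != NULL);
    memcpy(s, &v, sizeof v);
}

static int foo_get_int(const struct foo_storage *s) {
    int v;
    assert(s != NULL);
    memcpy(&v, s, sizeof v);
    return v;
}

/* The "pointer part" is the stored member itself, so no cast is needed. */
static void foo_set_ptr(struct foo_storage *s, char *p) {
    assert(s != NULL);
    s->largest = p;
}

static char *foo_get_ptr(const struct foo_storage *s) {
    assert(s != NULL);
    return s->largest;
}
```

Only the int accessors need a reinterpretation step, which is the place
where the non-type-safe access becomes visible.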

> 3: % foo = union { int, char* };
>
> Number 3 doesn't exist in LLVM and is what I'm proposing.

A union type is not needed if you encode some simple properties of the
target (like the pointer size) into the bytecode file, which we do with
the C/C++ front-end.  The only question then is how to make _portable_
bytecode files with "unions".  I'm not really sure what the answer is
here.

I would really like to avoid adding a new union type, as it is not needed
at the LLVM level, and it seems like high-level languages can map
source-level unions onto existing LLVM operations.  In Stacker, for
example, would this really solve the problem?  You could, for example,
write a program that pushes a pointer, then pops off an int.  This would
work fine on a 32-bit target, but obviously not on a 64-bit one.

Since Stacker doesn't "protect" its end users (in a memory-safety sense),
I think that a 64-bit Stacker target should just push 64-bit pointers like
it would push 64-bit integer types: just take up two slots.  Is there
anything wrong with this approach?  Using a union for the stacker stack on
a 64-bit machine would waste a ton of space when integers are pushed.
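
The two-slot idea can be sketched in C roughly as follows.  This is not
Stacker's actual runtime: the stack layout, the slot count, the order of
the two halves, and every name here (struct stack, push32, pop32, push64,
pop64, push_ptr, pop_ptr) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 64

/* Hypothetical Stacker-like stack made of 32-bit slots. */
struct stack {
    uint32_t slot[STACK_SLOTS];
    int top; /* number of slots in use */
};

static void push32(struct stack *s, uint32_t v) {
    assert(s->top < STACK_SLOTS);
    s->slot[s->top++] = v;
}

static uint32_t pop32(struct stack *s) {
    assert(s->top > 0);
    return s->slot[--s->top];
}

/* A 64-bit value takes two slots: low half first, high half on top. */
static void push64(struct stack *s, uint64_t v) {
    push32(s, (uint32_t)(v & 0xFFFFFFFFu));
    push32(s, (uint32_t)(v >> 32));
}

static uint64_t pop64(struct stack *s) {
    uint64_t hi = pop32(s);
    uint64_t lo = pop32(s);
    return (hi << 32) | lo;
}

/* Pointers are pushed exactly like 64-bit integers.  Assumes a pointer
 * fits in 64 bits and round-trips through uintptr_t. */
static void push_ptr(struct stack *s, void *p) {
    push64(s, (uint64_t)(uintptr_t)p);
}

static void *pop_ptr(struct stack *s) {
    return (void *)(uintptr_t)pop64(s);
}
```

With this layout a 32-bit integer still costs one slot and only pointers
(and 64-bit integers) cost two, so pushing integers wastes no space.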

> While various tests for word sizes and alignment rules could be used,
> this problem is _gracefully_ handled by unions.  To rewrite the example

The problem with adding unions is that it would require modifying _all of
the LLVM code_ that looks at the type-system, and it doesn't seem like it
gives us anything fundamentally new (like a vector type would).  Also,
forcing the front-end to generate casts is an important feature of the
LLVM type-system: it makes it obvious that something non-type-safe is
happening.

> If anyone thinks that unions are bad ideas, I challenge you to create a
> computer that doesn't support an OR operation. For data structures,
> unions fill the same role: structures are AND, unions are OR. Unions
> only get dicey when they are incorrectly disambiguated .. but that's a
> source language compiler writer's problem.

The problem isn't that we can't effectively represent this; the problem is
that it's not clear what the best way to do it is.  :)

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/




