[LLVMdev] Union Type

Fri Dec 19 17:42:01 PST 2003

On Fri, 2003-12-19 at 15:32, Chris Lattner wrote:

> This is intentionally not part of the LLVM type-system, because it is
> redundant.  If you compile a C program that uses a union, for example, the
> C front-end will turn it into a type (often a structure) that contains
> only one of the element types (usually the largest one, perhaps modified
> to have the correct alignment).
> 
> To access the other "parts" of the union, LLVM casts are inserted to
> convert it to the appropriate type.
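
A minimal C sketch of the lowering described in the quoted text, assuming
a hypothetical source-level union of an int32_t and a double. The names
(LoweredUnion, lowered_set_int, and so on) are illustrative only and are
not taken from any LLVM front-end. The memcpy calls stand in for the
casts the front-end would insert:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical source type: union { int32_t i; double d; }.
 * Lowered form: a struct that holds only the largest member. */
typedef struct {
    double storage;
} LoweredUnion;

/* Accessing the int "part" reinterprets the leading bytes of the storage.
 * memcpy plays the role of the cast and avoids aliasing problems in C. */
static void lowered_set_int(LoweredUnion *u, int32_t v) {
    memcpy(&u->storage, &v, sizeof v);
}

static int32_t lowered_get_int(const LoweredUnion *u) {
    int32_t v;
    memcpy(&v, &u->storage, sizeof v);
    return v;
}

/* The largest member needs no reinterpretation. */
static void lowered_set_double(LoweredUnion *u, double v) {
    u->storage = v;
}

static double lowered_get_double(const LoweredUnion *u) {
    return u->storage;
}
```

The explicit reinterpretation in the int accessors is the point: the
non-type-safe step is visible in the code rather than hidden in the type.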

Okay, that's fair. IMO a union type would make compiler writing easier,
but I understand the minimalist approach that LLVM needs to maintain.

> A union type is not needed if you encode some simple properties of the
> target (like the pointer size) into the bytecode file, which we do with
> the C/C++ front-end.  The only question then is how to make _portable_
> bytecode files with "unions".  I'm not really sure what the answer is
> here.

Me neither :(

> I would really like to avoid adding a new union type, as it is not needed
> at the LLVM level, and it seems like high-level languages can map
> source-level unions onto existing LLVM operations.  In Stacker, for
> example, would this really solve the problem?  You could, for example,
> write a program that pushes a pointer, then pops off an int.  This would
> work fine on a 32-bit target, but obviously not on a 64-bit one.

Stack mistakes resulting from the Stacker source, as in this case, are
the problem of the Stacker programmer, not the compiler. It's possible,
but not valid, to do the operation you suggested. If you push a pointer,
you should pop it with something that expects a pointer and knows how to
use it correctly. The issue in my mind is by how much one increments the
stack index when a pointer is pushed. The answer in the current
implementation is always "1". That works fine on a 32-bit platform
because the stack array element is 32 bits (int). Both ints and pointers
fit in 32 bits, and incrementing the index by 1 moves the index by
32 bits. When you move to a machine that has 64-bit pointers and 32-bit
ints, this needs to change so that the index is incremented by 2
when a pointer is pushed. There are a number of ways to solve this; the
union type is one of them, but I understand your reasons for not wanting
it in the LLVM Assembly language.
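
A rough C sketch of the index arithmetic discussed above, assuming a
stack of 32-bit slots. Stack32, PTR_SLOTS and the s32_* functions are
hypothetical names, not the actual Stacker runtime, and bounds checking
is left out to keep it short:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK32_SLOTS 256

typedef struct {
    int32_t slots[STACK32_SLOTS];
    size_t index; /* next free slot */
} Stack32;

/* Slots one pointer occupies: 1 with 32-bit pointers, 2 with 64-bit. */
#define PTR_SLOTS ((sizeof(void *) + sizeof(int32_t) - 1) / sizeof(int32_t))

static void s32_push_int(Stack32 *s, int32_t v) {
    s->slots[s->index++] = v; /* an int always advances the index by 1 */
}

static int32_t s32_pop_int(Stack32 *s) {
    return s->slots[--s->index];
}

static void s32_push_ptr(Stack32 *s, void *p) {
    /* Copy the pointer's bytes across as many slots as it needs. */
    memcpy(&s->slots[s->index], &p, sizeof p);
    s->index += PTR_SLOTS;
}

static void *s32_pop_ptr(Stack32 *s) {
    void *p;
    s->index -= PTR_SLOTS;
    memcpy(&p, &s->slots[s->index], sizeof p);
    return p;
}
```

Popping an int after pushing a pointer still compiles and runs here, but
on a target with 64-bit pointers it returns only half of the pointer,
which is the kind of Stacker programmer error described above.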

> 
> Since stacker doesn't "protect" its end users (in a memory safety sense),
> I think that a 64-bit stacker target should just push 64-bit pointers like
> it would push 64 bit integer types: just take up two slots.  Is there
> anything wrong with this approach?

Nope. In fact, what I think I'll implement is just a 64-bit stack. That
is, the base type of the stack will be "long" instead of int. LLVM
assures me that this is 64 bits. It is large enough to hold a pointer on
all supported platforms. By doing this, I don't have to mess with the
index increment; it's still always 1. This also has the added advantage
of increasing the range of integer values and better supporting floating
point, should that become a future feature.
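
A minimal C sketch of the 64-bit stack idea, assuming int64_t as the
stand-in for LLVM's 64-bit "long" and assuming pointers are no wider
than 64 bits. Stack64 and the s64_* functions are hypothetical names,
not the actual Stacker runtime, and bounds checking is omitted:

```c
#include <stddef.h>
#include <stdint.h>

#define STACK64_SLOTS 256

/* Assumption made explicit: a pointer fits in one 64-bit slot. */
_Static_assert(sizeof(void *) <= sizeof(int64_t),
               "pointer must fit in a 64-bit stack slot");

typedef struct {
    int64_t slots[STACK64_SLOTS];
    size_t index; /* next free slot */
} Stack64;

static void s64_push_int(Stack64 *s, int64_t v) {
    s->slots[s->index++] = v; /* every push advances the index by 1 */
}

static int64_t s64_pop_int(Stack64 *s) {
    return s->slots[--s->index];
}

static void s64_push_ptr(Stack64 *s, void *p) {
    /* The cast marks the non-type-safe step; the index still moves by 1. */
    s64_push_int(s, (int64_t)(intptr_t)p);
}

static void *s64_pop_ptr(Stack64 *s) {
    return (void *)(intptr_t)s64_pop_int(s);
}
```

Because ints and pointers share one slot width, the push and pop paths
need no per-type index arithmetic, at the cost of using 64 bits for
every integer.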

> Using a union for the stacker stack on
> a 64-bit machine would waste a ton of space when integers are pushed.

Why? The union would still be 64 bits long. If we increase the integer
value size to 64 bits, it won't be a waste, it'll be a "feature" :)

> The problem with adding unions is that it would require modifying _all of
> the LLVM code_ that looks at the type-system, and it doesn't seem like it
> gives us anything fundamentally new (like a vector type would).  Also,
> forcing the front-end to generate casts is an important feature of the
> LLVM type-system: it makes it obvious that something non-type-safe is
> happening.

Okay! I'm convinced!

Reid.