[LLVMdev] Suggestion: Support union types in IR

Nick Lewycky nicholas at mxc.ca
Tue Dec 30 15:02:15 PST 2008


Talin wrote:
> I've been thinking about how to represent unions or "disjoint types" 
> in LLVM IR. At the moment, the only way I know to achieve this is to 
> create a struct that is as large as the largest type in the union 
> and then bitcast it to access the fields contained within. 
> However, that requires that the frontend know the sizes of all of the 
> various low-level types (the "size_t" problem, which has been 
> discussed before), otherwise you get problems trying to mix pointer 
> and non-pointer types.
It has to come down to a fixed size at some point, using target-specific 
knowledge. Is there any reason you couldn't use 'opaque' here and 
resolve it once you have the necessary information?
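
To illustrate why the size has to wait for target information, here is a
minimal sketch in Python using ctypes. It is only an analogy, not LLVM
code: the member list and the helper name storage_size are made up for
this example, and "the target" here is simply whatever host runs the
script.

```python
import ctypes

# Hypothetical member list for a union mixing pointer and non-pointer types.
members = {
    "i32": ctypes.c_int32,
    "float": ctypes.c_float,
    "pointer": ctypes.c_void_p,
}

def storage_size(member_types):
    """Bytes needed to hold any one of the members on the host target."""
    return max(ctypes.sizeof(t) for t in member_types)

# The fixed-width members are always 4 bytes, but the pointer size (and
# therefore the size of the whole union) depends on the target.
sizes = {name: ctypes.sizeof(t) for name, t in members.items()}
union_bytes = storage_size(members.values())
print(sizes, union_bytes)
```

On a target with 32-bit pointers all three members are the same size; on
one with 64-bit pointers the pointer dominates. A frontend that does not
know the target cannot pick the padding struct ahead of time.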

Nick

> It seems to me that adding a union type to the IR would be a logical 
> extension to the language. The syntax for declaring a union would be 
> similar to that of declaring a struct. To access a union member, you 
> would use getelementptr (GEP), just as if it were a struct. The only 
> difference is that in this case, the GEP doesn't actually modify the 
> address, it merely returns the input argument as a different type. In 
> all other ways, unions would be treated like structs, except that the 
> size of the union would always be the size of the largest member, and 
> all of the fields within the union would be located at 
> relative offset zero.
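
The layout rules described above (every member at offset zero, total size
equal to the largest member) are the same ones C unions follow, so they
can be sketched with Python's ctypes as an analogy. This is not the
proposed IR; the class and field names (IntOrFloat, Pair, as_int,
as_float) are invented for the example.

```python
import ctypes

class IntOrFloat(ctypes.Union):
    # Both members share the same storage, like the proposed {int|float}.
    _fields_ = [("as_int", ctypes.c_int32), ("as_float", ctypes.c_float)]

class Pair(ctypes.Structure):
    # The struct counterpart, {int, float}, lays members out one after another.
    _fields_ = [("as_int", ctypes.c_int32), ("as_float", ctypes.c_float)]

# Union: every field sits at offset zero; size is that of the largest member.
print(IntOrFloat.as_int.offset, IntOrFloat.as_float.offset, ctypes.sizeof(IntOrFloat))

# Struct: the second field is displaced; size is the sum of the members.
print(Pair.as_int.offset, Pair.as_float.offset, ctypes.sizeof(Pair))

# Selecting a different member reinterprets the same bytes, which is what a
# union GEP that "returns the input argument as a different type" amounts to.
u = IntOrFloat()
u.as_float = 1.0
print(hex(u.as_int))
```

The last line reads back the IEEE-754 bit pattern of 1.0 through the
integer member, showing that member selection changes only the type and
not the address.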
>
> One issue would be how to represent the union in text form. One 
> alternative is to use the same syntax for structs, except replace the 
> comma with a vertical bar character, as in this example:
>
>     {int|float}
>
> The drawback to this approach is that you can't tell whether it is a 
> struct or union until after you have parsed the first argument. If 
> that turns out to be too inconvenient for the parser, then some unique 
> two-character sequence will have to be used, such as:
>
>    {|int|float|}
>
> Another idea is to use struct syntax with a keyword prefix:
>
>    union {int, float}
>
> The specific details of syntax I will leave to others.
>
> Unions could of course be combined with other types:
>
>    {{int|float}, bool} *
>    n = getelementptr i32 0, i32 0, i32 1
>
> So in the above example, the GEP selects the union (field 0 of the 
> struct) and then member 1 of that union, returning a pointer to the 
> float field.
>
> -- 
> -- Talin



