[LLVMdev] Suggestion: Support union types in IR
Talin
viridia at gmail.com
Tue May 5 20:09:27 PDT 2009
I wanted to mention, by the way, that my need/desire for this hasn't
gone away :)
And my wish list still includes support for something like uintptr_t - a
primitive integer type that is defined to always be the same size as a
pointer, however large or small that may be on different platforms. (So
that the frontend doesn't need to know how big a pointer is and can
generate the same IR that works on both 32-bit and 64-bit platforms.)
-- Talin
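
For readers who know the idea from C: the wish above is for an IR-level analogue of C's uintptr_t. The sketch below is only an illustration in C, not LLVM IR, and the function names are hypothetical. Note that the C standard guarantees only that a void * survives a round trip through uintptr_t; an exact size match with the pointer is common on mainstream targets but is an assumption here, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a pointer to uintptr_t and back.
   Returns 1 if the recovered pointer compares equal to the original. */
int pointer_round_trips(void *p) {
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}

/* Hypothetical helper: returns 1 if uintptr_t happens to be exactly
   pointer-sized on the current target (typical, but not required by C). */
int uintptr_matches_pointer_size(void) {
    return sizeof(uintptr_t) == sizeof(void *);
}

/* Source-level code like this compiles unchanged on 32-bit and 64-bit
   targets; only the compiler needs to know the real pointer width. */
void check_pointer_sized_integer(void) {
    int value = 0;
    assert(pointer_round_trips(&value));
}
```

The point of the request is the same property one level down: the frontend would name the type, and the backend would pick its width.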
Chris Lattner wrote:
> On Dec 30, 2008, at 12:41 PM, Talin wrote:
>
>> I've been thinking about how to represent unions or "disjoint types"
>> in LLVM IR. At the moment, the only way I know to achieve this is to
>> create a struct that is as large as the largest type in the union and
>> then bitcast it to access the fields contained within.
>> However, that requires that the frontend know the sizes of all of
>> the various low-level types (the "size_t" problem, which has been
>> discussed before), otherwise you get problems trying to mix pointer
>> and non-pointer types.
>>
>
> That's an interesting point. As others have pointed out, we've
> resisted having a union type because it isn't strictly needed for the
> current set of front-ends. If a front-end is trying to generate
> target-independent IR though, I can see the utility. The "gep trick"
> won't work for type generation.
>
>
>> It seems to me that adding a union type to the IR would be a logical
>> extension to the language. The syntax for declaring a union would be
>> similar to that of declaring a struct. To access a union member, you
>> would use getelementptr (GEP), just as if it were a struct. The only
>> difference is that in this case, the GEP doesn't actually modify the
>> address, it merely returns the input argument as a different type.
>> In all other ways, unions would be treated like structs, except that
>> the size of the union would always be the size of the largest
>> member, and all of the fields within the union would be located
>> at relative offset zero.
>>
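
The layout rules described in the paragraph above (every member at offset zero, size at least that of the largest member, member access leaving the address unchanged) are the ones C unions already follow, so they can be sketched in C. This is an illustration only, not the proposed IR; the type and function names are hypothetical, and the size check uses >= because a C compiler may add padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical union mixing a non-pointer and a pointer member,
   the case where a frontend would otherwise need to know sizes. */
union IntOrPointer {
    int   as_int;
    void *as_pointer;
};

/* Size of the largest member, computed by the compiler for the target. */
static size_t largest_member_size(void) {
    return sizeof(int) > sizeof(void *) ? sizeof(int) : sizeof(void *);
}

/* Returns 1 if all members sit at offset zero and the union is at
   least as large as its largest member. */
int union_layout_matches_description(void) {
    return offsetof(union IntOrPointer, as_int) == 0
        && offsetof(union IntOrPointer, as_pointer) == 0
        && sizeof(union IntOrPointer) >= largest_member_size();
}

/* Returns 1 if taking the address of either member yields the same
   address as the union itself, only with a different type. */
int member_access_keeps_address(union IntOrPointer *u) {
    return (void *)&u->as_int == (void *)u
        && (void *)&u->as_pointer == (void *)u;
}

void check_union_layout(void) {
    union IntOrPointer u;
    assert(union_layout_matches_description());
    assert(member_access_keeps_address(&u));
}
```

Here the C compiler supplies the sizes; the proposal would let the LLVM backend play that role for a target-independent frontend.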
>
> Yes, your proposal makes sense. For syntax, I'd suggest: u{ i32, float}
>
>
>> Unions could of course be combined with other types:
>>
>> {{int|float}, bool} *
>> n = getelementptr i32 0, i32 0, i32 1
>>
>> So in the above example, the GEP returns a pointer to the float field.
>>
>
> I don't have a specific problem with adding this. The cost of doing
> so is that it adds (a small amount of) complexity to a lot of places
> that walk the type graphs. The only pass that I predict will be
> difficult to update to handle this is the BasicAA pass, which reasons
> about symbolic (not concrete) offsets and should return mustalias in
> the appropriate cases. Also, to validate this, I think llvm-gcc
> should start generating this for C unions where possible.
>
> If you're interested in implementing this and seeing all the details
> of the implementation through to the end, I don't see significant
> problems. I think adding a simple union type would make more sense
> than adding first-class support for a *discriminated* union.
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>