[LLVMdev] [PATCH] - Union types, attempt 2

Nick Lewycky nicholas at mxc.ca
Fri Jan 15 21:44:01 PST 2010

Dustin Laurence wrote:
> On 01/15/2010 11:37 AM, Talin wrote:
>>      Yes, that's closer to the frontend semantics: the variants of a
>>      union type don't have any natural ordering, so list semantics could
>>      cause problems.
> I agree.  I probably shouldn't even comment, as I know so little about
> LLVM.  But I've hand-written a couple kLOC of IR now and am starting to
> get a feel for the syntax, so I'll just say what "feels" right based on
> that and leave it to others to decide if I've absorbed enough to make
> any kind of sense.
>
> Just imagining myself using such a language extension, I really would
> not want an ordering imposed where no natural one exists.  Indices feel
> very wrong.  Isn't a union basically just a convenient alternate
> interface to the various other conversion operators like bitcast,
> inttoptr, trunc, zext, and the rest?

Almost, but you're forgetting one important attribute: you can 'alloca'
a union type and get something the size of the largest entry. This way,
you can allocate a union of {i32, i8} and i8* without knowing in your
frontend whether your system has 32-bit or 64-bit pointers. This is
important to people who want to write fully platform-neutral code in LLVM.
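
As a rough analogy in C rather than LLVM IR (the names 'pair', 'slot' and
'slot_fits_both' are made up for illustration, and this says nothing about
how the patch itself computes sizes), the same property looks like this:
the union is sized by the compiler for the target, and the source never
mentions the pointer width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IR struct {i32, i8}. */
struct pair {
    int32_t word;
    int8_t  byte;
};

/* Hypothetical stand-in for a union of {i32, i8} and i8*. */
union slot {
    struct pair p;
    char       *ptr;
};

/* Size of the larger of the two members on the current target. */
static size_t largest_member(void) {
    return sizeof(struct pair) > sizeof(char *) ? sizeof(struct pair)
                                                : sizeof(char *);
}

/* Returns 1 if storage for one 'union slot' can hold either member. */
int slot_fits_both(void) {
    return sizeof(union slot) >= largest_member();
}
```

The point is that 'sizeof(union slot)' is resolved per target, so the same
source is valid whether pointers are 32 or 64 bits wide; an 'alloca' of a
union type would give a frontend the same guarantee at the IR level.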


> (In fact that's how I manipulate
> my expressions, the three-bit tag in the low-order bits tells me how to
> treat the high-order bits.)  The "index" doesn't (generally) represent
> any kind of offset, but rather an interpretation of the bits, and none
> of the offset arithmetic implied by getelementptr or physical register
> choice implied by extractvalue will occur (except perhaps to satisfy
> alignment constraints, but that would be architecture dependent and I
> assume should therefore be invisible).  Correct?
>
> If that argument is persuasive, then the following seems a bit more
> consistent with the existing syntax:
>       ; Manipulation of a union register variable
>       %myUnion = unioncast i32 %myValue to union {i32, float}
>       %fieldValue = unioncast union {i32, float} %myUnion to i32
>       ; %fieldValue == %myValue
> This specialized union cast fits the pattern of having specialized cast
> operations between value and pointer as opposed to two values or two
> pointers.
>
> That's enough, as you could require that unions be loaded and stored as
> unions and then elements extracted.  But if you want to make it a bit
> less syntactically noisy, and also allow the same flexibility that
> getelementptr would allow in accessing a single member through a
> pointer, you could allow
>       ; Load/store of one particular union field
>       store i32 %myValue, union {i32, float}* %myUnionPtr
>       %fieldValue = load union {i32, float}* %myUnionPtr as i32
>       ; %fieldValue == %myValue
> Where I've added a preposition 'as' to the load instruction by analogy
> with what the cast operators do with 'to'.
>
> I don't know that I'd argue the point much, but offhand it "feels"
> consistent with the rest of the syntax to have a specialized 'unioncast'
> operator analogous with the other specialized conversions, but overload
> load/store as I illustrated so that pointers to unions are conceptually
> just funny kinds of pointers to their fields (which they are).  So in
> that vein, if you want a pointer to one of the alternatives in the union
> you'd just cast one pointer to another; to avoid alignment adjustments
> on what is supposed to be a no-op that cast probably shouldn't be
> bitcast.  So what about
>       %intPtr = unioncast union {i32, float}* %myUnionPtr to i32*
>       %newUnionPtr = unioncast i32* %intPtr to union {i32, float}*
>       ; %newUnionPtr == %myUnionPtr
> I'm not necessarily advocating overloading one keyword ('unioncast')
> that way, though I note that it should always be unambiguous based on
> whether the operands are values or pointers (LLVM seems to have a strong
> notion of what is and is not a pointer, so this makes some kind of
> conceptual sense to me).  Whether it's OK to create two new keywords is
> perhaps too fine a detail for me to have a good sense of.  What would
> matter to me is not imposing order on unordered interpretations.
> Dustin