[LLVMdev] [PATCH] - Union types, attempt 2
Dustin Laurence
dllaurence at dslextreme.com
Fri Jan 15 13:51:36 PST 2010
On 01/15/2010 11:37 AM, Talin wrote:
> Yes, that's closer to the frontend semantics: the variants of a
> union type don't have any natural ordering, so list semantics could
> cause problems.
I agree. I probably shouldn't even comment, as I know so little about
LLVM. But I've hand-written a couple kLOC of IR now and am starting to
get a feel for the syntax, so I'll just say what "feels" right based on
that and leave it to others to decide if I've absorbed enough to make
any kind of sense.
Just imagining myself using such a language extension, I really would
not want an ordering imposed where no natural one exists. Indices feel
very wrong. Isn't a union basically just a convenient alternate
interface to the various other conversion operators like bitcast,
inttoptr, trunc, zext, and the rest? (In fact that's how I manipulate
my expressions: the three-bit tag in the low-order bits tells me how to
treat the high-order bits.) The "index" doesn't (generally) represent
any kind of offset, but rather an interpretation of the bits, and none
of the offset arithmetic implied by getelementptr or physical register
choice implied by extractvalue will occur (except perhaps to satisfy
alignment constraints, but that would be architecture dependent and I
assume should therefore be invisible). Correct?
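
The low-order tag scheme mentioned above can be sketched in C. This is
only an illustration: the names make_word, word_tag and word_payload and
the tag values are hypothetical, since the post does not show its actual
encoding. Only the three-bit tag width comes from the text.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u))

/* Hypothetical tag values; the original post does not list its tags. */
enum tag { TAG_INT = 0, TAG_PTR = 1, TAG_FLOAT = 2, TAG_MAX = 7 };

/* Pack a payload into the high-order bits and a tag into the low three. */
uintptr_t make_word(uintptr_t payload, enum tag t)
{
    assert((uintptr_t)t <= TAG_MASK);
    /* The payload must survive the shift, i.e. its top three bits are 0. */
    assert(((payload << TAG_BITS) >> TAG_BITS) == payload);
    return (payload << TAG_BITS) | (uintptr_t)t;
}

/* The tag selects an interpretation; it is not an offset into anything. */
enum tag word_tag(uintptr_t w)
{
    return (enum tag)(w & TAG_MASK);
}

/* The high-order bits, to be interpreted according to word_tag(w). */
uintptr_t word_payload(uintptr_t w)
{
    return w >> TAG_BITS;
}
```

For example, make_word(42, TAG_PTR) yields a word whose tag reads back
as TAG_PTR and whose payload reads back as 42.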
If that argument is persuasive, then the following seems a bit more
consistent with the existing syntax:

    ; Manipulation of a union register variable
    %myUnion = unioncast i32 %myValue to union {i32, float}
    %fieldValue = unioncast union {i32, float} %myUnion to i32
    ; %fieldValue == %myValue

This specialized union cast fits the existing pattern of having
specialized cast operations between a value and a pointer (inttoptr,
ptrtoint), as opposed to casts between two values or two pointers.
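
As a rough C analogue of the value-level 'unioncast' proposed above,
the sketch below wraps and unwraps a union by value. The helper names
(from_i32, as_i32, as_float) and the union name are hypothetical, and it
assumes a platform where float is a 32-bit IEEE-754 type. It is not a
description of how LLVM would implement the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* C counterpart of the IR type union {i32, float}. */
union int_or_float {
    int32_t i;
    float   f;
};

/* Analogue of: unioncast i32 %v to union {i32, float} */
union int_or_float from_i32(int32_t v)
{
    union int_or_float u;
    u.i = v;
    return u;
}

/* Analogue of: unioncast union {i32, float} %u to i32 */
int32_t as_i32(union int_or_float u)
{
    return u.i;
}

/* Reading the other member reinterprets the same bits; no index or
   offset is involved, only a choice of interpretation. */
float as_float(union int_or_float u)
{
    assert(sizeof u.f == sizeof u.i);  /* assumption: 32-bit float */
    return u.f;
}
```

Round-tripping through the same member returns the original value, so
as_i32(from_i32(x)) == x for any x, matching the "%fieldValue ==
%myValue" comment in the IR above.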
That's enough, as you could require that unions be loaded and stored as
unions and then elements extracted. But if you want to make it a bit
less syntactically noisy, and also allow the same flexibility that
getelementptr would allow in accessing a single member through a
pointer, you could allow

    ; Load/store of one particular union field
    store i32 %myValue, union {i32, float}* %myUnionPtr
    %fieldValue = load union {i32, float}* %myUnionPtr as i32
    ; %fieldValue == %myValue

Where I've added a preposition 'as' to the load instruction by analogy
with what the cast operators do with 'to'.
I don't know that I'd argue the point much, but offhand it "feels"
consistent with the rest of the syntax to have a specialized 'unioncast'
operator analogous with the other specialized conversions, but overload
load/store as I illustrated so that pointers to unions are conceptually
just funny kinds of pointers to their fields (which they are). So in
that vein, if you want a pointer to one of the alternatives in the union
you'd just cast one pointer to another; to avoid alignment adjustments
on what is supposed to be a no-op, that cast probably shouldn't be
bitcast. So what about:

    %intPtr = unioncast union {i32, float}* %myUnionPtr to i32*
    %newUnionPtr = unioncast i32* %intPtr to union {i32, float}*
    ; %newUnionPtr == %myUnionPtr

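
The pointer-level idea has a direct counterpart in C, where a pointer
to a union, suitably converted, points to each of its members and vice
versa. The sketch below is only an analogy under that rule; the union
name 'slot' and all function names are hypothetical, and the reverse
cast is valid only when the i32 pointer really does point into such a
union.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C counterpart of the IR type union {i32, float}. */
union slot {
    int32_t i;
    float   f;
};

/* Analogue of: unioncast union {i32, float}* %p to i32* */
int32_t *union_to_i32_ptr(union slot *p)
{
    assert(p != NULL);
    return &p->i;  /* same address, different interpretation */
}

/* Analogue of: unioncast i32* %ip to union {i32, float}*
   Precondition (assumed, not checked): ip points at the i member
   of a union slot object. */
union slot *i32_to_union_ptr(int32_t *ip)
{
    return (union slot *)ip;
}

/* Analogue of: store i32 %v, union {i32, float}* %p */
void store_i32(union slot *p, int32_t v)
{
    *union_to_i32_ptr(p) = v;
}

/* Analogue of: load union {i32, float}* %p as i32 */
int32_t load_as_i32(const union slot *p)
{
    assert(p != NULL);
    return p->i;
}
```

Given a 'union slot u', the call i32_to_union_ptr(union_to_i32_ptr(&u))
returns &u again, which is the "%newUnionPtr == %myUnionPtr" property
from the IR above.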
I'm not necessarily advocating overloading one keyword ('unioncast')
that way, though I note that it should always be unambiguous based on
whether the operands are values or pointers (LLVM seems to have a strong
notion of what is and is not a pointer, so this makes some kind of
conceptual sense to me). Whether it's OK to create two new keywords is
perhaps too fine a detail for me to have a good sense of. What would
matter to me is not imposing order on unordered interpretations.
Dustin