Quick question - should unions enforce that all member types are unique? I realize that a union of { i32, i32 } doesn't make sense, but should the code actually forbid this?<div><br></div><div>As far as constants go, as long as the initializer is an exact match for one of the member types, it should be no problem.<br>
<br><div class="gmail_quote">On Fri, Jan 8, 2010 at 11:00 PM, Chris Lattner <span dir="ltr"><<a href="mailto:clattner@apple.com">clattner@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div style="word-wrap:break-word"><br><div><div class="im"><div>On Jan 6, 2010, at 12:45 PM, Talin wrote:</div><br><blockquote type="cite">This patch adds a UnionType to DerivedTypes.h.</blockquote><div><br></div></div><div>
Cool. When proposing an IR extension, it is usually best to start with a LangRef.html patch so that we can discuss the semantics of the extension. Please do write this before you get much farther. I assume that you want unions usable in the same situations as a struct. However, how do "constant unions" work? How do I initialize a global variable whose type is "union of float and i32" for example?</div>
<div class="im"><div><br></div><blockquote type="cite"> It also adds code to the bitcode reader / writer and the assembly parser for the new type, as well as a tiny .ll test file in test/Assembler. It does not contain any code related to code generation or type layout - I wanted to see if this much was acceptable before I proceeded any further.</blockquote>
<div><br></div></div>The .ll file isn't included in your patch, but I see that you chose a syntax of 'union { i32, float }' which seems very reasonable.</div><div><br></div><div><div class="im"><blockquote type="cite">
<div>Unlike my previous patch, in which the Union type was implemented as a packing option to type Struct (thereby re-using the machinery of Struct), this patch defines Union as a completely separate type from Struct.</div>
</blockquote><div><br></div></div>I think this approach makes sense. It means that things like TargetData StructLayout won't work for unions, but you don't need them since all elements are at offset 0.</div><div class="im">
<div><br><blockquote type="cite">
<div>I was a little uncertain as to how to write the tests. I'd particularly like to write tests for the bitcode reader/writer stuff, but I am not sure how to do that.</div></blockquote></div><div><br></div></div>A reasonable example are tests like test/Feature/newcasts.ll<div>
<div><br></div><div>Here are some thoughts on your patch:</div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+class UnionType : public CompositeType {</div><div><font face="Menlo" size="2"><span style="font-size:10px">...</span></font></div>
</div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ /// UnionType::get - Create an empty union type.</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ ///</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ static UnionType *get(LLVMContext &Context) {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ return get(Context, std::vector<const Type*>());</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ }</div><div><font face="Menlo" size="2"><span style="font-size:10px"><br>
</span></font></div>
<div><font face="Menlo" size="2"><span style="font-size:10px">I don't think that an empty union is going to be important enough to add a special accessor for it. Is an empty union ever a useful thing to do? If you completely disallow them from IR, it would end up simplifying some things. We don't allow empty vectors, and it seems that an empty union has exactly the same semantics as an empty struct, so having both empty structs and empty unions doesn't seem necessary.</span></font></div>
<div><font face="Menlo" size="2"><span style="font-size:10px"><br></span></font></div><div><font face="Menlo" size="2"><span style="font-size:10px"><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ static UnionType *get(LLVMContext &Context, </div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ const std::vector<const Type*> &Params);</div><div>
<br></div><div>Since this is new code, please have the constructor method take a 'const Type*const* Elements, unsigned NumElements' instead of requiring the caller to make an std::vector. This allows use of SmallVector etc. It is desirable to do this for all the other type classes in DerivedTypes.h, but we haven't gotten around to doing it yet.</div>
<div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ /// UnionType::get - This static method is a convenience method for</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ /// creating union types by specifying the elements as arguments.</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ /// Note that this method always returns a non-packed struct. To get</div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ /// an empty struct, pass NULL, NULL.</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ static UnionType *get(LLVMContext &Context, </div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ const Type *type, ...) END_WITH_NULL;</div><div><br></div><div>Please update the comments. Also, if you disallow empty unions here, you don't need to pass a context.</div>
<div><br></div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+++ include/llvm/Type.h<span style="white-space:pre"> </span>(working copy)</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
@@ -86,6 +86,7 @@</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px"> PointerTyID, ///< 12: Pointers</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
OpaqueTyID, ///< 13: Opaque: type with unknown structure</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px"> VectorTyID, ///< 14: SIMD 'packed' format, or other vector type</div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ UnionTyID, ///< 15: Unions</div><div><br></div><div>Please put this up next to Struct for simplicity, the numbering here doesn't need to be stable. The numbering in llvm-c/Core.h does need to be stable though.</div>
<div><br></div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+bool UnionType::indexValid(const Value *V) const {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ // Union indexes require 32-bit integer constants.</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ if (V->getType() == Type::getInt32Ty(V->getContext()))</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
<br></div><div>Please use V->getType()->isInteger(32) which is probably new since you started your patch.</div><div><br></div></div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+UnionType::UnionType(LLVMContext &C, const std::vector<const Type*> &Types)</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ : CompositeType(C, UnionTyID) {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ ContainedTys = reinterpret_cast<PATypeHandle*>(this + 1);</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ NumContainedTys = Types.size();</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ bool isAbstract = false;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ for (unsigned i = 0; i < Types.size(); ++i) {</div><div><br></div><div>No need to evaluate Types.size() every time through the loop.</div>
<div><br></div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+bool LLParser::ParseUnionType(PATypeHolder &Result) {</div><div>...</div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ if (!EatIfPresent(lltok::lbrace)) {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ return Error(EltTyLoc, "'{' expected after 'union'");</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ }</div><div><br></div><div>Please use:</div><div> if (ParseToken(lltok::lbrace, "'{' expected after 'union'")) return true;</div><div><br></div><div><br></div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ EltTyLoc = Lex.getLoc();</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ if (ParseTypeRec(Result)) return true;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ ParamsList.push_back(Result);</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ if (Result->isVoidTy())</div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ return Error(EltTyLoc, "union element can not have void type");</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ if (!UnionType::isValidElementType(Result))</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ return Error(EltTyLoc, "invalid element type for union");</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ while (EatIfPresent(lltok::comma)) {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ EltTyLoc = Lex.getLoc();</div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ if (ParseTypeRec(Result)) return true;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ if (Result->isVoidTy())</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ return Error(EltTyLoc, "union element can not have void type");</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ if (!UnionType::isValidElementType(Result))</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ return Error(EltTyLoc, "invalid element type for union");</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ ParamsList.push_back(Result);</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ }</div><div>
<br>
</div><div>This can be turned into a:</div><div><br></div><div> do {</div><div> ...</div><div> } while (EatIfPresent(lltok::comma));</div><div><br></div><div>loop to avoid the duplication of code.</div><div><br></div><div>
<div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+++ lib/Bitcode/Writer/BitcodeWriter.cpp<span style="white-space:pre"> </span>(working copy)</div><div>...</div><div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ case Type::UnionTyID: {</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ const UnionType *ST = cast<UnionType>(T);</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ // UNION: [eltty x N]</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ Code = bitc::TYPE_CODE_UNION;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ // Output all of the element types.</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ for (StructType::element_iterator I = ST->element_begin(),</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ E = ST->element_end(); I != E; ++I)</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ TypeVals.push_back(VE.getTypeID(*I));</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
+ AbbrevToUse = UnionAbbrev;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ break;</div><div style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">+ }</div>
<div><br></div><div>Please rename ST -> UT and use the right iterator type.</div><div><br></div><div>I didn't look closely at the C bindings. If you eliminate empty unions they should get a bit simpler.</div><div>
<br></div><div>Otherwise the patch looks like a fine start. Lets please get the LangRef spec ironed out, then you can start committing subsystems to support this. My biggest concern about this extension is updating all the places in the optimizer to know about it. To get adequate testing coverage on this, we should probably switch llvm-gcc or clang to start using the union type in at least some common case, which will allow us to get coverage on it through the optimizer.</div>
<div><br></div><div>Thanks for working on this Talin!</div><div><br></div><div>-Chris</div><div><br></div></div></div><div><br></div></div><div><br></div><div><br></div><div><br></div></div></div><div><br></div><div><br>
</div>
<div><br></div><div><br></div></div><div><br></div><div><br></div></div></div></span></font></div></div></div></div></blockquote></div><br><br clear="all"><br>-- <br>-- Talin<br>
</div>