[LLVMdev] [PATCH] - Union types, attempt 2

Chris Lattner clattner at apple.com
Fri Jan 8 23:00:25 PST 2010

On Jan 6, 2010, at 12:45 PM, Talin wrote:

> This patch adds a UnionType to DerivedTypes.h.

Cool.  When proposing an IR extension, it is usually best to start with a LangRef.html patch so that we can discuss the semantics of the extension.  Please do write this before you get much farther.  I assume that you want unions usable in the same situations as a struct.  However, how do "constant unions" work?  How do I initialize a global variable whose type is "union of float and i32" for example?

> It also adds code to the bitcode reader / writer and the assembly parser for the new type, as well as a tiny .ll test file in test/Assembler. It does not contain any code related to code generation or type layout - I wanted to see if this much was acceptable before I proceeded any further.

The .ll file isn't included in your patch, but I see that you chose a syntax of 'union { i32, float }' which seems very reasonable.

> Unlike my previous patch, in which the Union type was implemented as a packing option to type Struct (thereby re-using the machinery of Struct), this patch defines Union as a completely separate type from Struct.

I think this approach makes sense.  It means that things like TargetData StructLayout won't work for unions, but you don't need them since all elements are at offset 0.

> I was a little uncertain as to how to write the tests. I'd particularly like to write tests for the bitcode reader/writer stuff, but I am not sure how to do that.

A reasonable example are tests like test/Feature/newcasts.ll

Here are some thoughts on your patch:

+class UnionType : public CompositeType {
+  /// UnionType::get - Create an empty union type.
+  ///
+  static UnionType *get(LLVMContext &Context) {
+    return get(Context, std::vector<const Type*>());
+  }

I don't think that an empty union is going to be important enough to add a special accessor for it.  Is an empty union ever a useful thing to do?  If you completely disallow them from IR, it would end up simplifying some things.  We don't allow empty vectors, and it seems that an empty union has exactly the same semantics as an empty struct, so having both empty structs and empty unions doesn't seem necessary.

+  static UnionType *get(LLVMContext &Context, 
+                        const std::vector<const Type*> &Params);

Since this is new code, please have the constructor method take a 'const Type*const* Elements, unsigned NumElements' instead of requiring the caller to make an std::vector.  This allows use of SmallVector etc.  It is desirable to do this for all the other type classes in DerivedTypes.h, but we haven't gotten around to doing it yet.

+  /// UnionType::get - This static method is a convenience method for
+  /// creating union types by specifying the elements as arguments.
+  /// Note that this method always returns a non-packed struct.  To get
+  /// an empty struct, pass NULL, NULL.
+  static UnionType *get(LLVMContext &Context, 
+                        const Type *type, ...) END_WITH_NULL;

Please update the comments. Also, if you disallow empty unions here, you don't need to pass a context.

+++ include/llvm/Type.h	(working copy)
@@ -86,6 +86,7 @@
     PointerTyID,     ///< 12: Pointers
     OpaqueTyID,      ///< 13: Opaque: type with unknown structure
     VectorTyID,      ///< 14: SIMD 'packed' format, or other vector type
+    UnionTyID,       ///< 15: Unions

Please put this up next to Struct for simplicity, the numbering here doesn't need to be stable.  The numbering in llvm-c/Core.h does need to be stable though.

+bool UnionType::indexValid(const Value *V) const {
+  // Union indexes require 32-bit integer constants.
+  if (V->getType() == Type::getInt32Ty(V->getContext()))

Please use V->getType()->isInteger(32) which is probably new since you started your patch.

+UnionType::UnionType(LLVMContext &C, const std::vector<const Type*> &Types)
+  : CompositeType(C, UnionTyID) {
+  ContainedTys = reinterpret_cast<PATypeHandle*>(this + 1);
+  NumContainedTys = Types.size();
+  bool isAbstract = false;
+  for (unsigned i = 0; i < Types.size(); ++i) {

No need to evaluate Types.size() every time through the loop.

+bool LLParser::ParseUnionType(PATypeHolder &Result) {
+  if (!EatIfPresent(lltok::lbrace)) {
+    return Error(EltTyLoc, "'{' expected after 'union'");
+  }

Please use:
  if (ParseToken(lltok::lbrace, "'{' expected after 'union'")) return true;

+  EltTyLoc = Lex.getLoc();
+  if (ParseTypeRec(Result)) return true;
+  ParamsList.push_back(Result);
+  if (Result->isVoidTy())
+    return Error(EltTyLoc, "union element can not have void type");
+  if (!UnionType::isValidElementType(Result))
+    return Error(EltTyLoc, "invalid element type for union");
+  while (EatIfPresent(lltok::comma)) {
+    EltTyLoc = Lex.getLoc();
+    if (ParseTypeRec(Result)) return true;
+    if (Result->isVoidTy())
+      return Error(EltTyLoc, "union element can not have void type");
+    if (!UnionType::isValidElementType(Result))
+      return Error(EltTyLoc, "invalid element type for union");
+    ParamsList.push_back(Result);
+  }

This can be turned into a:

 do {
 } while (EatIfPresent(lltok::comma));

loop to avoid the duplication of code.

+++ lib/Bitcode/Writer/BitcodeWriter.cpp	(working copy)
+    case Type::UnionTyID: {
+      const UnionType *ST = cast<UnionType>(T);
+      // UNION: [eltty x N]
+      Code = bitc::TYPE_CODE_UNION;
+      // Output all of the element types.
+      for (StructType::element_iterator I = ST->element_begin(),
+           E = ST->element_end(); I != E; ++I)
+        TypeVals.push_back(VE.getTypeID(*I));
+      AbbrevToUse = UnionAbbrev;
+      break;
+    }

Please rename ST -> UT and use the right iterator type.

I didn't look closely at the C bindings.  If you eliminate empty unions they should get a bit simpler.

Otherwise the patch looks like a fine start.  Lets please get the LangRef spec ironed out, then you can start committing subsystems to support this.  My biggest concern about this extension is updating all the places in the optimizer to know about it.  To get adequate testing coverage on this, we should probably switch llvm-gcc or clang to start using the union type in at least some common case, which will allow us to get coverage on it through the optimizer.

Thanks for working on this Talin!


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100108/0abf33b1/attachment.html>

More information about the llvm-dev mailing list