[PATCH] IR: Add COMDATs to the IR

Rafael Ávila de Espíndola rafael.espindola at gmail.com
Wed Jun 11 07:48:01 PDT 2014

This patch doesn't update lib/Linker. This means that LTO can miscompiple this, no? In particular, I think it will be wrong given something like

$foo = comdat any
@a = weak global i32, 0, comdat $foo
@b = private global i32 0, comdat $foo

in two files. It will keep only one @a, but will duplicate @b create a comdat with an extra element.

Updating lib/Linker should be fairly simple for 'any'. It should just avoid linking any a symbol if it is in a comdat that is already present in the destination. It is not clear what it should do for other selection kind or mismatched selection kinds (error?)

Comment at: docs/LangRef.rst:664
@@ -659,1 +663,3 @@
+Additionally, the function can placed in a comdat if the target has the
+necessary support.
The repeated "if the target has the necessary supports" is a bit redundant. Just talk about the target differences in the comdat section.

Comment at: docs/LangRef.rst:685
@@ -678,3 +684,3 @@
How do alias fit in this design?

We should probably not allow alias to cross comdats, so how about adding something like:

Aliases are placed on the same comdat as the aliassee expression computes to.

Somewhat related: The llvm codegen will currently create comdats on its own. For example, a simple weak global gets output on a comdat. Do you intend to change that? In any case, it is probably a good thing to document.

Comment at: docs/LangRef.rst:758
@@ +757,3 @@
+Note that not all targets support COMDATs and some may only support ``any`` as a
+selection kind.
We only have 3 file formats. It is probably OK to say that ELF only supports 'any' :-)

Comment at: docs/LangRef.rst:771
@@ -717,1 +770,3 @@
+   }
 .. _namedmetadatastructure:
An important difference between COFF and ELF is that in COFF only one section can be in a COMDAT. In ELF, multiple sections can be in the same comdat.

I noticed that you left associative out of the above list. Is the intention to write code that look like ELF (multiple sections in on COMDAT) but at codegen we pick one to actually be in that comdat and make others associative with it?

It is probably a good idea to mention this, since the documentation gives the impression that adding COMDAT to a global makes sure that it will be in that COMDAT.

What are we doing about linkage? For COMDATs to work, we have to make sure their contents are not modified. For example, if a COMDAT has two linkonce_odr symbols, dropping only one can cause a llink failure.

One way would be to say that isDiscardableIfUnused symbols are not allowed, but we do want to allow private. Going this way would require spiting the "can be dropped" notion out of linkage. For example, linkonce would become "weak dropable" and a plain private would have to be kept. We do want to allow entire COMDATs to be dropped, so the comdat itself would probably need the dropable bit too.

A simpler but potentially brittle option is to just say that
* A symbol in a comdat cannot be dropped individually
* A comdat can be dropped if every symbol in it can be dropped.

And audit pretty much every call to isDiscardableIfUnused.

Comment at: include/llvm/IR/Comdat.h:32
@@ +31,3 @@
+class Comdat : public Value, public ilist_node<Comdat> {
Why do you need it to be a Value? This says that a Comdat has a type, which seems wrong.

Comment at: include/llvm/IR/Comdat.h:46
@@ +45,3 @@
+  /// \brief Get the module that holds this section.
+  Module *getParent() { return Parent; }

Comment at: include/llvm/IR/Comdat.h:51
@@ +50,3 @@
+  /// Methods for support type inquiry through isa, cast, and dyn_cast:
+  static inline bool classof(const Value *V) {
+    return V->getValueID() == Value::ComdatVal;
What is this needed for?

Comment at: include/llvm/IR/Module.h:599
@@ +598,3 @@
+  comdat_iterator       comdat_begin()          { return ComdatList.begin(); }
+  const_comdat_iterator comdat_begin() const    { return ComdatList.begin(); }
+  comdat_iterator       comdat_end  ()          { return ComdatList.end();   }
Is this need? I looks a bit strange say that comdat is "part of a module". They are really a composite property of the globals.

Comment at: lib/AsmParser/LLParser.cpp:539
@@ +538,3 @@
+  case lltok::kw_exactmatch:   SK = Comdat::ExactMatch;   break;
+  case lltok::kw_largest:      SK = Comdat::Largest;      break;
+  case lltok::kw_newest:       SK = Comdat::Newest;       break;
Does clang-format agrees with having the break in the same line?

Comment at: lib/AsmParser/LLParser.cpp:1161
@@ +1160,3 @@
+/// GetComdatVal - Get a Comdat with the specified name, creating a forward
+/// reference record if needed.
Don't repeat the name in the comment. Functions start with a lowecase letter.

Comment at: lib/IR/AsmWriter.cpp:1315
@@ -1310,1 +1314,3 @@
+  // Output all comdats.
+  if (!M->comdat_empty()) Out << '\n';
This keeps unused comdats. It is probably better to keep track of which comdats are needed and output only those. This is similar to how an unused type is dropped.

Comment at: lib/IR/Globals.cpp:115
@@ +114,3 @@
+  if (isa<GlobalVariable>(this) || isa<GlobalAlias>(this)) {
+    // Comdats on declarations are semantically meaningless, ignore them.
+    if (isDeclaration())
Why not reject instead? This probably indicates a bug in the calling code, no?


More information about the llvm-commits mailing list