[lld] r246878 - COFF: Implement a better algorithm for ICF.

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 4 14:35:55 PDT 2015


Author: ruiu
Date: Fri Sep  4 16:35:54 2015
New Revision: 246878

URL: http://llvm.org/viewvc/llvm-project?rev=246878&view=rev
Log:
COFF: Implement a better algorithm for ICF.

Identical COMDAT Folding (ICF) is a feature that merges COMDAT
sections by content. Two sections are considered the same if their
contents, relocations, attributes, etc. are all the same.

Interestingly, the MSVC linker takes an "iterations" parameter for ICF
because the algorithm it uses is iterative. Merging two sections can
make more sections mergeable, because relocations that used to point
to different sections can now point to the same one. ICF is repeated
until it converges (until no more sections can be merged). This
algorithm is not fast; it usually needs three iterations to converge.
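
To make the repeat-until-convergence idea concrete, here is a minimal
sketch on a toy model. ToySection, resolve and foldIteratively are
hypothetical names invented for this illustration; they are not lld's
SectionChunk or the code removed by this patch. A section is reduced
to its contents plus the indices of the sections its relocations
point to, and whole passes are repeated until one merges nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of a section: raw contents plus the indices of
// the sections its relocations point to. This is not lld's SectionChunk.
struct ToySection {
  std::string Contents;
  std::vector<int> Relocs;
};

// Follow the chain of merges to the section that survived.
static int resolve(const std::vector<int> &Rep, int I) {
  while (Rep[I] != I)
    I = Rep[I];
  return I;
}

// Repeat whole passes until a pass merges nothing. Returns the number of
// passes; Rep[I] names the section that section I was folded into.
int foldIteratively(const std::vector<ToySection> &Secs,
                    std::vector<int> &Rep) {
  Rep.resize(Secs.size());
  for (size_t I = 0; I < Secs.size(); ++I)
    Rep[I] = static_cast<int>(I);

  int Passes = 0;
  bool Redo;
  do {
    Redo = false;
    ++Passes;
    // Key: contents plus the current survivors of all relocation targets.
    std::map<std::pair<std::string, std::vector<int>>, int> Seen;
    for (size_t I = 0; I < Secs.size(); ++I) {
      if (Rep[I] != static_cast<int>(I))
        continue; // already folded into another section
      std::vector<int> Targets;
      for (int T : Secs[I].Relocs) {
        assert(T >= 0 && static_cast<size_t>(T) < Secs.size());
        Targets.push_back(resolve(Rep, T));
      }
      auto P = Seen.emplace(std::make_pair(Secs[I].Contents, Targets),
                            static_cast<int>(I));
      if (P.second)
        continue; // first section seen with this key
      Rep[I] = P.first->second;
      Redo = true; // a merge may have made other sections mergeable
    }
  } while (Redo);
  return Passes;
}
```

With two "call" sections that point at two identical "ret" sections,
and the callers listed first, this sketch needs three passes: the
first folds the "ret" sections, the second folds the callers, and the
third confirms that nothing is left to merge.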

In the new algorithm implemented in this patch, we treat sections
and relocations as a directed acyclic graph, and we try to merge
sections whose outdegree is zero. Sections with outdegree zero are then
removed from the graph, which leaves other sections with outdegree
zero. We repeat this until all sections are processed. This
algorithm does not iterate over the same sections many times.
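
The outdegree-driven order can be sketched on the same kind of toy
model. ToyNode and foldByOutdegree are hypothetical names for this
illustration, and the worklist is an assumption of the sketch rather
than a transcription of the patch, which partitions a vector instead.
Here every processed section releases its referrers, so each section
is visited at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model: contents plus indices of relocation targets.
struct ToyNode {
  std::string Contents;
  std::vector<int> Relocs;
};

// Fold sections in rounds, starting from those with outdegree zero.
// Returns Rep, where Rep[I] is the section that I was folded into.
// Sections that never reach outdegree zero are left unfolded.
std::vector<int> foldByOutdegree(const std::vector<ToyNode> &Secs) {
  size_t N = Secs.size();
  std::vector<int> Rep(N), Outdegree(N, 0);
  std::vector<std::vector<int>> Ins(N); // reverse edges

  for (size_t I = 0; I < N; ++I) {
    Rep[I] = static_cast<int>(I);
    for (int T : Secs[I].Relocs) {
      assert(T >= 0 && static_cast<size_t>(T) < N);
      ++Outdegree[I];
      Ins[T].push_back(static_cast<int>(I));
    }
  }

  std::vector<int> Ready;
  for (size_t I = 0; I < N; ++I)
    if (Outdegree[I] == 0)
      Ready.push_back(static_cast<int>(I));

  while (!Ready.empty()) {
    // One round: all targets of these sections are already final.
    std::map<std::pair<std::string, std::vector<int>>, int> Seen;
    std::vector<int> Next;
    for (int I : Ready) {
      std::vector<int> Targets;
      for (int T : Secs[I].Relocs)
        Targets.push_back(Rep[T]);
      auto P = Seen.emplace(std::make_pair(Secs[I].Contents, Targets), I);
      if (!P.second)
        Rep[I] = P.first->second; // fold into the earlier section
      // Remove I from the graph: its referrers lose one outgoing edge.
      for (int In : Ins[I])
        if (--Outdegree[In] == 0)
          Next.push_back(In);
    }
    Ready = std::move(Next);
  }
  return Rep;
}
```

For two "call" sections that point at two identical "ret" sections,
the first round folds the "ret" sections and the second folds the
callers, with no pass that revisits already-processed sections.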

There's an apparent issue in the algorithm -- the section graph is
not guaranteed to be acyclic. In fact, it is cyclic pretty often,
so this algorithm cannot eliminate all possible duplicates.
That's OK for now because the previous algorithm could not
fold sections on cycles either. I'll address the issue in a
follow-up patch.

Modified:
    lld/trunk/COFF/Chunks.h
    lld/trunk/COFF/ICF.cpp

Modified: lld/trunk/COFF/Chunks.h
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Chunks.h?rev=246878&r1=246877&r2=246878&view=diff
==============================================================================
--- lld/trunk/COFF/Chunks.h (original)
+++ lld/trunk/COFF/Chunks.h Fri Sep  4 16:35:54 2015
@@ -182,6 +182,8 @@ public:
   // with other chunk by ICF, it points to another chunk,
   // and this chunk is considrered as dead.
   SectionChunk *Ptr;
+  int Outdegree = 0;
+  std::vector<SectionChunk *> Ins;
 
   // The CRC of the contents as described in the COFF spec 4.5.5.
   // Auxiliary Format 5: Section Definitions. Used for ICF.

Modified: lld/trunk/COFF/ICF.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=246878&r1=246877&r2=246878&view=diff
==============================================================================
--- lld/trunk/COFF/ICF.cpp (original)
+++ lld/trunk/COFF/ICF.cpp Fri Sep  4 16:35:54 2015
@@ -90,31 +90,53 @@ bool SectionChunk::equals(const SectionC
   return std::equal(Relocs.begin(), Relocs.end(), X->Relocs.begin(), Eq);
 }
 
+static void link(SectionChunk *From, SectionChunk *To) {
+  ++From->Outdegree;
+  To->Ins.push_back(From);
+}
+
 // Merge identical COMDAT sections.
-// Two sections are considered as identical when their section headers,
+// Two sections are considered the same if their section headers,
 // contents and relocations are all the same.
 void doICF(const std::vector<Chunk *> &Chunks) {
-  std::unordered_set<SectionChunk *, Hasher, Equals> Set;
-  bool Redo;
-  do {
-    Set.clear();
-    Redo = false;
-    for (Chunk *C : Chunks) {
-      auto *SC = dyn_cast<SectionChunk>(C);
-      if (!SC || !SC->isCOMDAT() || !SC->isLive())
-        continue;
+  std::vector<SectionChunk *> SChunks;
+  for (Chunk *C : Chunks)
+    if (auto *SC = dyn_cast<SectionChunk>(C))
+      if (SC->isCOMDAT() && SC->isLive())
+        SChunks.push_back(SC);
+
+  // Initialize SectionChunks' outdegrees and in-chunk lists.
+  for (SectionChunk *SC : SChunks) {
+    for (SectionChunk *C : SC->children())
+      link(SC, C);
+    for (SymbolBody *B : SC->symbols())
+      if (auto *D = dyn_cast<DefinedRegular>(B))
+        link(SC, D->getChunk());
+  }
+
+  // By merging two sections, more sections can become mergeable
+  // because two originally different relocations can now point to
+  // the same section. We process sections whose outdegree is zero
+  // first to deal with that.
+  for (;;) {
+    std::unordered_set<SectionChunk *, Hasher, Equals> Set;
+    auto Pred = [](SectionChunk *SC) { return SC->Outdegree > 0; };
+    auto Bound = std::partition(SChunks.begin(), SChunks.end(), Pred);
+    if (Bound == SChunks.end())
+      return;
+    for (auto It = Bound, E = SChunks.end(); It != E; ++It) {
+      SectionChunk *SC = *It;
       auto P = Set.insert(SC);
       bool Inserted = P.second;
       if (Inserted)
         continue;
       SectionChunk *Existing = *P.first;
       SC->replaceWith(Existing);
-      // By merging sections, two relocations that originally pointed to
-      // different locations can now point to the same location.
-      // So, repeat the process until a convegence is obtained.
-      Redo = true;
+      for (SectionChunk *In : SC->Ins)
+        --In->Outdegree;
     }
-  } while (Redo);
+    SChunks.erase(Bound, SChunks.end());
+  }
 }
 
 } // namespace coff



