[lld] r288684 - Use "equivalence class" instead of "color" to describe the concept in ICF.

Mon Dec 5 10:11:36 PST 2016

Author: ruiu
Date: Mon Dec  5 12:11:35 2016
New Revision: 288684

URL: http://llvm.org/viewvc/llvm-project?rev=288684&view=rev
Log:
Use "equivalence class" instead of "color" to describe the concept in ICF.

Also add a citation to GNU gold safe ICF paper.

Differential Revision: https://reviews.llvm.org/D27398

Modified:
    lld/trunk/ELF/ICF.cpp
    lld/trunk/ELF/InputSection.h

Modified: lld/trunk/ELF/ICF.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ICF.cpp?rev=288684&r1=288683&r2=288684&view=diff
==============================================================================

--- lld/trunk/ELF/ICF.cpp (original)
+++ lld/trunk/ELF/ICF.cpp Mon Dec  5 12:11:35 2016
@@ -7,7 +7,7 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// ICF is short for Identical Code Folding. That is a size optimization to
+// ICF is short for Identical Code Folding. This is a size optimization to
 // identify and merge two or more read-only sections (typically functions)
 // that happened to have the same contents. It usually reduces output size
 // by a few percent.
@@ -26,43 +26,50 @@
 //   void bar() { foo(); }
 //
 // If you merge the two, their relocations point to the same section and
-// thus you know they are mergeable, but how do we know they are mergeable
-// in the first place? This is not an easy problem to solve.
+// thus you know they are mergeable, but how do you know they are
+// mergeable in the first place? This is not an easy problem to solve.
 //
-// What we are doing in LLD is some sort of coloring algorithm.
+// What we are doing in LLD is to partition sections into equivalence
+// classes. Sections in the same equivalence class when the algorithm
+// terminates are considered identical. Here are details:
+//
+// 1. First, we partition sections using their hash values as keys. Hash
+//    values contain section types, section contents and numbers of
+//    relocations. During this step, relocation targets are not taken into
+//    account. We just put sections that apparently differ into different
+//    equivalence classes.
+//
+// 2. Next, for each equivalence class, we visit sections to compare
+//    relocation targets. Relocation targets are considered equivalent if
+//    their targets are in the same equivalence class. Sections with
+//    different relocation targets are put into different equivalence
+//    clases.
+//
+// 3. If we split an equivalence class in step 2, two relocations
+//    previously target the same equivalence class may now target
+//    different equivalence classes. Therefore, we repeat step 2 until a
+//    convergence is obtained.
 //
-// We color non-identical sections in different colors repeatedly.
-// Sections in the same color when the algorithm terminates are considered
-// identical. Here are the details:
-//
-// 1. First, we color all sections using their hash values of section
-//    types, section contents, and numbers of relocations. At this moment,
-//    relocation targets are not taken into account. We just color
-//    sections that apparently differ in different colors.
-//
-// 2. Next, for each color C, we visit sections in color C to compare
-//    relocation target colors.  We recolor sections A and B in different
-//    colors if A's and B's relocations are different in terms of target
-//    colors.
-//
-// 3. If we recolor some section in step 2, relocations that were
-//    previously pointing to the same color targets may now be pointing to
-//    different colors. Therefore, repeat 2 until a convergence is
-//    obtained.
-//
-// 4. For each color C, pick an arbitrary section in color C, and merges
-//    other sections in color C with it.
+// 4. For each equivalence class C, pick an arbitrary section in C, and
+//    merge all the other sections in C with it.
 //
 // For small programs, this algorithm needs 3-5 iterations. For large
 // programs such as Chromium, it takes more than 20 iterations.
 //
+// This algorithm was mentioned as an "optimistic algorithm" in [1],
+// though gold implements a different algorithm than this.
+//
 // We parallelize each step so that multiple threads can work on different
-// colors concurrently. That gave us a large performance boost when
-// applying ICF on large programs. For example, MSVC link.exe or GNU gold
-// takes 10-20 seconds to apply ICF on Chromium, whose output size is
-// about 1.5 GB, but LLD can finish it in less than 2 seconds on a 2.8 GHz
-// 40 core machine. Even without threading, LLD's ICF is still faster than
-// MSVC or gold though.
+// equivalence classes concurrently. That gave us a large performance
+// boost when applying ICF on large programs. For example, MSVC link.exe
+// or GNU gold takes 10-20 seconds to apply ICF on Chromium, whose output
+// size is about 1.5 GB, but LLD can finish it in less than 2 seconds on a
+// 2.8 GHz 40 core machine. Even without threading, LLD's ICF is still
+// faster than MSVC or gold though.
+//
+// [1] Safe ICF: Pointer Safe and Unwinding aware Identical Code Folding
+// in the Gold Linker
+// http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36912.pdf
 //
 //===----------------------------------------------------------------------===//
 
@@ -103,10 +110,10 @@ private:
 
   size_t findBoundary(size_t Begin, size_t End);
 
-  void forEachColorRange(size_t Begin, size_t End,
+  void forEachClassRange(size_t Begin, size_t End,
                          std::function<void(size_t, size_t)> Fn);
 
-  void forEachColor(std::function<void(size_t, size_t)> Fn);
+  void forEachClass(std::function<void(size_t, size_t)> Fn);
 
   std::vector<InputSection<ELFT> *> Sections;
 
@@ -116,26 +123,28 @@ private:
   // The main loop counter.
   int Cnt = 0;
 
-  // We have two locations for colors. On the first iteration of the main
-  // loop, Color[0] has a valid value, and Color[1] contains garbage. We
-  // read colors from slot 0 and write to slot 1. So, Color[0] represents
-  // the current color, and Color[1] represents the next color. On each
-  // iteration, they switch the roles, so we use them alternately.
+  // We have two locations for equivalence classes. On the first iteration
+  // of the main loop, Class[0] has a valid value, and Class[1] contains
+  // garbage. We read equivalence classes from slot 0 and write to slot 1.
+  // So, Class[0] represents the current class, and Class[1] represents
+  // the next class. On each iteration, we switch their roles and use them
+  // alternately.
   //
   // Why are we doing this? Recall that other threads may be working on
-  // other colors in parallel. They may read colors that we are updating.
-  // We cannot update colors in place because it breaks the invariance
-  // that all possibly-identical sections must have the same color at any
-  // moment. In other words, the for loop to update colors is not an
-  // atomic operation, and that is observable from other threads. By
-  // writing new colors to write-only places, we can keep the invariance.
+  // other equivalence classes in parallel. They may read sections that we
+  // are updating. We cannot update equivalence classes in place because
+  // it breaks the invariance that all possibly-identical sections must be
+  // in the same equivalence class at any moment. In other words, the for
+  // loop to update equivalence classes is not atomic, and that is
+  // observable from other threads. By writing new classes to other
+  // places, we can keep the invariance.
   //
-  // Below, `Current` has the index of the current color, and `Next` has
-  // the index of the next color. If threading is enabled, they are
-  // either (0, 1) or (1, 0).
+  // Below, `Current` has the index of the current class, and `Next` has
+  // the index of the next class. If threading is enabled, they are either
+  // (0, 1) or (1, 0).
   //
   // Note on single-thread: if that's the case, they are always (0, 0)
-  // because we can safely read next colors without worrying about race
+  // because we can safely read the next class without worrying about race
   // conditions. Using the same location makes this algorithm converge
   // faster because it uses results of the same iteration earlier.
   int Current = 0;
@@ -158,8 +167,7 @@ template <class ELFT> static bool isElig
          S->Name != ".init" && S->Name != ".fini";
 }
 
-// Split a range into smaller ranges by recoloring sections
-// in a given range.
+// Split an equivalence class into smaller classes.
 template <class ELFT>
 void ICF<ELFT>::segregate(size_t Begin, size_t End, bool Constant) {
   // This loop rearranges sections in [Begin, End) so that all sections
@@ -183,10 +191,10 @@ void ICF<ELFT>::segregate(size_t Begin,
     size_t Mid = Bound - Sections.begin();
 
     // Now we split [Begin, End) into [Begin, Mid) and [Mid, End) by
-    // updating the sections in [Begin, End). We use Mid as a color ID
-    // because every group ends with a unique index.
+    // updating the sections in [Begin, End). We use Mid as an equivalence
+    // class ID because every group ends with a unique index.
     for (size_t I = Begin; I < Mid; ++I)
-      Sections[I]->Color[Next] = Mid;
+      Sections[I]->Class[Next] = Mid;
 
     // If we created a group, we need to iterate the main loop again.
     if (Mid != End)
@@ -237,7 +245,7 @@ bool ICF<ELFT>::variableEq(const InputSe
     if (&SA == &SB)
       return true;
 
-    // Or, the two sections must have the same color.
+    // Or, the two sections must be in the same equivalence class.
     auto *DA = dyn_cast<DefinedRegular<ELFT>>(&SA);
     auto *DB = dyn_cast<DefinedRegular<ELFT>>(&SB);
     if (!DA || !DB)
@@ -250,12 +258,12 @@ bool ICF<ELFT>::variableEq(const InputSe
     if (!X || !Y)
       return false;
 
-    // Ineligible sections have the special color 0.
-    // They can never be the same in terms of section colors.
-    if (X->Color[Current] == 0)
+    // Ineligible sections are in the special equivalence class 0.
+    // They can never be the same in terms of the equivalence class.
+    if (X->Class[Current] == 0)
       return false;
 
-    return X->Color[Current] == Y->Color[Current];
+    return X->Class[Current] == Y->Class[Current];
   };
 
   return std::equal(RelsA.begin(), RelsA.end(), RelsB.begin(), Eq);
@@ -271,21 +279,22 @@ bool ICF<ELFT>::equalsVariable(const Inp
 }
 
 template <class ELFT> size_t ICF<ELFT>::findBoundary(size_t Begin, size_t End) {
+  uint32_t Class = Sections[Begin]->Class[Current];
   for (size_t I = Begin + 1; I < End; ++I)
-    if (Sections[Begin]->Color[Current] != Sections[I]->Color[Current])
+    if (Class != Sections[I]->Class[Current])
       return I;
   return End;
 }
 
-// Sections in the same color are contiguous in Sections vector.
-// Therefore, Sections vector can be considered as contiguous groups
-// of sections, grouped by colors.
+// Sections in the same equivalence class are contiguous in Sections
+// vector. Therefore, Sections vector can be considered as contiguous
+// groups of sections, grouped by the class.
 //
 // This function calls Fn on every group that starts within [Begin, End).
 // Note that a group must starts in that range but doesn't necessarily
 // have to end before End.
 template <class ELFT>
-void ICF<ELFT>::forEachColorRange(size_t Begin, size_t End,
+void ICF<ELFT>::forEachClassRange(size_t Begin, size_t End,
                                   std::function<void(size_t, size_t)> Fn) {
   if (Begin > 0)
     Begin = findBoundary(Begin - 1, End);
@@ -297,13 +306,13 @@ void ICF<ELFT>::forEachColorRange(size_t
   }
 }
 
-// Call Fn on each color group.
+// Call Fn on each equivalence class.
 template <class ELFT>
-void ICF<ELFT>::forEachColor(std::function<void(size_t, size_t)> Fn) {
+void ICF<ELFT>::forEachClass(std::function<void(size_t, size_t)> Fn) {
   // If threading is disabled or the number of sections are
   // too small to use threading, call Fn sequentially.
   if (!Config->Threads || Sections.size() < 1024) {
-    forEachColorRange(0, Sections.size(), Fn);
+    forEachClassRange(0, Sections.size(), Fn);
     ++Cnt;
     return;
   }
@@ -315,8 +324,8 @@ void ICF<ELFT>::forEachColor(std::functi
   size_t NumShards = 256;
   size_t Step = Sections.size() / NumShards;
   forLoop(0, NumShards,
-          [&](size_t I) { forEachColorRange(I * Step, (I + 1) * Step, Fn); });
-  forEachColorRange(Step * NumShards, Sections.size(), Fn);
+          [&](size_t I) { forEachClassRange(I * Step, (I + 1) * Step, Fn); });
+  forEachClassRange(Step * NumShards, Sections.size(), Fn);
   ++Cnt;
 }
 
@@ -328,34 +337,32 @@ template <class ELFT> void ICF<ELFT>::ru
       if (isEligible(S))
         Sections.push_back(S);
 
-  // Initially, we use hash values to color sections. Therefore, if
-  // two sections have the same color, they are likely (but not
-  // guaranteed) to have the same static contents in terms of ICF.
+  // Initially, we use hash values to partition sections.
   for (InputSection<ELFT> *S : Sections)
-    // Set MSB to 1 to avoid collisions with non-hash colors.
-    S->Color[0] = getHash(S) | (1 << 31);
+    // Set MSB to 1 to avoid collisions with non-hash IDs.
+    S->Class[0] = getHash(S) | (1 << 31);
 
-  // From now on, sections in Sections are ordered so that sections in
-  // the same color are consecutive in the vector.
+  // From now on, sections in Sections vector are ordered so that sections
+  // in the same equivalence class are consecutive in the vector.
   std::stable_sort(Sections.begin(), Sections.end(),
                    [](InputSection<ELFT> *A, InputSection<ELFT> *B) {
-                     return A->Color[0] < B->Color[0];
+                     return A->Class[0] < B->Class[0];
                    });
 
   // Compare static contents and assign unique IDs for each static content.
-  forEachColor([&](size_t Begin, size_t End) { segregate(Begin, End, true); });
+  forEachClass([&](size_t Begin, size_t End) { segregate(Begin, End, true); });
 
   // Split groups by comparing relocations until convergence is obtained.
   do {
     Repeat = false;
-    forEachColor(
+    forEachClass(
         [&](size_t Begin, size_t End) { segregate(Begin, End, false); });
   } while (Repeat);
 
   log("ICF needed " + Twine(Cnt) + " iterations");
 
-  // Merge sections in the same colors.
-  forEachColor([&](size_t Begin, size_t End) {
+  // Merge sections by the equivalence class.
+  forEachClass([&](size_t Begin, size_t End) {
     if (End - Begin == 1)
       return;
 

Modified: lld/trunk/ELF/InputSection.h
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputSection.h?rev=288684&r1=288683&r2=288684&view=diff
==============================================================================
--- lld/trunk/ELF/InputSection.h (original)
+++ lld/trunk/ELF/InputSection.h Mon Dec  5 12:11:35 2016
@@ -289,7 +289,7 @@ public:
   void relocateNonAlloc(uint8_t *Buf, llvm::ArrayRef<RelTy> Rels);
 
   // Used by ICF.
-  uint32_t Color[2] = {0, 0};
+  uint32_t Class[2] = {0, 0};
 
   // Called by ICF to merge two input sections.
   void replace(InputSection<ELFT> *Other);