<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 1, 2016 at 11:45 AM, Rui Ueyama via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Author: ruiu<br>
Date: Thu Dec 1 13:45:22 2016<br>
New Revision: 288409<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=288409&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=288409&view=rev</a><br>
Log:<br>
Updates file comments and variable names.<br>
<br>
Use "color" instead of "group id" to describe the ICF algorithm.<br></blockquote><div><br></div><div>The right term is "congruence class"; I think you should use it. This ICF algorithm is basically a simple "optimistic" GVN/CSE algorithm: all values are initially assumed to be in the same congruence class, and then that class is iteratively split as contradictions are found, until there are no contradictions left.</div><div><br></div><div>For similar algorithms, look at llvm/lib/Transforms/Scalar/EarlyCSE.cpp, llvm/lib/Transforms/Scalar/GVN.cpp, and <a href="https://reviews.llvm.org/D26224">https://reviews.llvm.org/D26224</a> (NewGVN). I haven't looked super closely at them, but I doubt either is fully optimistic like ICF, and they are much more complex because they have to deal with issues like control flow; ICF has no analogous issue. So this ICF algorithm is actually one of the simplest possible GVN/CSE algorithms.</div><div><br></div><div>(For example, look at all the `equals` methods in <a href="https://reviews.llvm.org/D26224">https://reviews.llvm.org/D26224</a>; the core loop is in NewGVN::runGVN.)</div><div><br></div><div>-- Sean Silva</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Modified:<br>
lld/trunk/ELF/ICF.cpp<br>
lld/trunk/ELF/InputSection.h<br>
<br>
Modified: lld/trunk/ELF/ICF.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ICF.cpp?rev=288409&r1=288408&r2=288409&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/lld/trunk/ELF/ICF.cpp?<wbr>rev=288409&r1=288408&r2=<wbr>288409&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- lld/trunk/ELF/ICF.cpp (original)<br>
+++ lld/trunk/ELF/ICF.cpp Thu Dec 1 13:45:22 2016<br>
@@ -7,51 +7,62 @@<br>
//<br>
//===-------------------------<wbr>------------------------------<wbr>---------------===//<br>
//<br>
-// Identical Code Folding is a feature to merge sections not by name (which<br>
-// is regular comdat handling) but by contents. If two non-writable sections<br>
-// have the same data, relocations, attributes, etc., then the two<br>
-// are considered identical and merged by the linker. This optimization<br>
-// makes outputs smaller.<br>
-//<br>
-// ICF is theoretically a problem of reducing graphs by merging as many<br>
-// identical subgraphs as possible if we consider sections as vertices and<br>
-// relocations as edges. It may sound simple, but it is a bit more<br>
-// complicated than you might think. The order of processing sections<br>
-// matters because merging two sections can make other sections, whose<br>
-// relocations now point to the same section, mergeable. Graphs may contain<br>
-// cycles. We need a sophisticated algorithm to do this properly and<br>
-// efficiently.<br>
-//<br>
-// What we do in this file is this. We split sections into groups. Sections<br>
-// in the same group are considered identical.<br>
-//<br>
-// We begin by optimistically putting all sections into a single equivalence<br>
-// class. Then we apply a series of checks that split this initial<br>
-// equivalence class into more and more refined equivalence classes based on<br>
-// the properties by which a section can be distinguished.<br>
-//<br>
-// We begin by checking that the section contents and flags are the<br>
-// same. This only needs to be done once since these properties don't depend<br>
-// on the current equivalence class assignment.<br>
-//<br>
-// Then we split the equivalence classes based on checking that their<br>
-// relocations are the same, where relocation targets are compared by their<br>
-// equivalence class, not the concrete section. This may need to be done<br>
-// multiple times because as the equivalence classes are refined, two<br>
-// sections that had a relocation target in the same equivalence class may<br>
-// now target different equivalence classes, and hence these two sections<br>
-// must be put in different equivalence classes (whereas in the previous<br>
-// iteration they were not since the relocation target was the same.)<br>
-//<br>
-// Our algorithm is smart enough to merge the following mutually-recursive<br>
-// functions.<br>
+// ICF is short for Identical Code Folding. That is a size optimization to<br>
+// identify and merge two or more read-only sections (typically functions)<br>
+// that happened to have the same contents. It usually reduces output size<br>
+// by a few percent.<br>
+//<br>
+// In ICF, two sections are considered identical if they have the same<br>
+// section flags, section data, and relocations. Relocations are tricky,<br>
+// because two relocations are considered the same if they have the same<br>
+// relocation types, values, and if they point to the same sections *in<br>
+// terms of ICF*.<br>
+//<br>
+// Here is an example. If foo and bar defined below are compiled to the<br>
+// same machine instructions, ICF can and should merge the two, although<br>
+// their relocations point to each other.<br>
//<br>
// void foo() { bar(); }<br>
// void bar() { foo(); }<br>
//<br>
-// This algorithm is so-called "optimistic" algorithm described in<br>
-// <a href="http://research.google.com/pubs/pub36912.html" rel="noreferrer" target="_blank">http://research.google.com/<wbr>pubs/pub36912.html</a>. (Note that what GNU<br>
-// gold implemented is different from the optimistic algorithm.)<br>
+// If you merge the two, their relocations point to the same section and<br>
+// thus you know they are mergeable, but how do we know they are mergeable<br>
+// in the first place? This is not an easy problem to solve.<br>
+//<br>
+// What we are doing in LLD is some sort of coloring algorithm.<br>
+//<br>
+// We color non-identical sections in different colors repeatedly.<br>
+// Sections in the same color when the algorithm terminates are considered<br>
+// identical. Here are the details:<br>
+//<br>
+// 1. First, we color all sections using their hash values of section<br>
+// types, section contents, and numbers of relocations. At this moment,<br>
+// relocation targets are not taken into account. We just color<br>
+// sections that apparently differ in different colors.<br>
+//<br>
+// 2. Next, for each color C, we visit sections in color C to compare<br>
+// relocation target colors. We recolor sections A and B in different<br>
+// colors if A's and B's relocations are different in terms of target<br>
+// colors.<br>
+//<br>
+// 3. If we recolor some section in step 2, relocations that were<br>
+// previously pointing to the same color targets may now be pointing to<br>
+// different colors. Therefore, repeat 2 until a convergence is<br>
+// obtained.<br>
+//<br>
+// 4. For each color C, pick an arbitrary section in color C, and merge<br>
+// other sections in color C with it.<br>
+//<br>
+// For small programs, this algorithm needs 3-5 iterations. For large<br>
+// programs such as Chromium, it takes more than 20 iterations.<br>
+//<br>
+// We parallelize each step so that multiple threads can work on different<br>
+// colors concurrently. That gave us a large performance boost when<br>
+// applying ICF on large programs. For example, MSVC link.exe or GNU gold<br>
+// takes 10-20 seconds to apply ICF on Chromium, whose output size is<br>
+// about 1.5 GB, but LLD can finish it in less than 2 seconds on a 2.8 GHz<br>
+// 40 core machine. Even without threading, LLD's ICF is still faster than<br>
+// MSVC or gold though.<br>
//<br>
//===-------------------------<wbr>------------------------------<wbr>---------------===//<br>
<br>
@@ -119,8 +130,7 @@ template <class ELFT> static bool isElig<br>
S->Name != ".init" && S->Name != ".fini";<br>
}<br>
<br>
-// Before calling this function, all sections in range R must have the<br>
-// same group ID.<br>
+// Split R into smaller ranges by recoloring its members.<br>
template <class ELFT> void ICF<ELFT>::segregate(Range *R, bool Constant) {<br>
// This loop rearranges sections in range R so that all sections<br>
// that are equal in terms of equals{Constant,Variable} are contiguous<br>
@@ -158,24 +168,23 @@ template <class ELFT> void ICF<ELFT>::se<br>
}<br>
R->End = Mid;<br>
<br>
- // Update GroupIds for the new group members.<br>
+ // Update the new group member colors.<br>
//<br>
- // Note on GroupId[0] and GroupId[1]: we have two storages for<br>
- // group IDs. At the beginning of each iteration of the main loop,<br>
- // both have the same ID. GroupId[0] contains the current ID, and<br>
- // GroupId[1] contains the next ID which will be used in the next<br>
- // iteration.<br>
+ // Note on Color[0] and Color[1]: we have two storages for colors.<br>
+ // At the beginning of each iteration of the main loop, both have<br>
+ // the same color. Color[0] contains the current color, and Color[1]<br>
+ // contains the next color which will be used in the next iteration.<br>
//<br>
// Recall that other threads may be working on other ranges. They<br>
- // may be reading group IDs that we are about to update. We cannot<br>
- // update group IDs in place because it breaks the invariance that<br>
- // all sections in the same group must have the same ID. In other<br>
- // words, the following for loop is not an atomic operation, and<br>
- // that is observable from other threads.<br>
+ // may be reading colors that we are about to update. We cannot<br>
+ // update colors in place because it breaks the invariance that<br>
+ // all sections in the same group must have the same color. In<br>
+ // other words, the following for loop is not an atomic operation,<br>
+ // and that is observable from other threads.<br>
//<br>
- // By writing new IDs to write-only places, we can keep the invariance.<br>
+ // By writing new colors to write-only places, we can keep the invariance.<br>
for (size_t I = Mid; I < End; ++I)<br>
- Sections[I]->GroupId[(Cnt + 1) % 2] = Id;<br>
+ Sections[I]->Color[(Cnt + 1) % 2] = Id;<br>
<br>
R = NewRange;<br>
}<br>
@@ -216,13 +225,13 @@ template <class RelTy><br>
bool ICF<ELFT>::variableEq(const InputSection<ELFT> *A, ArrayRef<RelTy> RelsA,<br>
const InputSection<ELFT> *B, ArrayRef<RelTy> RelsB) {<br>
auto Eq = [&](const RelTy &RA, const RelTy &RB) {<br>
+ // The two sections must be identical.<br>
SymbolBody &SA = A->getFile()-><wbr>getRelocTargetSym(RA);<br>
SymbolBody &SB = B->getFile()-><wbr>getRelocTargetSym(RB);<br>
if (&SA == &SB)<br>
return true;<br>
<br>
- // Or, the symbols should be pointing to the same section<br>
- // in terms of the group ID.<br>
+ // Or, the two sections must have the same color.<br>
auto *DA = dyn_cast<DefinedRegular<ELFT>><wbr>(&SA);<br>
auto *DB = dyn_cast<DefinedRegular<ELFT>><wbr>(&SB);<br>
if (!DA || !DB)<br>
@@ -234,16 +243,16 @@ bool ICF<ELFT>::variableEq(const InputSe<br>
auto *Y = dyn_cast<InputSection<ELFT>>(<wbr>DB->Section);<br>
if (!X || !Y)<br>
return false;<br>
- if (X->GroupId[Cnt % 2] == 0)<br>
+ if (X->Color[Cnt % 2] == 0)<br>
return false;<br>
<br>
// Performance hack for single-thread. If no other threads are<br>
- // running, we can safely read next GroupIDs as there is no race<br>
+ // running, we can safely read next colors as there is no race<br>
// condition. This optimization may reduce the number of<br>
// iterations of the main loop because we can see results of the<br>
// same iteration.<br>
size_t Idx = (Config->Threads ? Cnt : Cnt + 1) % 2;<br>
- return X->GroupId[Idx] == Y->GroupId[Idx];<br>
+ return X->Color[Idx] == Y->Color[Idx];<br>
};<br>
<br>
return std::equal(RelsA.begin(), RelsA.end(), RelsB.begin(), Eq);<br>
@@ -274,45 +283,45 @@ template <class ELFT> void ICF<ELFT>::ru<br>
if (isEligible(S))<br>
Sections.push_back(S);<br>
<br>
- // Initially, we use hash values as section group IDs. Therefore,<br>
- // if two sections have the same ID, they are likely (but not<br>
+ // Initially, we use hash values to color sections. Therefore, if<br>
+ // two sections have the same color, they are likely (but not<br>
// guaranteed) to have the same static contents in terms of ICF.<br>
for (InputSection<ELFT> *S : Sections)<br>
- // Set MSB to 1 to avoid collisions with non-hash IDs.<br>
- S->GroupId[0] = S->GroupId[1] = getHash(S) | (1 << 31);<br>
+ // Set MSB to 1 to avoid collisions with non-hash colors.<br>
+ S->Color[0] = S->Color[1] = getHash(S) | (1 << 31);<br>
<br>
// From now on, sections in Sections are ordered so that sections in<br>
- // the same group are consecutive in the vector.<br>
+ // the same color are consecutive in the vector.<br>
std::stable_sort(Sections.<wbr>begin(), Sections.end(),<br>
[](InputSection<ELFT> *A, InputSection<ELFT> *B) {<br>
- if (A->GroupId[0] != B->GroupId[0])<br>
- return A->GroupId[0] < B->GroupId[0];<br>
+ if (A->Color[0] != B->Color[0])<br>
+ return A->Color[0] < B->Color[0];<br>
// Within a group, put the highest alignment<br>
// requirement first, so that's the one we'll keep.<br>
return B->Alignment < A->Alignment;<br>
});<br>
<br>
- // Split sections into groups by ID. And then we are going to<br>
- // split groups into more and more smaller groups.<br>
- // Note that we do not add single element groups because they<br>
- // are already the smallest.<br>
+ // Create ranges in which each range contains sections in the same<br>
+ // color. And then we are going to split ranges into more and more<br>
+ // smaller ranges. Note that we do not add single element ranges<br>
+ // because they are already the smallest.<br>
Ranges.reserve(Sections.size()<wbr>);<br>
for (size_t I = 0, E = Sections.size(); I < E - 1;) {<br>
// Let J be the first index whose element has a different ID.<br>
size_t J = I + 1;<br>
- while (J < E && Sections[I]->GroupId[0] == Sections[J]->GroupId[0])<br>
+ while (J < E && Sections[I]->Color[0] == Sections[J]->Color[0])<br>
++J;<br>
if (J - I > 1)<br>
Ranges.push_back({I, J});<br>
I = J;<br>
}<br>
<br>
- // This function copies new GroupIds from former write-only space to<br>
- // former read-only space, so that we can flip GroupId[0] and GroupId[1].<br>
- // Note that new GroupIds are always be added to end of Ranges.<br>
+ // This function copies colors from former write-only space to former<br>
+ // read-only space, so that we can flip Color[0] and Color[1]. Note<br>
+ // that new colors are always added to the end of Ranges.<br>
auto Copy = [&](Range &R) {<br>
for (size_t I = R.Begin; I < R.End; ++I)<br>
- Sections[I]->GroupId[Cnt % 2] = Sections[I]->GroupId[(Cnt + 1) % 2];<br>
+ Sections[I]->Color[Cnt % 2] = Sections[I]->Color[(Cnt + 1) % 2];<br>
};<br>
<br>
// Compare static contents and assign unique IDs for each static content.<br>
@@ -321,7 +330,7 @@ template <class ELFT> void ICF<ELFT>::ru<br>
foreach(End, Ranges.end(), Copy);<br>
++Cnt;<br>
<br>
- // Split groups by comparing relocations until convergence is obtained.<br>
+ // Split ranges by comparing relocations until convergence is obtained.<br>
for (;;) {<br>
auto End = Ranges.end();<br>
foreach(Ranges.begin(), End, [&](Range &R) { segregate(&R, false); });<br>
@@ -334,7 +343,7 @@ template <class ELFT> void ICF<ELFT>::ru<br>
<br>
log("ICF needed " + Twine(Cnt) + " iterations");<br>
<br>
- // Merge sections in the same group.<br>
+ // Merge sections in the same color.<br>
for (Range R : Ranges) {<br>
if (R.End - R.Begin == 1)<br>
continue;<br>
<br>
Modified: lld/trunk/ELF/InputSection.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputSection.h?rev=288409&r1=288408&r2=288409&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project/lld/trunk/ELF/<wbr>InputSection.h?rev=288409&r1=<wbr>288408&r2=288409&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- lld/trunk/ELF/InputSection.h (original)<br>
+++ lld/trunk/ELF/InputSection.h Thu Dec 1 13:45:22 2016<br>
@@ -289,7 +289,7 @@ public:<br>
void relocateNonAlloc(uint8_t *Buf, llvm::ArrayRef<RelTy> Rels);<br>
<br>
// Used by ICF.<br>
- uint32_t GroupId[2] = {0, 0};<br>
+ uint32_t Color[2] = {0, 0};<br>
<br>
// Called by ICF to merge two input sections.<br>
void replace(InputSection<ELFT> *Other);<br>
<br>
<br>
</blockquote></div><br></div></div>
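<div>To make the "optimistic" congruence-class idea from the reply concrete, here is a minimal sketch under stated assumptions. It is not lld's implementation: ToySection and computeClasses are hypothetical names, sections are modeled as a string of contents plus relocation targets given as indices, and classes are recomputed with a std::map rather than by splitting sorted ranges in parallel. It starts with every section in one class and keeps splitting until the number of classes stops growing.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical toy model of a section (not lld's InputSection):
// static contents plus relocation targets as indices into the list.
struct ToySection {
  std::string Data;
  std::vector<std::size_t> Targets;
};

// Returns one congruence-class id per section. Two sections end up with
// the same id only if no contradiction between them was ever found.
std::vector<std::size_t> computeClasses(const std::vector<ToySection> &Secs) {
  // Optimistic start: assume every section is in the same class.
  std::vector<std::size_t> Class(Secs.size(), 0);
  std::size_t NumClasses = Secs.empty() ? 0 : 1;

  for (;;) {
    // A section's key is its current class, its static contents, and
    // the current classes of its relocation targets.
    using Key = std::tuple<std::size_t, std::string, std::vector<std::size_t>>;
    std::map<Key, std::size_t> Ids;
    // New ids go to a separate buffer so that every key in this round
    // is built from the previous round's classes only.
    std::vector<std::size_t> Next(Secs.size());

    for (std::size_t I = 0; I < Secs.size(); ++I) {
      std::vector<std::size_t> TargetClasses;
      for (std::size_t T : Secs[I].Targets) {
        assert(T < Secs.size() && "relocation target out of range");
        TargetClasses.push_back(Class[T]);
      }
      Key K{Class[I], Secs[I].Data, std::move(TargetClasses)};
      // Reuse the id of an existing key, or hand out the next dense id.
      Next[I] = Ids.emplace(std::move(K), Ids.size()).first->second;
    }

    // The old class is part of the key, so classes can only be split,
    // never merged. No new class in this round means convergence.
    bool Converged = Ids.size() == NumClasses;
    NumClasses = Ids.size();
    Class = std::move(Next);
    if (Converged)
      return Class;
  }
}
```

<div>With the mutually recursive foo and bar from the file comment modeled as two sections with equal contents that point at each other, both keep the same class in every round, so they are reported as mergeable. A section with the same contents that targets a section with different contents is split off in a later round. Reading the old classes while writing the new ones into a separate vector plays the same role as the Color[0] and Color[1] pair in the patch.</div>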