[lld] r247387 - COFF: Teach ICF to merge cyclic graphs.
Sean Silva via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 11 19:58:39 PDT 2015
That's unfortunate.
On Fri, Sep 11, 2015 at 7:37 PM, Rui Ueyama <ruiu at google.com> wrote:
> I spent some time today re-implementing the feature using scc_iterator,
> hoping that it would simplify the code, but it looks like scc_iterator does
> not work well for this ICF.
>
> Defining GraphTraits for SectionChunks needed almost 100 lines of code
> (eventually it would need even more, but I don't know how much because I gave
> up before finishing). Also, the notion of an EntryNode that exists in
> GraphTraits does not exist for SectionChunks -- we just have an array of
> SectionChunks, which contains more than one graph, and those graphs have no
> notion of entry or exit nodes. So I think we should keep this code as-is.
>
> On Thu, Sep 10, 2015 at 10:14 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> Maybe not. scc_iterator seems to fit here nicely. I'll update the code.
>>
>> On Thu, Sep 10, 2015 at 10:00 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> Is there a reason you're not using scc_iterator?
>>>
>>> -- Sean Silva
>>>
>>> On Thu, Sep 10, 2015 at 9:29 PM, Rui Ueyama via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>>> Author: ruiu
>>>> Date: Thu Sep 10 23:29:03 2015
>>>> New Revision: 247387
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=247387&view=rev
>>>> Log:
>>>> COFF: Teach ICF to merge cyclic graphs.
>>>>
>>>> Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate
>>>> because, in COFF, cyclic graphs are not exceptional at all. They are
>>>> pretty common.
>>>>
>>>> In this patch, sections are grouped by Tarjan's strongly connected
>>>> component algorithm to get acyclic graphs. Then we try to merge
>>>> SCCs whose outdegree is zero, and remove them from the graph. This
>>>> makes other SCCs have outdegree zero, so we can repeat the
>>>> process until all SCCs are removed. When comparing two SCCs, we handle
>>>> cycles properly.
>>>>
>>>> This algorithm works better than the previous one. Previously, self-linking
>>>> produced a 29.0MB executable. It now produces a 27.7MB executable. There's
>>>> still some gap compared to the MSVC linker, which produces a 27.1MB
>>>> executable for the same input. So the gap has narrowed, but LLD is still
>>>> not on par with MSVC. I'll investigate that later.
>>>>
>>>> Modified:
>>>> lld/trunk/COFF/Chunks.h
>>>> lld/trunk/COFF/ICF.cpp
>>>> lld/trunk/test/COFF/icf-circular.test
>>>>
>>>> Modified: lld/trunk/COFF/Chunks.h
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Chunks.h?rev=247387&r1=247386&r2=247387&view=diff
>>>>
>>>> ==============================================================================
>>>> --- lld/trunk/COFF/Chunks.h (original)
>>>> +++ lld/trunk/COFF/Chunks.h Thu Sep 10 23:29:03 2015
>>>> @@ -111,6 +111,18 @@ protected:
>>>> uint32_t Align = 1;
>>>> };
>>>>
>>>> +class SectionChunk;
>>>> +
>>>> +// A container of SectionChunks. Used by ICF to store computation
>>>> +// results of strongly connected components. You can ignore this
>>>> +// unless you are interested in ICF.
>>>> +struct Component {
>>>> + Component(std::vector<SectionChunk *> V) : Members(V) {}
>>>> + std::vector<SectionChunk *> Members;
>>>> + std::vector<Component *> Predecessors;
>>>> + int Outdegree = 0;
>>>> +};
>>>> +
>>>> // A chunk corresponding a section of an input file.
>>>> class SectionChunk : public Chunk {
>>>> public:
>>>> @@ -182,12 +194,15 @@ public:
>>>> // with other chunk by ICF, it points to another chunk,
>>>> // and this chunk is considrered as dead.
>>>> SectionChunk *Ptr;
>>>> - int Outdegree = 0;
>>>> - std::vector<SectionChunk *> Ins;
>>>> + uint32_t Index = 0;
>>>> + uint32_t LowLink = 0;
>>>> + bool OnStack = false;
>>>> + Component *SCC = nullptr;
>>>>
>>>> // The CRC of the contents as described in the COFF spec 4.5.5.
>>>> // Auxiliary Format 5: Section Definitions. Used for ICF.
>>>> uint32_t Checksum = 0;
>>>> + mutable uint64_t Hash = 0;
>>>>
>>>> private:
>>>> ArrayRef<uint8_t> getContents() const;
>>>>
>>>> Modified: lld/trunk/COFF/ICF.cpp
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=247387&r1=247386&r2=247387&view=diff
>>>>
>>>> ==============================================================================
>>>> --- lld/trunk/COFF/ICF.cpp (original)
>>>> +++ lld/trunk/COFF/ICF.cpp Thu Sep 10 23:29:03 2015
>>>> @@ -7,7 +7,31 @@
>>>> //
>>>>
>>>> //===----------------------------------------------------------------------===//
>>>> //
>>>> -// Implements ICF (Identical COMDAT Folding)
>>>> +// Identical COMDAT Folding is a feature to merge COMDAT sections not by
>>>> +// name (which is regular COMDAT handling) but by contents. If two COMDAT
>>>> +// sections have the same data, relocations, attributes, etc., then the two
>>>> +// are considered identical and merged by the linker. This optimization
>>>> +// makes outputs smaller.
>>>> +//
>>>> +// ICF is theoretically a problem of reducing graphs by merging as many
>>>> +// isomorphic subgraphs as possible, if we consider sections as vertices and
>>>> +// relocations as edges. This may be a bit more complicated problem than you
>>>> +// might think. The order of processing sections matters since merging two
>>>> +// sections can make other sections, whose relocations now point to the
>>>> +// section, mergeable. Graphs may contain cycles, which is common in COFF.
>>>> +// We need a sophisticated algorithm to do this properly and efficiently.
>>>> +//
>>>> +// What we do in this file is this. We first compute strongly connected
>>>> +// components of the graphs to get acyclic graphs. Then, we remove SCCs whose
>>>> +// outdegree is zero from the graphs and try to merge them. This operation
>>>> +// makes other SCCs to have outdegree zero, so we repeat the process until
>>>> +// all SCCs are removed.
>>>> +//
>>>> +// This algorithm is different from what GNU gold does which is described in
>>>> +// http://research.google.com/pubs/pub36912.html. I don't know which is
>>>> +// faster, this or Gold's, in practice. It'd be interesting to implement the
>>>> +// other algorithm to compare. Note that the gold's algorithm cannot handle
>>>> +// cycles, so we need to tweak it, though.
>>>> //
>>>>
>>>> //===----------------------------------------------------------------------===//
>>>>
>>>> @@ -15,6 +39,10 @@
>>>> #include "Symbols.h"
>>>> #include "llvm/ADT/Hashing.h"
>>>> #include "llvm/ADT/STLExtras.h"
>>>> +#include "llvm/Support/Debug.h"
>>>> +#include "llvm/Support/raw_ostream.h"
>>>> +#include <algorithm>
>>>> +#include <functional>
>>>> #include <tuple>
>>>> #include <unordered_set>
>>>> #include <vector>
>>>> @@ -37,15 +65,156 @@ struct Equals {
>>>>
>>>> } // anonymous namespace
>>>>
>>>> +// Invoke Fn for each live COMDAT successor sections of SC.
>>>> +static void forEach(SectionChunk *SC, std::function<void(SectionChunk *)> Fn) {
>>>> + for (SectionChunk *C : SC->children())
>>>> + Fn(C);
>>>> + for (SymbolBody *B : SC->symbols()) {
>>>> + if (auto *D = dyn_cast<DefinedRegular>(B)) {
>>>> + SectionChunk *C = D->getChunk();
>>>> + if (C->isCOMDAT() && C->isLive())
>>>> + Fn(C);
>>>> + }
>>>> + }
>>>> +}
>>>> +
>>>> +typedef std::vector<Component *>::iterator ComponentIterator;
>>>> +
>>>> +// Try to merge two SCCs, A and B. A and B are likely to be isomorphic
>>>> +// because all sections have the same hash values.
>>>> +static void tryMerge(std::vector<SectionChunk *> &A,
>>>> + std::vector<SectionChunk *> &B) {
>>>> + // Assume that relocation targets are the same.
>>>> + size_t End = A.size();
>>>> + for (size_t I = 0; I != End; ++I) {
>>>> + assert(B[I] == B[I]->Ptr);
>>>> + B[I]->Ptr = A[I];
>>>> + }
>>>> + for (size_t I = 0; I != End; ++I) {
>>>> + if (A[I]->equals(B[I]))
>>>> + continue;
>>>> + // If we reach here, the assumption was wrong. Reset the pointers
>>>> + // to the original values and terminate the comparison.
>>>> + for (size_t I = 0; I != End; ++I)
>>>> + B[I]->Ptr = B[I];
>>>> + return;
>>>> + }
>>>> + // If we reach here, the assumption was correct. Actually replace them.
>>>> + for (size_t I = 0; I != End; ++I)
>>>> + B[I]->replaceWith(A[I]);
>>>> +}
>>>> +
>>>> +// Try to merge components. All components given to this function are
>>>> +// guaranteed to have the same number of members.
>>>> +static void doUniquefy(ComponentIterator Begin, ComponentIterator End) {
>>>> + // Sort component members by hash value.
>>>> + for (auto It = Begin; It != End; ++It) {
>>>> + Component *SCC = *It;
>>>> + auto Comp = [](SectionChunk *A, SectionChunk *B) {
>>>> + return A->getHash() < B->getHash();
>>>> + };
>>>> + std::sort(SCC->Members.begin(), SCC->Members.end(), Comp);
>>>> + }
>>>> +
>>>> + // Merge as much component members as possible.
>>>> + for (auto It = Begin; It != End;) {
>>>> + Component *SCC = *It;
>>>> + auto Bound = std::partition(It + 1, End, [&](Component *C) {
>>>> + for (size_t I = 0, E = SCC->Members.size(); I != E; ++I)
>>>> + if (SCC->Members[I]->getHash() != C->Members[I]->getHash())
>>>> + return false;
>>>> + return true;
>>>> + });
>>>> +
>>>> + // Components [I, Bound) are likely to have the same members
>>>> + // because all members have the same hash values. Verify that.
>>>> + for (auto I = It + 1; I != Bound; ++I)
>>>> + tryMerge(SCC->Members, (*I)->Members);
>>>> + It = Bound;
>>>> + }
>>>> +}
>>>> +
>>>> +static void uniquefy(ComponentIterator Begin, ComponentIterator End) {
>>>> + for (auto It = Begin; It != End;) {
>>>> + Component *SCC = *It;
>>>> + size_t Size = SCC->Members.size();
>>>> + auto Bound = std::partition(It + 1, End, [&](Component *C) {
>>>> + return C->Members.size() == Size;
>>>> + });
>>>> + doUniquefy(It, Bound);
>>>> + It = Bound;
>>>> + }
>>>> +}
>>>> +
>>>> +// Returns strongly connected components of the graph formed by Chunks.
>>>> +// Chunks (a list of Live COMDAT sections) are considred as vertices,
>>>> +// and their relocations or association are considered as edges.
>>>> +static std::vector<Component *>
>>>> +getSCC(const std::vector<SectionChunk *> &Chunks) {
>>>> + std::vector<Component *> Ret;
>>>> + std::vector<SectionChunk *> V;
>>>> + uint32_t Idx;
>>>> +
>>>> + std::function<void(SectionChunk *)> StrongConnect = [&](SectionChunk *SC) {
>>>> + SC->Index = SC->LowLink = Idx++;
>>>> + size_t Curr = V.size();
>>>> + V.push_back(SC);
>>>> + SC->OnStack = true;
>>>> +
>>>> + forEach(SC, [&](SectionChunk *C) {
>>>> + if (C->Index == 0) {
>>>> + StrongConnect(C);
>>>> + SC->LowLink = std::min(SC->LowLink, C->LowLink);
>>>> + } else if (C->OnStack) {
>>>> + SC->LowLink = std::min(SC->LowLink, C->Index);
>>>> + }
>>>> + });
>>>> +
>>>> + if (SC->LowLink != SC->Index)
>>>> + return;
>>>> + auto *SCC = new Component(
>>>> + std::vector<SectionChunk *>(V.begin() + Curr, V.end()));
>>>> + for (size_t I = Curr, E = V.size(); I != E; ++I) {
>>>> + V[I]->OnStack = false;
>>>> + V[I]->SCC = SCC;
>>>> + }
>>>> + Ret.push_back(SCC);
>>>> + V.erase(V.begin() + Curr, V.end());
>>>> + };
>>>> +
>>>> + for (SectionChunk *SC : Chunks) {
>>>> + if (SC->Index == 0) {
>>>> + Idx = 1;
>>>> + StrongConnect(SC);
>>>> + }
>>>> + }
>>>> +
>>>> + for (Component *SCC : Ret) {
>>>> + for (SectionChunk *SC : SCC->Members) {
>>>> + forEach(SC, [&](SectionChunk *C) {
>>>> + if (SCC == C->SCC)
>>>> + return;
>>>> + ++SCC->Outdegree;
>>>> + C->SCC->Predecessors.push_back(SCC);
>>>> + });
>>>> + }
>>>> + }
>>>> + return Ret;
>>>> +}
>>>> +
>>>> uint64_t SectionChunk::getHash() const {
>>>> - return hash_combine(getPermissions(),
>>>> - hash_value(SectionName),
>>>> - NumRelocs,
>>>> - uint32_t(Header->SizeOfRawData),
>>>> - std::distance(Relocs.end(), Relocs.begin()),
>>>> - Checksum);
>>>> + if (Hash == 0) {
>>>> + Hash = hash_combine(getPermissions(),
>>>> + hash_value(SectionName),
>>>> + NumRelocs,
>>>> + uint32_t(Header->SizeOfRawData),
>>>> + std::distance(Relocs.end(), Relocs.begin()),
>>>> + Checksum);
>>>> + }
>>>> + return Hash;
>>>> }
>>>>
>>>> +
>>>> // Returns true if this and a given chunk are identical COMDAT sections.
>>>> bool SectionChunk::equals(const SectionChunk *X) const {
>>>> // Compare headers
>>>> @@ -90,28 +259,6 @@ bool SectionChunk::equals(const SectionC
>>>> return std::equal(Relocs.begin(), Relocs.end(), X->Relocs.begin(), Eq);
>>>> }
>>>>
>>>> -static void link(SectionChunk *From, SectionChunk *To) {
>>>> - ++From->Outdegree;
>>>> - To->Ins.push_back(From);
>>>> -}
>>>> -
>>>> -typedef std::vector<SectionChunk *>::iterator ChunkIterator;
>>>> -
>>>> -static void uniquefy(ChunkIterator Begin, ChunkIterator End) {
>>>> - std::unordered_set<SectionChunk *, Hasher, Equals> Set;
>>>> - for (auto It = Begin; It != End; ++It) {
>>>> - SectionChunk *SC = *It;
>>>> - auto P = Set.insert(SC);
>>>> - bool Inserted = P.second;
>>>> - if (Inserted)
>>>> - continue;
>>>> - SectionChunk *Existing = *P.first;
>>>> - SC->replaceWith(Existing);
>>>> - for (SectionChunk *In : SC->Ins)
>>>> - --In->Outdegree;
>>>> - }
>>>> -}
>>>> -
>>>> // Merge identical COMDAT sections.
>>>> // Two sections are considered the same if their section headers,
>>>> // contents and relocations are all the same.
>>>> @@ -122,26 +269,19 @@ void doICF(const std::vector<Chunk *> &C
>>>> if (SC->isCOMDAT() && SC->isLive())
>>>> SChunks.push_back(SC);
>>>>
>>>> - // Initialize SectionChunks' outdegrees and in-chunk lists.
>>>> - for (SectionChunk *SC : SChunks) {
>>>> - for (SectionChunk *C : SC->children())
>>>> - link(SC, C);
>>>> - for (SymbolBody *B : SC->symbols())
>>>> - if (auto *D = dyn_cast<DefinedRegular>(B))
>>>> - link(SC, D->getChunk());
>>>> - }
>>>> -
>>>> - // By merging two sections, more sections can become mergeable
>>>> - // because two originally different relocations can now point to
>>>> - // the same section. We process sections whose outdegree is zero
>>>> - // first to deal with that.
>>>> - auto Pred = [](SectionChunk *SC) { return SC->Outdegree > 0; };
>>>> - for (;;) {
>>>> - auto Bound = std::partition(SChunks.begin(), SChunks.end(), Pred);
>>>> - if (Bound == SChunks.end())
>>>> - return;
>>>> - uniquefy(Bound, SChunks.end());
>>>> - SChunks.erase(Bound, SChunks.end());
>>>> + std::vector<Component *> Components = getSCC(SChunks);
>>>> +
>>>> + while (Components.size() > 0) {
>>>> + auto Bound = std::partition(Components.begin(), Components.end(),
>>>> + [](Component *SCC) { return SCC->Outdegree > 0; });
>>>> + uniquefy(Bound, Components.end());
>>>> +
>>>> + for (auto It = Bound, E = Components.end(); It != E; ++It) {
>>>> + Component *SCC = *It;
>>>> + for (Component *Pred : SCC->Predecessors)
>>>> + --Pred->Outdegree;
>>>> + }
>>>> + Components.erase(Bound, Components.end());
>>>> }
>>>> }
>>>>
>>>>
>>>> Modified: lld/trunk/test/COFF/icf-circular.test
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/icf-circular.test?rev=247387&r1=247386&r2=247387&view=diff
>>>>
>>>> ==============================================================================
>>>> --- lld/trunk/test/COFF/icf-circular.test (original)
>>>> +++ lld/trunk/test/COFF/icf-circular.test Thu Sep 10 23:29:03 2015
>>>> @@ -3,7 +3,7 @@
>>>> # RUN: /opt:lldicf /verbose %t.obj > %t.log 2>&1
>>>> # RUN: FileCheck %s < %t.log
>>>>
>>>> -# CHECK-NOT: Replaced bar
>>>> +# CHECK: Replaced bar
>>>>
>>>> ---
>>>> header:
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>
>>>
>>
>
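
For readers skimming the quoted patch, here is a minimal standalone sketch of
the scheme the commit log describes: compute SCCs with Tarjan's algorithm, then
repeatedly peel off the SCCs whose outdegree is zero. It is not the code from
ICF.cpp. Graph, computeSCCs and peelRounds are hypothetical names, vertices are
plain integers rather than SectionChunks, and the actual comparison and merging
of components is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the section graph: vertex V has edges to Adj[V].
using Graph = std::vector<std::vector<int>>;

// Tarjan's algorithm. Returns the SCC id of each vertex and stores the
// number of components in NumSCCs.
std::vector<int> computeSCCs(const Graph &Adj, int &NumSCCs) {
  int N = static_cast<int>(Adj.size());
  std::vector<int> Index(N, 0), LowLink(N, 0), SCCOf(N, -1), Stack;
  std::vector<bool> OnStack(N, false);
  int NextIndex = 1; // 0 means "not visited yet", as in the patch
  NumSCCs = 0;

  std::function<void(int)> StrongConnect = [&](int V) {
    Index[V] = LowLink[V] = NextIndex++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Adj[V]) {
      if (Index[W] == 0) {
        StrongConnect(W);
        LowLink[V] = std::min(LowLink[V], LowLink[W]);
      } else if (OnStack[W]) {
        LowLink[V] = std::min(LowLink[V], Index[W]);
      }
    }
    if (LowLink[V] != Index[V])
      return;
    // V is the root of an SCC: pop its members off the stack.
    int W;
    do {
      W = Stack.back();
      Stack.pop_back();
      OnStack[W] = false;
      SCCOf[W] = NumSCCs;
    } while (W != V);
    ++NumSCCs;
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] == 0)
      StrongConnect(V);
  return SCCOf;
}

// Peels SCCs whose outdegree is zero, round by round. Each round lists the
// SCC ids that could be compared with each other at that step.
std::vector<std::vector<int>> peelRounds(const Graph &Adj) {
  int NumSCCs = 0;
  std::vector<int> SCCOf = computeSCCs(Adj, NumSCCs);
  std::vector<int> Outdegree(NumSCCs, 0);
  std::vector<std::vector<int>> Preds(NumSCCs);
  for (std::size_t V = 0; V < Adj.size(); ++V)
    for (int W : Adj[V])
      if (SCCOf[V] != SCCOf[W]) {
        ++Outdegree[SCCOf[V]];
        Preds[SCCOf[W]].push_back(SCCOf[V]);
      }

  std::vector<int> Ready;
  for (int C = 0; C < NumSCCs; ++C)
    if (Outdegree[C] == 0)
      Ready.push_back(C);

  std::vector<std::vector<int>> Rounds;
  while (!Ready.empty()) {
    std::vector<int> NextReady;
    for (int C : Ready)
      for (int P : Preds[C])
        if (--Outdegree[P] == 0)
          NextReady.push_back(P);
    Rounds.push_back(Ready);
    Ready = std::move(NextReady);
  }
  return Rounds;
}
```

With the graph {{1, 2}, {0}, {}, {4, 2}, {3}}, vertices 0 and 1 form one cycle,
3 and 4 form another, and both cycles point at vertex 2. The first round then
contains only the component of vertex 2, and the second round contains the two
cyclic components, which is the point at which they become candidates for
merging with each other.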