[lld] r246878 - COFF: Implement a better algorithm for ICF.
Rui Ueyama via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 4 23:39:13 PDT 2015
Yup, that's an idea I want to try to implement.
On Fri, Sep 4, 2015 at 5:33 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Fri, Sep 4, 2015 at 2:35 PM, Rui Ueyama via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Author: ruiu
>> Date: Fri Sep 4 16:35:54 2015
>> New Revision: 246878
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=246878&view=rev
>> Log:
>> COFF: Implement a better algorithm for ICF.
>>
>> Identical COMDAT Folding (ICF) is a feature that merges COMDAT sections
>> by contents. Two sections are considered the same if their contents,
>> relocations, attributes, etc., are all the same.
>>
>> Interestingly, the MSVC linker takes an "iterations" parameter
>> for ICF because the algorithm it uses is iterative. Merging
>> two sections can make more sections mergeable because
>> relocations that used to differ can now point to the same section.
>> ICF is repeated until it converges (until no more sections can be
>> merged). This algorithm is not fast. It usually needs three
>> iterations to converge.
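
For readers following along, here is a minimal sketch of the iterative scheme described in the log above. It is not the code from ICF.cpp: Sec, foldUntilConvergence and the integer-index representation are hypothetical names chosen for illustration, and a section is modeled as nothing more than its contents plus a list of relocation targets.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COMDAT section (not lld's SectionChunk).
struct Sec {
  std::string Contents;
  std::vector<int> Targets; // indices of the sections this one refers to
};

// Repeats a full pass over all live sections until a pass merges nothing.
// Rep[i] ends up as the section that i was folded into (Rep[i] == i if kept).
// Returns the number of passes that were needed.
int foldUntilConvergence(const std::vector<Sec> &S, std::vector<int> &Rep) {
  int N = static_cast<int>(S.size());
  Rep.resize(N);
  for (int I = 0; I < N; ++I)
    Rep[I] = I;
  // Follow the chain of merges to the surviving section.
  auto Find = [&](int X) {
    while (Rep[X] != X)
      X = Rep[X];
    return X;
  };
  using Key = std::pair<std::string, std::vector<int>>;
  int Passes = 0;
  bool Redo;
  do {
    Redo = false;
    ++Passes;
    std::map<Key, int> Seen;
    for (int I = 0; I < N; ++I) {
      if (Rep[I] != I)
        continue; // already folded into another section
      Key K{S[I].Contents, {}};
      for (int T : S[I].Targets)
        K.second.push_back(Find(T));
      auto P = Seen.insert({K, I});
      if (!P.second) {
        Rep[I] = P.first->second;
        Redo = true; // a merge may have made other sections equal
      }
    }
  } while (Redo);
  for (int I = 0; I < N; ++I)
    Rep[I] = Find(I);
  return Passes;
}
```

When callers are visited before their callees, each level of the graph costs an extra full pass, plus one final pass that only confirms convergence.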
>>
>> In the new algorithm implemented in this patch, we treat sections
>> and relocations as a directed acyclic graph, and we try to merge
>> sections whose outdegree is zero. Sections with outdegree zero are then
>> removed from the graph, which leaves other sections with outdegree
>> zero. We repeat that until all sections are processed. With this
>> algorithm, we don't iterate over the same sections many times.
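
A minimal sketch of the outdegree-zero peeling as the log describes it, for illustration only. It follows the description rather than the patch line by line: Section and foldSections are hypothetical names, sections are plain indices, and one map is kept across rounds where the patch uses a fresh set per round.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COMDAT section (not lld's SectionChunk).
struct Section {
  std::string Contents;
  std::vector<int> Targets; // indices of the sections this one refers to
};

// Processes sections in rounds, starting with those whose outdegree is zero.
// Rep[i] is the section that i was folded into (Rep[i] == i if kept).
// Sections on a cycle, or reaching one, never become ready and stay unmerged.
std::vector<int> foldSections(const std::vector<Section> &S) {
  int N = static_cast<int>(S.size());
  std::vector<int> Rep(N), Outdegree(N, 0);
  std::vector<std::vector<int>> Ins(N);
  for (int I = 0; I < N; ++I) {
    Rep[I] = I;
    for (int T : S[I].Targets) {
      ++Outdegree[I];
      Ins[T].push_back(I);
    }
  }
  std::vector<int> Ready;
  for (int I = 0; I < N; ++I)
    if (Outdegree[I] == 0)
      Ready.push_back(I);

  using Key = std::pair<std::string, std::vector<int>>;
  std::map<Key, int> Seen;
  while (!Ready.empty()) {
    std::vector<int> Next;
    for (int I : Ready) {
      // All targets were processed in earlier rounds, so Rep[T] is final.
      Key K{S[I].Contents, {}};
      for (int T : S[I].Targets)
        K.second.push_back(Rep[T]);
      auto P = Seen.insert({K, I});
      if (!P.second)
        Rep[I] = P.first->second;
      // Remove I from the graph: its referrers lose one outgoing edge.
      for (int In : Ins[I])
        if (--Outdegree[In] == 0)
          Next.push_back(In);
    }
    Ready = std::move(Next);
  }
  return Rep;
}
```

Each section is visited once, and its key is built only after every section it refers to has been settled, which is what removes the need for repeated passes.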
>>
>> There's an apparent issue with the algorithm -- the section graph is
>> not guaranteed to be acyclic. In fact, it is quite often cyclic,
>> so this algorithm cannot eliminate all possible duplicates.
>> That's OK for now because the previous algorithm was not able to
>> handle cycles either. I'll address the issue in a follow-up patch.
>>
>
> If you collapse SCCs (strongly connected components), then the graph
> is guaranteed to be acyclic. Then you only need to solve how to
> compare SCCs.
>
> -- Sean Silva
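
To make the SCC suggestion concrete, here is a sketch of finding strongly connected components with Tarjan's algorithm. Nothing like this is in the patch; findSCCs and the adjacency-list representation are assumptions made for illustration, and the harder part (how to compare two SCCs for equality) is left open, as it is in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Tarjan's algorithm. Adj[v] lists the nodes that v refers to.
// Returns Comp, where Comp[v] is the id of the SCC that contains v.
// Ids are assigned sinks first: for every edge v -> w, Comp[w] <= Comp[v].
std::vector<int> findSCCs(const std::vector<std::vector<int>> &Adj) {
  int N = static_cast<int>(Adj.size());
  std::vector<int> Index(N, -1), Low(N, 0), Comp(N, -1), Stack;
  std::vector<bool> OnStack(N, false);
  int Counter = 0, NumComps = 0;

  // Recursive for brevity; a very deep graph would need an explicit stack.
  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = Counter++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Adj[V]) {
      if (Index[W] == -1) {
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) {
      // V is the root of an SCC: pop its members off the stack.
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        Comp[W] = NumComps;
      } while (W != V);
      ++NumComps;
    }
  };
  for (int V = 0; V < N; ++V)
    if (Index[V] == -1)
      Visit(V);
  return Comp;
}
```

Because the components come out sinks first, processing them in increasing id order visits every component after the components it refers to, which is the same order the outdegree-zero scheme relies on.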
>
>
>>
>> Modified:
>> lld/trunk/COFF/Chunks.h
>> lld/trunk/COFF/ICF.cpp
>>
>> Modified: lld/trunk/COFF/Chunks.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Chunks.h?rev=246878&r1=246877&r2=246878&view=diff
>>
>> ==============================================================================
>> --- lld/trunk/COFF/Chunks.h (original)
>> +++ lld/trunk/COFF/Chunks.h Fri Sep 4 16:35:54 2015
>> @@ -182,6 +182,8 @@ public:
>> // with other chunk by ICF, it points to another chunk,
>> // and this chunk is considrered as dead.
>> SectionChunk *Ptr;
>> + int Outdegree = 0;
>> + std::vector<SectionChunk *> Ins;
>>
>> // The CRC of the contents as described in the COFF spec 4.5.5.
>> // Auxiliary Format 5: Section Definitions. Used for ICF.
>>
>> Modified: lld/trunk/COFF/ICF.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=246878&r1=246877&r2=246878&view=diff
>>
>> ==============================================================================
>> --- lld/trunk/COFF/ICF.cpp (original)
>> +++ lld/trunk/COFF/ICF.cpp Fri Sep 4 16:35:54 2015
>> @@ -90,31 +90,53 @@ bool SectionChunk::equals(const SectionC
>> return std::equal(Relocs.begin(), Relocs.end(), X->Relocs.begin(), Eq);
>> }
>>
>> +static void link(SectionChunk *From, SectionChunk *To) {
>> + ++From->Outdegree;
>> + To->Ins.push_back(From);
>> +}
>> +
>> // Merge identical COMDAT sections.
>> -// Two sections are considered as identical when their section headers,
>> +// Two sections are considered the same if their section headers,
>> // contents and relocations are all the same.
>> void doICF(const std::vector<Chunk *> &Chunks) {
>> - std::unordered_set<SectionChunk *, Hasher, Equals> Set;
>> - bool Redo;
>> - do {
>> - Set.clear();
>> - Redo = false;
>> - for (Chunk *C : Chunks) {
>> - auto *SC = dyn_cast<SectionChunk>(C);
>> - if (!SC || !SC->isCOMDAT() || !SC->isLive())
>> - continue;
>> + std::vector<SectionChunk *> SChunks;
>> + for (Chunk *C : Chunks)
>> + if (auto *SC = dyn_cast<SectionChunk>(C))
>> + if (SC->isCOMDAT() && SC->isLive())
>> + SChunks.push_back(SC);
>> +
>> + // Initialize SectionChunks' outdegrees and in-chunk lists.
>> + for (SectionChunk *SC : SChunks) {
>> + for (SectionChunk *C : SC->children())
>> + link(SC, C);
>> + for (SymbolBody *B : SC->symbols())
>> + if (auto *D = dyn_cast<DefinedRegular>(B))
>> + link(SC, D->getChunk());
>> + }
>> +
>> + // By merging two sections, more sections can become mergeable
>> + // because two originally different relocations can now point to
>> + // the same section. We process sections whose outdegree is zero
>> + // first to deal with that.
>> + for (;;) {
>> + std::unordered_set<SectionChunk *, Hasher, Equals> Set;
>> + auto Pred = [](SectionChunk *SC) { return SC->Outdegree > 0; };
>> + auto Bound = std::partition(SChunks.begin(), SChunks.end(), Pred);
>> + if (Bound == SChunks.end())
>> + return;
>> + for (auto It = Bound, E = SChunks.end(); It != E; ++It) {
>> + SectionChunk *SC = *It;
>> auto P = Set.insert(SC);
>> bool Inserted = P.second;
>> if (Inserted)
>> continue;
>> SectionChunk *Existing = *P.first;
>> SC->replaceWith(Existing);
>> - // By merging sections, two relocations that originally pointed to
>> - // different locations can now point to the same location.
>> - // So, repeat the process until a convegence is obtained.
>> - Redo = true;
>> + for (SectionChunk *In : SC->Ins)
>> + --In->Outdegree;
>> }
>> - } while (Redo);
>> + SChunks.erase(Bound, SChunks.end());
>> + }
>> }
>>
>> } // namespace coff
>>