[PATCH] D103292: [lld-macho] Implement ICF
Jez Ng via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 10 14:57:06 PDT 2021
int3 added a comment.
Nice work :)
================
Comment at: lld/MachO/ConcatOutputSection.cpp:23
#include <algorithm>
+#include <atomic>
----------------
needs to be moved to ICF.cpp
================
Comment at: lld/MachO/Driver.cpp:693
+static ICFLevel getICFLevel(const ArgList &args) {
+ StringRef icfLevelStr = args.getLastArgValue(OPT_icf);
----------------
should we have `-no_deduplicate` as a synonym for `-icf=none`?
================
Comment at: lld/MachO/ICF.cpp:67
+ return std::equal(ia->relocs.begin(), ia->relocs.end(), ib->relocs.begin(),
+ [&](const Reloc &ra, const Reloc &rb) {
+ if (ra.type != rb.type)
----------------
nit: would it be possible to structure the code so that clang-format doesn't insert all this left-padding?
================
Comment at: lld/MachO/ICF.cpp:98-102
+ const auto *da = dyn_cast<Defined>(sa);
+ const auto *db = dyn_cast<Defined>(sb);
+ if (da->isec->icfEquivalenceClass[icfPass % 2] !=
+ db->isec->icfEquivalenceClass[icfPass % 2])
+ return false;
----------------
I think this doesn't work with N_ALT_ENTRY either; I think we'll need to compare `Defined::value` too?
================
Comment at: lld/MachO/ICF.cpp:107
+ const auto *db = dyn_cast<DylibSymbol>(sb);
+ if (da->stubsHelperIndex != db->stubsHelperIndex)
+ return false;
----------------
do we need to compare `gotIndex` too?
================
Comment at: lld/MachO/ICF.cpp:180
+ if (auto *defined = dyn_cast<Defined>(sym))
+ hash += defined->isec->icfEquivalenceClass[icfPass % 2];
+ else if (auto *dylibSym = dyn_cast<DylibSymbol>(sym))
----------------
ditto regarding N_ALT_ENTRY
================
Comment at: lld/MachO/ICF.cpp:230-231
+
+ // Split [Begin, End) into [Begin, Mid) and [Mid, End). We use Mid as an
+ // equivalence class ID because every group ends with a unique index.
+ for (size_t i = begin; i < mid; ++i)
----------------
can `Mid` ever clash with `icfInvalidId`?
================
Comment at: lld/MachO/InputFiles.cpp:622-628
+ if (isZeroFill(isec->flags)) {
+ nextIsec->data = {nullptr, isec->data.size() - symbolOffset};
+ isec->data = {nullptr, symbolOffset};
+ } else {
+ nextIsec->data = isec->data.slice(symbolOffset);
+ isec->data = isec->data.slice(0, symbolOffset);
+ }
----------------
gkm wrote:
> int3 wrote:
> > why is this change necessary?
> Not strictly necessary. I saw SEGVs when checking `isec->data.data() == nullptr` because `parseSymbols` would advance the initial `nullptr` for a zero-fill section at every split. The alternative is to designate `data.data()` as undefined for zero-fill sections.
oh that makes sense. Could you leave a comment to that effect?
================
Comment at: lld/MachO/InputSection.cpp:60-61
+ case S_REGULAR:
+ if (parent == textOutputSection)
+ return !in.unwindInfo->funcHasPersonality(this);
+ // One might hope that we could hash __TEXT,__const subsections to fold
----------------
Passing `textOutputSection` here just for this check seems pretty awkward to me... how about adding a `hasPersonality` bit on `InputSection` that gets set directly by UnwindInfoSection? we could make `InputSection::live` a 1-bit value avoid taking up extra space
================
Comment at: lld/MachO/InputSection.cpp:100
+ // Turn-on the top bit to guarantee that valid hashes have no
+ // collisions with the small-integer unique IDs for ICF-inelgible sections
+ icfEquivalenceClass[0] = xxHash64(data) | (1ull << 63);
----------------
================
Comment at: lld/MachO/InputSection.cpp:108-109
+ copy->wasCoalesced = true;
+ copy->numRefs--;
+ numRefs++;
+ copy->replacement = this;
----------------
I think this should be right since all the symbol references will now be transferred away from the copy
would be good to include a test with N_ALT_ENTRY to check we get this right
================
Comment at: lld/MachO/InputSection.h:68
+ // Equivalence-class ID for ICF
+ uint64_t icfEquivalenceClass[2] = {0, 0};
+
----------------
looks like LLD-ELF uses 32 bits for these hashes, we might want to consider doing the same if there's a perf win (no need to sort it out in this diff though)
================
Comment at: lld/MachO/Symbols.cpp:45
+ return used;
+ InputSection *isec = d->isec->replacement ? d->isec->replacement : d->isec;
+ return isec->live;
----------------
how about adding a method like `InputSection::canonical()` that returns the replacement as necessary?
================
Comment at: lld/MachO/UnwindInfoSection.cpp:160
+ rFunc.offset + offsetof(CompactUnwindEntry<Ptr>, personality));
+ if (const auto *fIsec = rFunc.referent.get<InputSection *>())
+ funcsWithPersonality.insert(fIsec);
----------------
should `get<InputSection *>()` ever return null? If `rFunc` was a symbol relocation, we would have to track the corresponding InputSection it belonged to, right?
================
Comment at: lld/MachO/UnwindInfoSection.h:36
virtual void prepareRelocations(InputSection *) = 0;
+ virtual bool funcHasPersonality(const InputSection *) = 0;
----------------
nit: this doesn't need to be virtual, only the templated methods need to be
================
Comment at: lld/MachO/Writer.cpp:945-946
+ size_t newSize = textOutputSection->inputs.size();
+ log("ICF kept " + std::to_string(newSize) + " removed " +
+ std::to_string(oldSize - newSize) + " of " + std::to_string(oldSize));
+}
----------------
nit: s/std::to_string/Twine/
================
Comment at: lld/test/MachO/icf-scale.s:10
+## it uses a sequential algorithm to avoid the overhead of threading.
+## At 1 Ki functions or more, when threading begins to pay-off ICF employs its
+## parallel segregation algorithm. Here we generate 4 Ki functions to exercise
----------------
================
Comment at: lld/test/MachO/icf-scale.s:12
+## parallel segregation algorithm. Here we generate 4 Ki functions to exercise
+## the parallel algorithm. There are 4 uniqe function bodies, each replicated
+## 1 Ki times. The resulting folded program should one instance for each of the
----------------
================
Comment at: lld/test/MachO/icf-scale.s:13
+## the parallel algorithm. There are 4 uniqe function bodies, each replicated
+## 1 Ki times. The resulting folded program should one instance for each of the
+## four unique functions.
----------------
================
Comment at: lld/test/MachO/icf-scale.s:21
+# CHECK: [[#%x,G3:]] g F __TEXT,__text _g300000
+## . . .
+# CHECK: [[#%x,G0]] g F __TEXT,__text _g033333
----------------
not the most descriptive comment :p
================
Comment at: lld/test/MachO/icf.s:19
+# CHECK: [[#%x,MR]] g F __TEXT,__text _mr2
+### Note that "BROKE" is not a FileCheck keyword, just something it ignores
+# BROKE: [[#%x,XR:]] g F __TEXT,__text _xr1
----------------
how about using `COM:`? https://llvm.org/docs/CommandGuide/FileCheck.html#the-com-directive
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103292/new/
https://reviews.llvm.org/D103292
More information about the llvm-commits
mailing list