[PATCH] D103292: [lld-macho] Implement ICF

Greg McGary via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 11 13:33:46 PDT 2021


gkm added inline comments.


================
Comment at: lld/MachO/Driver.cpp:693
 
+static ICFLevel getICFLevel(const ArgList &args) {
+  StringRef icfLevelStr = args.getLastArgValue(OPT_icf);
----------------
int3 wrote:
> should we have `-no_deduplicate` as a synonym for `-icf=none`?
Are you proposing that `-no_deduplicate` be a strict synonym for `-icf=none`, or that it disable all forms of deduplication, including `--deduplicate_literals` ?


================
Comment at: lld/MachO/ICF.cpp:107
+                          const auto *db = dyn_cast<DylibSymbol>(sb);
+                          if (da->stubsHelperIndex != db->stubsHelperIndex)
+                            return false;
----------------
int3 wrote:
> int3 wrote:
> > do we need to compare `gotIndex` too?
> actually... since ICF doesn't fold DylibSymbols, can we not just check for pointer equality of the DylibSymbols here?
We already checked `(ra.referent == rb.referent)` above, so the `isa<DylibSymbol>()` block is dead code. Good catch!


================
Comment at: lld/MachO/ICF.cpp:230-231
+
+    // Split [Begin, End) into [Begin, Mid) and [Mid, End). We use Mid as an
+    // equivalence class ID because every group ends with a unique index.
+    for (size_t i = begin; i < mid; ++i)
----------------
int3 wrote:
> can `Mid` ever clash with `icfInvalidId`?
[ FYI, I also did this: s/`icfInvalidId`/`icfUniqueID`/ ]

The unique IDs and hashes are all replaced with equivalence-class IDs by the first run of `segregate`. Although the unique ID space does overlap the equivalence-class ID space, they never coexist. Even so, for sake of debugging years ago, it might have been nice to begin the `icfUniqueID` space at `inputSections.size()`. I'll do it anyway, even though it ought have no effect.


================
Comment at: lld/MachO/InputSection.cpp:60-61
+  case S_REGULAR:
+    if (parent == textOutputSection)
+      return !in.unwindInfo->funcHasPersonality(this);
+    // One might hope that we could hash __TEXT,__const subsections to fold
----------------
int3 wrote:
> Passing `textOutputSection` here just for this check seems pretty awkward to me... how about adding a `hasPersonality` bit on `InputSection` that gets set directly by UnwindInfoSection? we could make `InputSection::live` a 1-bit value avoid taking up extra space
There were already 3 `bool`s. I rearranged slightly so that now 3+1 `bool`s are adjacent.

I still need to reject `S_REGULAR` sections that are not in `__TEXT,__text`. I think I made it a bit less awkward by passing the `isText` bool rather than `textOutputSection`.


================
Comment at: lld/MachO/InputSection.h:68
+  // Equivalence-class ID for ICF
+  uint64_t icfEquivalenceClass[2] = {0, 0};
+
----------------
int3 wrote:
> looks like LLD-ELF uses 32 bits for these hashes, we might want to consider doing the same if there's a perf win (no need to sort it out in this diff though)
I wonder if truncating hashes to 32 bits was a legacy choice from the time when people still ran LLVM on 32-bit hosts. There's also space-savings of 8 bytes per `InputSection` to consider. I will benchmark to see if there is any benefit for modern 64-bit hosts.


================
Comment at: lld/MachO/UnwindInfoSection.cpp:160
+           rFunc.offset + offsetof(CompactUnwindEntry<Ptr>, personality));
+    if (const auto *fIsec = rFunc.referent.get<InputSection *>())
+      funcsWithPersonality.insert(fIsec);
----------------
int3 wrote:
> should `get<InputSection *>()` ever return null? If `rFunc` was a symbol relocation, we would have to track the corresponding InputSection it belonged to, right?
These are always section relocs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103292/new/

https://reviews.llvm.org/D103292



More information about the llvm-commits mailing list