[PATCH] D103292: [lld-macho] Implement ICF

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 7 19:16:03 PDT 2021


int3 added a comment.

Haven't dug into the details yet, but a few high-level thoughts:

- The logging code is fairly extensive. Is it really necessary to commit it? If we are leaving it in, could we perhaps cut out InputSection::defined and symbolicate addresses by doing a linear/binary search for the matching symbol?
- Relatedly: can we have a perf benchmark between LLD-before-this-diff and LLD-with-this-diff-and-ICF-none? I'm wondering if the increased size of the InputSection will cost us.
- "Folding cannot occur across output-section boundaries, therefore ConcatOutputSection is the natural place to run ICF." -- is this not the case for ELF and COFF? Personally I would prefer keeping "core data types" like ConcatOutputSection as simple as possible, as well as to mirror the ELF/COFF ports more closely.



================
Comment at: lld/MachO/ConcatOutputSection.cpp:386
+// the application after dyld applies fixups to pointer data. Some sections are
+// deduplicated elsewhere (__DATA_CONST,__cfstring), and some are synthetic
+// (__DATA_CONST,__got). There are no ICF opportunities here.
----------------
I was hoping that `__cfstring` would be handled seamlessly by ICF once it's integrated with literal merging. (And I think the current implementation should be able to handle it.)


================
Comment at: lld/MachO/InputFiles.cpp:622-628
+      if (isZeroFill(isec->flags)) {
+        nextIsec->data = {nullptr, isec->data.size() - symbolOffset};
+        isec->data = {nullptr, symbolOffset};
+      } else {
+        nextIsec->data = isec->data.slice(symbolOffset);
+        isec->data = isec->data.slice(0, symbolOffset);
+      }
----------------
why is this change necessary?


================
Comment at: lld/MachO/InputSection.h:57
+  // Equivalence-class ID for ICF
+  uint64_t icfEquivalenceClass[2] = {0, 0};
+  // Track the defined symbol that references this - for ICF logs
----------------
I roughly get why we have two entries here, but a comment explaining it would be nice


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103292/new/

https://reviews.llvm.org/D103292



More information about the llvm-commits mailing list