[PATCH] D103292: [lld-macho] Implement ICF
Greg McGary via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 8 09:48:56 PDT 2021
gkm marked 2 inline comments as done.
gkm added a comment.
In D103292#2804387 <https://reviews.llvm.org/D103292#2804387>, @int3 wrote:
> - The logging code is fairly extensive. Is it really necessary to commit it? If we are leaving it in, could we perhaps cut out InputSection::defined and symbolicate addresses by doing a linear/binary search for the matching symbol?
I can remove the logging and retain it as a patch if/when needed.
> - Relatedly: can we have a perf benchmark between LLD-before-this-diff and LLD-with-this-diff-and-ICF-none? I'm wondering if the increased size of the InputSection will cost us.
Yes.
> - "Folding cannot occur across output-section boundaries, therefore ConcatOutputSection is the natural place to run ICF." -- is this not the case for ELF and COFF? Personally I would prefer keeping "core data types" like ConcatOutputSection as simple as possible, as well as to mirror the ELF/COFF ports more closely.
ELF & COFF also fold only within output sections, but the segregation algorithm runs on all input sections in one batch, so output-section boundaries are enforced explicitly during segregation. Output-section boundaries need not be checked for MachO, because we have already sorted all input sections into `ConcatOutputSection`s. Perhaps I should just remove the comment because it is a minor point and might serve only to mislead.
================
Comment at: lld/MachO/ConcatOutputSection.cpp:386
+// the application after dyld applies fixups to pointer data. Some sections are
+// deduplicated elsewhere (__DATA_CONST,__cfstring), and some are synthetic
+// (__DATA_CONST,__got). There are no ICF opportunities here.
----------------
int3 wrote:
> I was hoping that `__cfstring` would be handled seamlessly by ICF once it's integrated with literal merging. (And I think the current implementation should be able to handle it.)
We can experiment with ICF on only `__TEXT,__text` and `__DATA_CONST,__cfstring`.
================
Comment at: lld/MachO/InputFiles.cpp:622-628
+ if (isZeroFill(isec->flags)) {
+ nextIsec->data = {nullptr, isec->data.size() - symbolOffset};
+ isec->data = {nullptr, symbolOffset};
+ } else {
+ nextIsec->data = isec->data.slice(symbolOffset);
+ isec->data = isec->data.slice(0, symbolOffset);
+ }
----------------
int3 wrote:
> why is this change necessary?
Not strictly necessary. I saw SEGVs when checking `isec->data.data() == nullptr` because `parseSymbols` would advance the initial `nullptr` for a zero-fill section at every split. The alternative is to designate `data.data()` as undefined for zero-fill sections.
================
Comment at: lld/MachO/InputSection.h:57
+ // Equivalence-class ID for ICF
+ uint64_t icfEquivalenceClass[2] = {0, 0};
+ // Track the defined symbol that references this - for ICF logs
----------------
int3 wrote:
> I roughly get why we have two entries here, but a comment explaining it would be nice
It is likely best to refer the reader to the block comment in `lld/ELF/ICF.cpp` where the segregation algorithm is documented.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103292/new/
https://reviews.llvm.org/D103292
More information about the llvm-commits
mailing list