[lld] d5a70db - [lld/mac] Write every weak symbol only once in the output

Nico Weber via llvm-commits llvm-commits at lists.llvm.org
Fri May 7 14:11:56 PDT 2021


Author: Nico Weber
Date: 2021-05-07T17:11:40-04:00
New Revision: d5a70db1938c06380bdab033b7d47a7437914f4c

URL: https://github.com/llvm/llvm-project/commit/d5a70db1938c06380bdab033b7d47a7437914f4c
DIFF: https://github.com/llvm/llvm-project/commit/d5a70db1938c06380bdab033b7d47a7437914f4c.diff

LOG: [lld/mac] Write every weak symbol only once in the output

Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function the output. With this
patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)

Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
  Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
  (that is, catch block unwind information) and Personality Routines
  associated with weak functions still not stripped. This is wasteful,
  but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
  correctness and not just for size)
- This nopes out on InputSections that are referenced form more than
  one symbol (eg from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
  found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's also likely also a useful foundation
for dead_strip.

I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076

Added: 
    lld/test/MachO/weak-definition-gc.s

Modified: 
    lld/MachO/InputFiles.cpp
    lld/MachO/InputSection.cpp
    lld/MachO/InputSection.h
    lld/MachO/MapFile.cpp
    lld/MachO/SymbolTable.cpp
    lld/MachO/Symbols.h
    lld/MachO/UnwindInfoSection.cpp
    lld/MachO/Writer.cpp

Removed: 
    


################################################################################
diff  --git a/lld/MachO/InputFiles.cpp b/lld/MachO/InputFiles.cpp
index 7df21616aa3f8..d33c3d50fbbc4 100644
--- a/lld/MachO/InputFiles.cpp
+++ b/lld/MachO/InputFiles.cpp
@@ -599,6 +599,8 @@ void ObjFile::parseSymbols(ArrayRef<typename LP::section> sectionHeaders,
       // There are 3 cases where we do not need to create a new subsection:
       //   1. If the input file does not use subsections-via-symbols.
       //   2. Multiple symbols at the same address only induce one subsection.
+      //      (The symbolOffset == 0 check covers both this case as well as
+      //      the first loop iteration.)
       //   3. Alternative entry points do not induce new subsections.
       if (!subsectionsViaSymbols || symbolOffset == 0 ||
           sym.n_desc & N_ALT_ENTRY) {
@@ -609,6 +611,8 @@ void ObjFile::parseSymbols(ArrayRef<typename LP::section> sectionHeaders,
 
       auto *nextIsec = make<InputSection>(*isec);
       nextIsec->data = isec->data.slice(symbolOffset);
+      nextIsec->numRefs = 0;
+      nextIsec->canOmitFromOutput = false;
       isec->data = isec->data.slice(0, symbolOffset);
 
       // By construction, the symbol will be at offset zero in the new

diff  --git a/lld/MachO/InputSection.cpp b/lld/MachO/InputSection.cpp
index 50f2732c4c9b5..230bef27cb943 100644
--- a/lld/MachO/InputSection.cpp
+++ b/lld/MachO/InputSection.cpp
@@ -51,6 +51,8 @@ static uint64_t resolveSymbolVA(uint8_t *loc, const Symbol &sym, uint8_t type) {
 }
 
 void InputSection::writeTo(uint8_t *buf) {
+  assert(!shouldOmitFromOutput());
+
   if (getFileSize() == 0)
     return;
 
@@ -66,8 +68,11 @@ void InputSection::writeTo(uint8_t *buf) {
       uint64_t minuendVA;
       if (const Symbol *toSym = minuend.referent.dyn_cast<Symbol *>())
         minuendVA = toSym->getVA();
-      else
-        minuendVA = minuend.referent.get<InputSection *>()->getVA();
+      else {
+        auto *referentIsec = minuend.referent.get<InputSection *>();
+        assert(!referentIsec->shouldOmitFromOutput());
+        minuendVA = referentIsec->getVA();
+      }
       referentVA = minuendVA - fromSym->getVA() + minuend.addend;
     } else if (auto *referentSym = r.referent.dyn_cast<Symbol *>()) {
       if (target->hasAttr(r.type, RelocAttrBits::LOAD) &&
@@ -84,6 +89,7 @@ void InputSection::writeTo(uint8_t *buf) {
           referentVA -= firstTLVDataSection->addr;
       }
     } else if (auto *referentIsec = r.referent.dyn_cast<InputSection *>()) {
+      assert(!referentIsec->shouldOmitFromOutput());
       referentVA = referentIsec->getVA();
     }
     target->relocateOne(loc, r, referentVA + r.addend, getVA() + r.offset);

diff  --git a/lld/MachO/InputSection.h b/lld/MachO/InputSection.h
index 04dee3bf17afe..0b1a4e1c1e870 100644
--- a/lld/MachO/InputSection.h
+++ b/lld/MachO/InputSection.h
@@ -21,8 +21,6 @@ namespace macho {
 class InputFile;
 class InputSection;
 class OutputSection;
-class Symbol;
-class Defined;
 
 class InputSection {
 public:
@@ -45,6 +43,21 @@ class InputSection {
   uint32_t align = 1;
   uint32_t flags = 0;
 
+  // How many symbols refer to this InputSection.
+  uint32_t numRefs = 0;
+
+  // True if this InputSection could not be written to the output file.
+  // With subsections_via_symbols, most symbol have its own InputSection,
+  // and for weak symbols (e.g. from inline functions), only the
+  // InputSection from one translation unit will make it to the output,
+  // while all copies in other translation units are coalesced into the
+  // first and not copied to the output.
+  bool canOmitFromOutput = false;
+
+  bool shouldOmitFromOutput() const {
+    return canOmitFromOutput && numRefs == 0;
+  }
+
   ArrayRef<uint8_t> data;
   std::vector<Reloc> relocs;
 };

diff  --git a/lld/MachO/MapFile.cpp b/lld/MachO/MapFile.cpp
index 2262a8cb890d2..b52e509285913 100644
--- a/lld/MachO/MapFile.cpp
+++ b/lld/MachO/MapFile.cpp
@@ -67,8 +67,11 @@ static std::vector<Defined *> getSymbols() {
         if (sym == nullptr)
           continue;
         if (auto *d = dyn_cast<Defined>(sym))
-          if (d->isec && d->getFile() == file)
+          if (d->isec && d->getFile() == file) {
+            assert(!d->isec->shouldOmitFromOutput() &&
+                   "file->symbols should store resolved symbols");
             v.push_back(d);
+          }
       }
   return v;
 }
@@ -144,7 +147,11 @@ void macho::writeMapFile() {
   os << "# Symbols:\n";
   os << "# Address\t    File  Name\n";
   for (InputSection *isec : inputSections) {
-    for (Symbol *sym : sectionSyms[isec]) {
+    auto symsIt = sectionSyms.find(isec);
+    assert(!isec->shouldOmitFromOutput() || (symsIt == sectionSyms.end()));
+    if (symsIt == sectionSyms.end())
+      continue;
+    for (Symbol *sym : symsIt->second) {
       os << format("0x%08llX\t[%3u] %s\n", sym->getVA(),
                    readerToFileOrdinal[sym->getFile()], symStr[sym].c_str());
     }

diff  --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index 51b11a0732542..ff55119e167f1 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -51,13 +51,26 @@ Defined *SymbolTable::addDefined(StringRef name, InputFile *file,
   bool overridesWeakDef = false;
   std::tie(s, wasInserted) = insert(name, file);
 
+  assert(!isWeakDef || (isa<BitcodeFile>(file) && !isec) ||
+         (isa<ObjFile>(file) && file == isec->file));
+
   if (!wasInserted) {
     if (auto *defined = dyn_cast<Defined>(s)) {
       if (isWeakDef) {
         // Both old and new symbol weak (e.g. inline function in two TUs):
         // If one of them isn't private extern, the merged symbol isn't.
-        if (defined->isWeakDef())
+        if (defined->isWeakDef()) {
           defined->privateExtern &= isPrivateExtern;
+
+          // FIXME: Handle this for bitcode files.
+          // FIXME: We currently only do this if both symbols are weak.
+          //        We could do this if either is weak (but getting the
+          //        case where !isWeakDef && defined->isWeakDef() right
+          //        requires some care and testing).
+          if (isec)
+            isec->canOmitFromOutput = true;
+        }
+
         return defined;
       }
       if (!defined->isWeakDef())

diff  --git a/lld/MachO/Symbols.h b/lld/MachO/Symbols.h
index c6375e1c73420..7704f7e92c53c 100644
--- a/lld/MachO/Symbols.h
+++ b/lld/MachO/Symbols.h
@@ -106,7 +106,10 @@ class Defined : public Symbol {
       : Symbol(DefinedKind, name, file), isec(isec), value(value), size(size),
         overridesWeakDef(false), privateExtern(isPrivateExtern),
         includeInSymtab(true), thumb(isThumb), weakDef(isWeakDef),
-        external(isExternal) {}
+        external(isExternal) {
+    if (isec)
+      isec->numRefs++;
+  }
 
   bool isWeakDef() const override { return weakDef; }
   bool isExternalWeakDef() const {

diff  --git a/lld/MachO/UnwindInfoSection.cpp b/lld/MachO/UnwindInfoSection.cpp
index 370e1408bd3a3..76502bfb82bca 100644
--- a/lld/MachO/UnwindInfoSection.cpp
+++ b/lld/MachO/UnwindInfoSection.cpp
@@ -143,6 +143,8 @@ template <class Ptr>
 void UnwindInfoSectionImpl<Ptr>::prepareRelocations(InputSection *isec) {
   assert(isec->segname == segment_names::ld &&
          isec->name == section_names::compactUnwind);
+  assert(!isec->shouldOmitFromOutput() &&
+         "__compact_unwind section should not be omitted");
 
   for (Reloc &r : isec->relocs) {
     assert(target->hasAttr(r.type, RelocAttrBits::UNSIGNED));
@@ -175,6 +177,8 @@ void UnwindInfoSectionImpl<Ptr>::prepareRelocations(InputSection *isec) {
     }
 
     if (auto *referentIsec = r.referent.dyn_cast<InputSection *>()) {
+      assert(!referentIsec->shouldOmitFromOutput());
+
       // Personality functions can be referenced via section relocations
       // if they live in the same object file. Create placeholder synthetic
       // symbols for them in the GOT.
@@ -211,6 +215,8 @@ static void
 relocateCompactUnwind(MergedOutputSection *compactUnwindSection,
                       std::vector<CompactUnwindEntry<Ptr>> &cuVector) {
   for (const InputSection *isec : compactUnwindSection->inputs) {
+    assert(isec->parent == compactUnwindSection);
+
     uint8_t *buf =
         reinterpret_cast<uint8_t *>(cuVector.data()) + isec->outSecFileOff;
     memcpy(buf, isec->data.data(), isec->data.size());
@@ -229,7 +235,10 @@ relocateCompactUnwind(MergedOutputSection *compactUnwindSection,
         }
       } else if (auto *referentIsec = r.referent.dyn_cast<InputSection *>()) {
         checkTextSegment(referentIsec);
-        referentVA = referentIsec->getVA() + r.addend;
+        if (referentIsec->shouldOmitFromOutput())
+          referentVA = UINT64_MAX; // Tombstone value
+        else
+          referentVA = referentIsec->getVA() + r.addend;
       }
 
       writeAddress(buf + r.offset, referentVA, r.length);
@@ -295,6 +304,18 @@ template <class Ptr> void UnwindInfoSectionImpl<Ptr>::finalize() {
     return a->functionAddress < b->functionAddress;
   });
 
+  // Dead-stripped functions get a functionAddress of UINT64_MAX in
+  // relocateCompactUnwind(). Filter them out here.
+  CompactUnwindEntry<Ptr> tombstone;
+  tombstone.functionAddress = static_cast<Ptr>(UINT64_MAX);
+  cuPtrVector.erase(
+      std::lower_bound(cuPtrVector.begin(), cuPtrVector.end(), &tombstone,
+                       [](const CompactUnwindEntry<Ptr> *a,
+                          const CompactUnwindEntry<Ptr> *b) {
+                         return a->functionAddress < b->functionAddress;
+                       }),
+      cuPtrVector.end());
+
   // Fold adjacent entries with matching encoding+personality+lsda
   // We use three iterators on the same cuPtrVector to fold in-situ:
   // (1) `foldBegin` is the first of a potential sequence of matching entries

diff  --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
index a4da569006cf2..257d3bde93081 100644
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -570,6 +570,9 @@ static void prepareSymbolRelocation(Symbol *sym, const InputSection *isec,
 void Writer::scanRelocations() {
   TimeTraceScope timeScope("Scan relocations");
   for (InputSection *isec : inputSections) {
+    if (isec->shouldOmitFromOutput())
+      continue;
+
     if (isec->segname == segment_names::ld) {
       in.unwindInfo->prepareRelocations(isec);
       continue;
@@ -581,7 +584,7 @@ void Writer::scanRelocations() {
         // Skip over the following UNSIGNED relocation -- it's just there as the
         // minuend, and doesn't have the usual UNSIGNED semantics. We don't want
         // to emit rebase opcodes for it.
-        it = std::next(it);
+        it++;
         continue;
       }
       if (auto *sym = r.referent.dyn_cast<Symbol *>()) {
@@ -592,6 +595,7 @@ void Writer::scanRelocations() {
           prepareSymbolRelocation(sym, isec, r);
       } else {
         assert(r.referent.is<InputSection *>());
+        assert(!r.referent.get<InputSection *>()->shouldOmitFromOutput());
         if (!r.pcrel)
           in.rebase->addEntry(isec, r.offset);
       }
@@ -894,6 +898,8 @@ template <class LP> void Writer::createOutputSections() {
   // Then merge input sections into output sections.
   MapVector<NamePair, MergedOutputSection *> mergedOutputSections;
   for (InputSection *isec : inputSections) {
+    if (isec->shouldOmitFromOutput())
+      continue;
     NamePair names = maybeRenameSection({isec->segname, isec->name});
     MergedOutputSection *&osec = mergedOutputSections[names];
     if (osec == nullptr)

diff  --git a/lld/test/MachO/weak-definition-gc.s b/lld/test/MachO/weak-definition-gc.s
new file mode 100644
index 0000000000000..aea5a4cf5bc00
--- /dev/null
+++ b/lld/test/MachO/weak-definition-gc.s
@@ -0,0 +1,131 @@
+# REQUIRES: x86
+# RUN: rm -rf %t; split-file %s %t
+
+## This tests that if two input files define the same weak symbol, we only
+## write it to the output once (...assuming both input files use
+## .subsections_via_symbols).
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/weak-sub.s -o %t/weak-sub.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/weak-nosub.s -o %t/weak-nosub.o
+
+## Test that weak symbols are emitted just once with .subsections_via_symbols
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-sub.o %t/weak-sub.o
+# RUN: llvm-otool -jtV %t/out.dylib | FileCheck --check-prefix=SUB %s
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-nosub.o %t/weak-sub.o
+# RUN: llvm-otool -jtV %t/out.dylib | FileCheck --check-prefix=SUB %s
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-sub.o %t/weak-nosub.o
+# RUN: llvm-otool -jtV %t/out.dylib | FileCheck --check-prefix=SUB %s
+# SUB:      _foo
+# SUB-NEXT: retq
+# SUB-NOT:  retq
+# SUB:      _bar
+# SUB-NEXT: retq
+# SUB-NOT:  retq
+
+## We can even strip weak symbols without subsections_via_symbols as long
+## as none of the weak symbols in a section are needed.
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-nosub.o %t/weak-nosub.o
+# RUN: llvm-otool -jtV %t/out.dylib | FileCheck --check-prefix=SUB %s
+
+## Test that omitted weak symbols don't add entries to the compact unwind table.
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin19.0.0 %t/weak-sub-lsda.s -o %t/weak-sub-lsda.o
+# RUN: %lld -dylib -lc++ -o %t/out.dylib %t/weak-sub-lsda.o %t/weak-sub-lsda.o
+# RUN: llvm-objdump --macho --unwind-info --syms %t/out.dylib | FileCheck %s --check-prefix=UNWIND -D#%x,BASE=0
+
+# UNWIND:      SYMBOL TABLE:
+# UNWIND-DAG:  [[#%x,FOO:]]       w  F __TEXT,__text _foo
+# UNWIND-NOT:                          __TEXT,__text _foo
+
+# UNWIND:      Contents of __unwind_info section:
+# UNWIND:        LSDA descriptors:
+# UNWIND:           [0]: function offset=0x[[#%.8x,FOO-BASE]]
+# UNWIND-NOT:       [1]:
+# UNWIND:        Second level indices:
+# UNWIND-DAG:      [0]: function offset=0x[[#%.8x,FOO-BASE]]
+# UNWIND-NOT:      [1]:
+
+## Test interaction with .alt_entry
+## FIXME: ld64 manages to strip both one copy of _foo and _bar each.
+##        We only manage this if we're lucky and the object files are in
+##        the right order. We're happy to not crash at link time for now.
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin19.0.0 %t/weak-sub-alt.s -o %t/weak-sub-alt.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin19.0.0 %t/weak-sub-alt2.s -o %t/weak-sub-alt2.o
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-sub-alt.o %t/weak-sub-alt2.o
+# RUN: %lld -dylib -o %t/out.dylib %t/weak-sub-alt2.o %t/weak-sub-alt.o
+
+#--- weak-sub.s
+.globl _foo, _bar
+.weak_definition _foo, _bar
+_foo:
+  retq
+_bar:
+  retq
+.subsections_via_symbols
+
+#--- weak-nosub.s
+.globl _foo, _bar
+.weak_definition _foo, _bar
+_foo:
+  retq
+_bar:
+  retq
+
+#--- weak-sub-lsda.s
+.section __TEXT,__text,regular,pure_instructions
+
+.globl _foo
+.weak_definition _foo
+_foo:
+  .cfi_startproc
+  .cfi_personality 155, ___gxx_personality_v0
+  .cfi_lsda 16, Lexception
+  pushq %rbp
+  .cfi_def_cfa_offset 128
+  .cfi_offset %rbp, 48
+  movq %rsp, %rbp
+  .cfi_def_cfa_register %rbp
+  popq %rbp
+  retq
+  .cfi_endproc
+
+.section __TEXT,__gcc_except_tab
+Lexception:
+    .space 0x10
+
+.subsections_via_symbols
+
+#--- weak-sub-alt.s
+.globl _foo, _bar
+.weak_definition _foo
+_foo:
+  retq
+
+# Alternative entry point to _foo (strong)
+.alt_entry _bar
+_bar:
+  retq
+
+.globl _main, _ref
+_main:
+  callq _ref
+  callq _bar
+
+.subsections_via_symbols
+
+#--- weak-sub-alt2.s
+.globl _foo, _bar
+.weak_definition _foo
+_foo:
+  retq
+
+# Alternative entry point to _foo (weak)
+.weak_definition _bar
+.alt_entry _bar
+_bar:
+  retq
+
+.globl _ref
+_ref:
+  callq _bar
+
+.subsections_via_symbols


        


More information about the llvm-commits mailing list