[lld] 3a11528 - [lld-macho] Move ICF earlier to avoid emitting redundant binds

Thu Jul 1 18:22:59 PDT 2021

Author: Jez Ng
Date: 2021-07-01T21:22:38-04:00
New Revision: 3a11528d97a788781de82f939f502abe7fbd729d

URL: https://github.com/llvm/llvm-project/commit/3a11528d97a788781de82f939f502abe7fbd729d
DIFF: https://github.com/llvm/llvm-project/commit/3a11528d97a788781de82f939f502abe7fbd729d.diff

LOG: [lld-macho] Move ICF earlier to avoid emitting redundant binds

This is a pretty big refactoring diff, so here are the motivations:

Previously, ICF ran after scanRelocations(), where we emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.

However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.

In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.

Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.

One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.

The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.

I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.

This change is perf-neutral when linking chromium_framework.

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D105044

Added: 
    

Modified: 
    lld/MachO/ConcatOutputSection.cpp
    lld/MachO/ConcatOutputSection.h
    lld/MachO/Driver.cpp
    lld/MachO/ICF.cpp
    lld/MachO/ICF.h
    lld/MachO/InputSection.cpp
    lld/MachO/InputSection.h
    lld/MachO/MarkLive.cpp
    lld/MachO/SyntheticSections.cpp
    lld/MachO/SyntheticSections.h
    lld/MachO/UnwindInfoSection.cpp
    lld/MachO/UnwindInfoSection.h
    lld/MachO/Writer.cpp

Removed: 
    


################################################################################
diff  --git a/lld/MachO/ConcatOutputSection.cpp b/lld/MachO/ConcatOutputSection.cpp
index ca4ab2d2381a..1c3c055b8930 100644

--- a/lld/MachO/ConcatOutputSection.cpp
+++ b/lld/MachO/ConcatOutputSection.cpp
@@ -355,12 +355,3 @@ void ConcatOutputSection::finalizeFlags(InputSection *input) {
     break;
   }
 }
-
-void ConcatOutputSection::eraseOmittedInputSections() {
-  // Remove the duplicates from inputs
-  inputs.erase(std::remove_if(inputs.begin(), inputs.end(),
-                              [](const ConcatInputSection *isec) -> bool {
-                                return isec->shouldOmitFromOutput();
-                              }),
-               inputs.end());
-}

diff  --git a/lld/MachO/ConcatOutputSection.h b/lld/MachO/ConcatOutputSection.h
index f8332f6410d4..531983cd9c13 100644
--- a/lld/MachO/ConcatOutputSection.h
+++ b/lld/MachO/ConcatOutputSection.h
@@ -40,7 +40,6 @@ class ConcatOutputSection final : public OutputSection {
   void finalize() override;
   bool needsThunks() const;
   uint64_t estimateStubsInRangeVA(size_t callIdx) const;
-  void eraseOmittedInputSections();
 
   void writeTo(uint8_t *buf) const override;
 

diff  --git a/lld/MachO/Driver.cpp b/lld/MachO/Driver.cpp
index 23b505d7b947..a64d34ba5689 100644
--- a/lld/MachO/Driver.cpp
+++ b/lld/MachO/Driver.cpp
@@ -8,6 +8,7 @@
 
 #include "Driver.h"
 #include "Config.h"
+#include "ICF.h"
 #include "InputFiles.h"
 #include "LTO.h"
 #include "MarkLive.h"
@@ -18,6 +19,7 @@
 #include "Symbols.h"
 #include "SyntheticSections.h"
 #include "Target.h"
+#include "UnwindInfoSection.h"
 #include "Writer.h"
 
 #include "lld/Common/Args.h"
@@ -983,6 +985,48 @@ void createFiles(const InputArgList &args) {
   }
 }
 
+static void gatherInputSections() {
+  TimeTraceScope timeScope("Gathering input sections");
+  int inputOrder = 0;
+  for (const InputFile *file : inputFiles) {
+    for (const SubsectionMap &map : file->subsections) {
+      for (const SubsectionEntry &entry : map) {
+        if (auto *isec = dyn_cast<ConcatInputSection>(entry.isec)) {
+          if (isec->isCoalescedWeak())
+            continue;
+          if (isec->segname == segment_names::ld) {
+            assert(isec->name == section_names::compactUnwind);
+            in.unwindInfo->addInput(isec);
+            continue;
+          }
+          isec->outSecOff = inputOrder++;
+          inputSections.push_back(isec);
+        } else if (auto *isec = dyn_cast<CStringInputSection>(entry.isec)) {
+          if (in.cStringSection->inputOrder == UnspecifiedInputOrder)
+            in.cStringSection->inputOrder = inputOrder++;
+          in.cStringSection->addInput(isec);
+        } else if (auto *isec = dyn_cast<WordLiteralInputSection>(entry.isec)) {
+          if (in.wordLiteralSection->inputOrder == UnspecifiedInputOrder)
+            in.wordLiteralSection->inputOrder = inputOrder++;
+          in.wordLiteralSection->addInput(isec);
+        } else {
+          llvm_unreachable("unexpected input section kind");
+        }
+      }
+    }
+  }
+  assert(inputOrder <= UnspecifiedInputOrder);
+}
+
+static void foldIdenticalLiterals() {
+  // We always create a cStringSection, regardless of whether dedupLiterals is
+  // true. If it isn't, we simply create a non-deduplicating CStringSection.
+  // Either way, we must unconditionally finalize it here.
+  in.cStringSection->finalizeContents();
+  if (in.wordLiteralSection)
+    in.wordLiteralSection->finalizeContents();
+}
+
 bool macho::link(ArrayRef<const char *> argsArr, bool canExitEarly,
                  raw_ostream &stdoutOS, raw_ostream &stderrOS) {
   lld::stdoutOS = &stdoutOS;
@@ -1344,25 +1388,17 @@ bool macho::link(ArrayRef<const char *> argsArr, bool canExitEarly,
         inputFiles.insert(make<OpaqueFile>(*buffer, segName, sectName));
     }
 
-    {
-      TimeTraceScope timeScope("Gathering input sections");
-      // Gather all InputSections into one vector.
-      for (const InputFile *file : inputFiles) {
-        for (const SubsectionMap &map : file->subsections) {
-          for (const SubsectionEntry &entry : map) {
-            if (auto concatIsec = dyn_cast<ConcatInputSection>(entry.isec))
-              if (concatIsec->isCoalescedWeak())
-                continue;
-            inputSections.push_back(entry.isec);
-          }
-        }
-      }
-      assert(inputSections.size() < UnspecifiedInputOrder);
-    }
+    gatherInputSections();
 
     if (config->deadStrip)
       markLive();
 
+    // ICF assumes that all literals have been folded already, so we must run
+    // foldIdenticalLiterals before foldIdenticalSections.
+    foldIdenticalLiterals();
+    if (config->icfLevel != ICFLevel::none)
+      foldIdenticalSections();
+
     // Write to an output file.
     if (target->wordSize == 8)
       writeResult<LP64>();

diff  --git a/lld/MachO/ICF.cpp b/lld/MachO/ICF.cpp
index 4ff8c578d56c..c1b8325d2a6c 100644
--- a/lld/MachO/ICF.cpp
+++ b/lld/MachO/ICF.cpp
@@ -10,7 +10,10 @@
 #include "ConcatOutputSection.h"
 #include "InputSection.h"
 #include "Symbols.h"
+#include "UnwindInfoSection.h"
+
 #include "llvm/Support/Parallel.h"
+#include "llvm/Support/TimeProfiler.h"
 
 #include <atomic>
 
@@ -18,6 +21,25 @@ using namespace llvm;
 using namespace lld;
 using namespace lld::macho;
 
+class ICF {
+public:
+  ICF(std::vector<ConcatInputSection *> &inputs);
+
+  void run();
+  void segregate(size_t begin, size_t end,
+                 std::function<bool(const ConcatInputSection *,
+                                    const ConcatInputSection *)>
+                     equals);
+  size_t findBoundary(size_t begin, size_t end);
+  void forEachClassRange(size_t begin, size_t end,
+                         std::function<void(size_t, size_t)> func);
+  void forEachClass(std::function<void(size_t, size_t)> func);
+
+  // ICF needs a copy of the inputs vector because its equivalence-class
+  // segregation algorithm destroys the proper sequence.
+  std::vector<ConcatInputSection *> icfInputs;
+};
+
 ICF::ICF(std::vector<ConcatInputSection *> &inputs) {
   icfInputs.assign(inputs.begin(), inputs.end());
 }
@@ -276,3 +298,59 @@ void ICF::segregate(
     begin = mid;
   }
 }
+
+template <class Ptr>
+DenseSet<const InputSection *> findFunctionsWithUnwindInfo() {
+  DenseSet<const InputSection *> result;
+  for (ConcatInputSection *isec : in.unwindInfo->getInputs()) {
+    for (size_t i = 0; i < isec->relocs.size(); ++i) {
+      Reloc &r = isec->relocs[i];
+      assert(target->hasAttr(r.type, RelocAttrBits::UNSIGNED));
+      if (r.offset % sizeof(CompactUnwindEntry<Ptr>) !=
+          offsetof(CompactUnwindEntry<Ptr>, functionAddress))
+        continue;
+      result.insert(r.referent.get<InputSection *>());
+    }
+  }
+  return result;
+}
+
+void macho::foldIdenticalSections() {
+  TimeTraceScope timeScope("Fold Identical Code Sections");
+  // The ICF equivalence-class segregation algorithm relies on pre-computed
+  // hashes of InputSection::data for the ConcatOutputSection::inputs and all
+  // sections referenced by their relocs. We could recursively traverse the
+  // relocs to find every referenced InputSection, but that precludes easy
+  // parallelization. Therefore, we hash every InputSection here where we have
+  // them all accessible as simple vectors.
+  std::vector<ConcatInputSection *> codeSections;
+
+  // ICF can't fold functions with unwind info
+  DenseSet<const InputSection *> functionsWithUnwindInfo =
+      target->wordSize == 8 ? findFunctionsWithUnwindInfo<uint64_t>()
+                            : findFunctionsWithUnwindInfo<uint32_t>();
+
+  // If an InputSection is ineligible for ICF, we give it a unique ID to force
+  // it into an unfoldable singleton equivalence class.  Begin the unique-ID
+  // space at inputSections.size(), so that it will never intersect with
+  // equivalence-class IDs which begin at 0. Since hashes & unique IDs never
+  // coexist with equivalence-class IDs, this is not necessary, but might help
+  // someone keep the numbers straight in case we ever need to debug the
+  // ICF::segregate()
+  uint64_t icfUniqueID = inputSections.size();
+  for (ConcatInputSection *isec : inputSections) {
+    bool isHashable = isCodeSection(isec) && !isec->shouldOmitFromOutput() &&
+                      !functionsWithUnwindInfo.contains(isec) &&
+                      isec->isHashableForICF();
+    if (isHashable) {
+      codeSections.push_back(isec);
+    } else {
+      isec->icfEqClass[0] = ++icfUniqueID;
+    }
+  }
+  parallelForEach(codeSections,
+                  [](ConcatInputSection *isec) { isec->hashForICF(); });
+  // Now that every input section is either hashed or marked as unique, run the
+  // segregation algorithm to detect foldable subsections.
+  ICF(codeSections).run();
+}

diff  --git a/lld/MachO/ICF.h b/lld/MachO/ICF.h
index 767630f0d7eb..9500a946601e 100644
--- a/lld/MachO/ICF.h
+++ b/lld/MachO/ICF.h
@@ -15,26 +15,7 @@
 namespace lld {
 namespace macho {
 
-class ConcatInputSection;
-
-class ICF {
-public:
-  ICF(std::vector<ConcatInputSection *> &inputs);
-
-  void run();
-  void segregate(size_t begin, size_t end,
-                 std::function<bool(const ConcatInputSection *,
-                                    const ConcatInputSection *)>
-                     equals);
-  size_t findBoundary(size_t begin, size_t end);
-  void forEachClassRange(size_t begin, size_t end,
-                         std::function<void(size_t, size_t)> func);
-  void forEachClass(std::function<void(size_t, size_t)> func);
-
-  // ICF needs a copy of the inputs vector because its equivalence-class
-  // segregation algorithm destroys the proper sequence.
-  std::vector<ConcatInputSection *> icfInputs;
-};
+void foldIdenticalSections();
 
 } // namespace macho
 } // namespace lld

diff  --git a/lld/MachO/InputSection.cpp b/lld/MachO/InputSection.cpp
index 740eea6d8fd4..4ad790377676 100644
--- a/lld/MachO/InputSection.cpp
+++ b/lld/MachO/InputSection.cpp
@@ -25,7 +25,7 @@ using namespace llvm::support;
 using namespace lld;
 using namespace lld::macho;
 
-std::vector<InputSection *> macho::inputSections;
+std::vector<ConcatInputSection *> macho::inputSections;
 
 uint64_t InputSection::getFileSize() const {
   return isZeroFill(flags) ? 0 : getSize();
@@ -48,16 +48,10 @@ static uint64_t resolveSymbolVA(const Symbol *sym, uint8_t type) {
 
 // ICF needs to hash any section that might potentially be duplicated so
 // that it can match on content rather than identity.
-bool ConcatInputSection::isHashableForICF(bool isText) const {
-  if (shouldOmitFromOutput())
-    return false;
+bool ConcatInputSection::isHashableForICF() const {
   switch (sectionType(flags)) {
   case S_REGULAR:
-    if (isText)
-      return !hasPersonality;
-    // One might hope that we could hash __TEXT,__const subsections to fold
-    // references to duplicated values, but alas, many tests fail.
-    return false;
+    return true;
   case S_CSTRING_LITERALS:
   case S_4BYTE_LITERALS:
   case S_8BYTE_LITERALS:

diff  --git a/lld/MachO/InputSection.h b/lld/MachO/InputSection.h
index 9eea39105f14..efa175e0bfc7 100644
--- a/lld/MachO/InputSection.h
+++ b/lld/MachO/InputSection.h
@@ -95,7 +95,7 @@ class ConcatInputSection final : public InputSection {
   void markLive(uint64_t off) override { live = true; }
   bool isCoalescedWeak() const { return wasCoalesced && numRefs == 0; }
   bool shouldOmitFromOutput() const { return !live || isCoalescedWeak(); }
-  bool isHashableForICF(bool isText) const;
+  bool isHashableForICF() const;
   void hashForICF();
   void writeTo(uint8_t *buf);
 
@@ -108,8 +108,6 @@ class ConcatInputSection final : public InputSection {
     return isec->kind() == ConcatKind;
   }
 
-  // ICF can't fold functions with LSDA+personality
-  bool hasPersonality = false;
   // Points to the surviving section after this one is folded by ICF
   InputSection *replacement = nullptr;
   // Equivalence-class ID for ICF
@@ -124,6 +122,9 @@ class ConcatInputSection final : public InputSection {
   bool live = !config->deadStrip;
   // How many symbols refer to this InputSection.
   uint32_t numRefs = 0;
+  // This variable has two usages. Initially, it represents the input order.
+  // After assignAddresses is called, it represents the offset from the
+  // beginning of the output section this section was assigned to.
   uint64_t outSecOff = 0;
 };
 
@@ -256,7 +257,7 @@ inline bool isWordLiteralSection(uint32_t flags) {
 
 bool isCodeSection(const InputSection *);
 
-extern std::vector<InputSection *> inputSections;
+extern std::vector<ConcatInputSection *> inputSections;
 
 namespace section_names {
 

diff  --git a/lld/MachO/MarkLive.cpp b/lld/MachO/MarkLive.cpp
index a63f57d6d0f7..7962ff9b094a 100644
--- a/lld/MachO/MarkLive.cpp
+++ b/lld/MachO/MarkLive.cpp
@@ -101,10 +101,9 @@ void markLive() {
   if (auto *stubBinder =
           dyn_cast_or_null<DylibSymbol>(symtab->find("dyld_stub_binder")))
     addSym(stubBinder);
-  for (InputSection *isec : inputSections) {
+  for (ConcatInputSection *isec : inputSections) {
     // Sections marked no_dead_strip
     if (isec->flags & S_ATTR_NO_DEAD_STRIP) {
-      assert(isa<ConcatInputSection>(isec));
       enqueue(isec, 0);
       continue;
     }
@@ -112,37 +111,33 @@ void markLive() {
     // mod_init_funcs, mod_term_funcs sections
     if (sectionType(isec->flags) == S_MOD_INIT_FUNC_POINTERS ||
         sectionType(isec->flags) == S_MOD_TERM_FUNC_POINTERS) {
-      assert(isa<ConcatInputSection>(isec));
       enqueue(isec, 0);
       continue;
     }
+  }
 
-    // Dead strip runs before UnwindInfoSection handling so we need to keep
-    // __LD,__compact_unwind alive here.
-    // But that section contains absolute references to __TEXT,__text and
-    // keeps most code alive due to that. So we can't just enqueue() the
-    // section: We must skip the relocations for the functionAddress
-    // in each CompactUnwindEntry.
-    // See also scanEhFrameSection() in lld/ELF/MarkLive.cpp.
-    if (isec->segname == segment_names::ld &&
-        isec->name == section_names::compactUnwind) {
-      auto concatIsec = cast<ConcatInputSection>(isec);
-      concatIsec->live = true;
-      const int compactUnwindEntrySize =
-          target->wordSize == 8 ? sizeof(CompactUnwindEntry<uint64_t>)
-                                : sizeof(CompactUnwindEntry<uint32_t>);
-      for (const Reloc &r : isec->relocs) {
-        // This is the relocation for the address of the function itself.
-        // Ignore it, else these would keep everything alive.
-        if (r.offset % compactUnwindEntrySize == 0)
-          continue;
+  // Dead strip runs before UnwindInfoSection handling so we need to keep
+  // __LD,__compact_unwind alive here.
+  // But that section contains absolute references to __TEXT,__text and
+  // keeps most code alive due to that. So we can't just enqueue() the
+  // section: We must skip the relocations for the functionAddress
+  // in each CompactUnwindEntry.
+  // See also scanEhFrameSection() in lld/ELF/MarkLive.cpp.
+  for (ConcatInputSection *isec : in.unwindInfo->getInputs()) {
+    isec->live = true;
+    const int compactUnwindEntrySize =
+        target->wordSize == 8 ? sizeof(CompactUnwindEntry<uint64_t>)
+                              : sizeof(CompactUnwindEntry<uint32_t>);
+    for (const Reloc &r : isec->relocs) {
+      // This is the relocation for the address of the function itself.
+      // Ignore it, else these would keep everything alive.
+      if (r.offset % compactUnwindEntrySize == 0)
+        continue;
 
-        if (auto *s = r.referent.dyn_cast<Symbol *>())
-          addSym(s);
-        else
-          enqueue(r.referent.get<InputSection *>(), r.addend);
-      }
-      continue;
+      if (auto *s = r.referent.dyn_cast<Symbol *>())
+        addSym(s);
+      else
+        enqueue(r.referent.get<InputSection *>(), r.addend);
     }
   }
 
@@ -163,13 +158,10 @@ void markLive() {
 
     // S_ATTR_LIVE_SUPPORT sections are live if they point _to_ a live section.
     // Process them in a second pass.
-    for (InputSection *isec : inputSections) {
-      if (!isa<ConcatInputSection>(isec))
-        continue;
-      auto concatIsec = cast<ConcatInputSection>(isec);
+    for (ConcatInputSection *isec : inputSections) {
       // FIXME: Check if copying all S_ATTR_LIVE_SUPPORT sections into a
       // separate vector and only walking that here is faster.
-      if (!(concatIsec->flags & S_ATTR_LIVE_SUPPORT) || concatIsec->live)
+      if (!(isec->flags & S_ATTR_LIVE_SUPPORT) || isec->live)
         continue;
 
       for (const Reloc &r : isec->relocs) {

diff  --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
index 03976ff2d688..07406650dfd8 100644
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -775,7 +775,7 @@ void SymtabSection::emitStabs() {
     }
 
     StabsEntry symStab;
-    symStab.sect = defined->isec->parent->index;
+    symStab.sect = defined->isec->canonical()->parent->index;
     symStab.strx = stringTableSection.addString(defined->getName());
     symStab.value = defined->getVA();
 
@@ -900,7 +900,7 @@ template <class LP> void SymtabSectionImpl<LP>::writeTo(uint8_t *buf) const {
         nList->n_value = defined->value;
       } else {
         nList->n_type = scope | N_SECT;
-        nList->n_sect = defined->isec->parent->index;
+        nList->n_sect = defined->isec->canonical()->parent->index;
         // For the N_SECT symbol type, n_value is the address of the symbol
         nList->n_value = defined->getVA();
       }
@@ -1255,40 +1255,46 @@ WordLiteralSection::WordLiteralSection()
 
 void WordLiteralSection::addInput(WordLiteralInputSection *isec) {
   isec->parent = this;
-  // We do all processing of the InputSection here, so it will be effectively
-  // finalized.
-  isec->isFinal = true;
-  const uint8_t *buf = isec->data.data();
-  switch (sectionType(isec->flags)) {
-  case S_4BYTE_LITERALS: {
-    for (size_t off = 0, e = isec->data.size(); off < e; off += 4) {
-      if (!isec->isLive(off))
-        continue;
-      uint32_t value = *reinterpret_cast<const uint32_t *>(buf + off);
-      literal4Map.emplace(value, literal4Map.size());
+  inputs.push_back(isec);
+}
+
+void WordLiteralSection::finalizeContents() {
+  for (WordLiteralInputSection *isec : inputs) {
+    // We do all processing of the InputSection here, so it will be effectively
+    // finalized.
+    isec->isFinal = true;
+    const uint8_t *buf = isec->data.data();
+    switch (sectionType(isec->flags)) {
+    case S_4BYTE_LITERALS: {
+      for (size_t off = 0, e = isec->data.size(); off < e; off += 4) {
+        if (!isec->isLive(off))
+          continue;
+        uint32_t value = *reinterpret_cast<const uint32_t *>(buf + off);
+        literal4Map.emplace(value, literal4Map.size());
+      }
+      break;
     }
-    break;
-  }
-  case S_8BYTE_LITERALS: {
-    for (size_t off = 0, e = isec->data.size(); off < e; off += 8) {
-      if (!isec->isLive(off))
-        continue;
-      uint64_t value = *reinterpret_cast<const uint64_t *>(buf + off);
-      literal8Map.emplace(value, literal8Map.size());
+    case S_8BYTE_LITERALS: {
+      for (size_t off = 0, e = isec->data.size(); off < e; off += 8) {
+        if (!isec->isLive(off))
+          continue;
+        uint64_t value = *reinterpret_cast<const uint64_t *>(buf + off);
+        literal8Map.emplace(value, literal8Map.size());
+      }
+      break;
     }
-    break;
-  }
-  case S_16BYTE_LITERALS: {
-    for (size_t off = 0, e = isec->data.size(); off < e; off += 16) {
-      if (!isec->isLive(off))
-        continue;
-      UInt128 value = *reinterpret_cast<const UInt128 *>(buf + off);
-      literal16Map.emplace(value, literal16Map.size());
+    case S_16BYTE_LITERALS: {
+      for (size_t off = 0, e = isec->data.size(); off < e; off += 16) {
+        if (!isec->isLive(off))
+          continue;
+        UInt128 value = *reinterpret_cast<const UInt128 *>(buf + off);
+        literal16Map.emplace(value, literal16Map.size());
+      }
+      break;
+    }
+    default:
+      llvm_unreachable("invalid literal section type");
     }
-    break;
-  }
-  default:
-    llvm_unreachable("invalid literal section type");
   }
 }
 

diff  --git a/lld/MachO/SyntheticSections.h b/lld/MachO/SyntheticSections.h
index 73d4c3153438..cf464b866d33 100644
--- a/lld/MachO/SyntheticSections.h
+++ b/lld/MachO/SyntheticSections.h
@@ -557,6 +557,7 @@ class WordLiteralSection final : public SyntheticSection {
 
   WordLiteralSection();
   void addInput(WordLiteralInputSection *);
+  void finalizeContents();
   void writeTo(uint8_t *buf) const override;
 
   uint64_t getSize() const override {
@@ -584,6 +585,8 @@ class WordLiteralSection final : public SyntheticSection {
   }
 
 private:
+  std::vector<WordLiteralInputSection *> inputs;
+
   template <class T> struct Hasher {
     llvm::hash_code operator()(T v) const { return llvm::hash_value(v); }
   };

diff  --git a/lld/MachO/UnwindInfoSection.cpp b/lld/MachO/UnwindInfoSection.cpp
index f4bd08d2f3cb..567e18a607e2 100644
--- a/lld/MachO/UnwindInfoSection.cpp
+++ b/lld/MachO/UnwindInfoSection.cpp
@@ -103,9 +103,11 @@ struct SecondLevelPage {
   EncodingMap localEncodingIndexes;
 };
 
-template <class Ptr> class UnwindInfoSectionImpl : public UnwindInfoSection {
+template <class Ptr>
+class UnwindInfoSectionImpl final : public UnwindInfoSection {
 public:
   void prepareRelocations(ConcatInputSection *) override;
+  void addInput(ConcatInputSection *) override;
   void finalize() override;
   void writeTo(uint8_t *buf) const override;
 
@@ -126,6 +128,25 @@ template <class Ptr> class UnwindInfoSectionImpl : public UnwindInfoSection {
   uint64_t level2PagesOffset = 0;
 };
 
+UnwindInfoSection::UnwindInfoSection()
+    : SyntheticSection(segment_names::text, section_names::unwindInfo) {
+  align = 4;
+  compactUnwindSection =
+      make<ConcatOutputSection>(section_names::compactUnwind);
+}
+
+void UnwindInfoSection::prepareRelocations() {
+  for (ConcatInputSection *isec : compactUnwindSection->inputs)
+    prepareRelocations(isec);
+}
+
+template <class Ptr>
+void UnwindInfoSectionImpl<Ptr>::addInput(ConcatInputSection *isec) {
+  assert(isec->segname == segment_names::ld &&
+         isec->name == section_names::compactUnwind);
+  compactUnwindSection->addInput(isec);
+}
+
 // Compact unwind relocations have 
diff erent semantics, so we handle them in a
 // separate code path from regular relocations. First, we do not wish to add
 // rebase opcodes for __LD,__compact_unwind, because that section doesn't
@@ -133,8 +154,6 @@ template <class Ptr> class UnwindInfoSectionImpl : public UnwindInfoSection {
 // reside in the GOT and must be treated specially.
 template <class Ptr>
 void UnwindInfoSectionImpl<Ptr>::prepareRelocations(ConcatInputSection *isec) {
-  assert(isec->segname == segment_names::ld &&
-         isec->name == section_names::compactUnwind);
   assert(!isec->shouldOmitFromOutput() &&
          "__compact_unwind section should not be omitted");
 
@@ -150,13 +169,6 @@ void UnwindInfoSectionImpl<Ptr>::prepareRelocations(ConcatInputSection *isec) {
         offsetof(CompactUnwindEntry<Ptr>, personality))
       continue;
 
-    Reloc &rFunc = isec->relocs[++i];
-    assert(r.offset ==
-           rFunc.offset + offsetof(CompactUnwindEntry<Ptr>, personality));
-    auto *referentIsec =
-        cast<ConcatInputSection>(rFunc.referent.get<InputSection *>());
-    referentIsec->hasPersonality = true;
-
     if (auto *s = r.referent.dyn_cast<Symbol *>()) {
       if (auto *undefined = dyn_cast<Undefined>(s)) {
         treatUndefinedSymbol(*undefined);

diff  --git a/lld/MachO/UnwindInfoSection.h b/lld/MachO/UnwindInfoSection.h
index 7ccf7a4dfde7..11503aaed0da 100644
--- a/lld/MachO/UnwindInfoSection.h
+++ b/lld/MachO/UnwindInfoSection.h
@@ -27,21 +27,21 @@ template <class Ptr> struct CompactUnwindEntry {
 
 class UnwindInfoSection : public SyntheticSection {
 public:
-  bool isNeeded() const override { return compactUnwindSection != nullptr; }
+  bool isNeeded() const override {
+    return !compactUnwindSection->inputs.empty();
+  }
   uint64_t getSize() const override { return unwindInfoSize; }
-  virtual void prepareRelocations(ConcatInputSection *) = 0;
-
-  void setCompactUnwindSection(ConcatOutputSection *cuSection) {
-    compactUnwindSection = cuSection;
+  virtual void addInput(ConcatInputSection *) = 0;
+  std::vector<ConcatInputSection *> getInputs() {
+    return compactUnwindSection->inputs;
   }
+  void prepareRelocations();
 
 protected:
-  UnwindInfoSection()
-      : SyntheticSection(segment_names::text, section_names::unwindInfo) {
-    align = 4;
-  }
+  UnwindInfoSection();
+  virtual void prepareRelocations(ConcatInputSection *) = 0;
 
-  ConcatOutputSection *compactUnwindSection = nullptr;
+  ConcatOutputSection *compactUnwindSection;
   uint64_t unwindInfoSize = 0;
 };
 

diff  --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
index 443cd99b73f4..7e3dc70cd18d 100644
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -9,7 +9,6 @@
 #include "Writer.h"
 #include "ConcatOutputSection.h"
 #include "Config.h"
-#include "ICF.h"
 #include "InputFiles.h"
 #include "InputSection.h"
 #include "MapFile.h"
@@ -52,8 +51,6 @@ class Writer {
   void scanSymbols();
   template <class LP> void createOutputSections();
   template <class LP> void createLoadCommands();
-  void foldIdenticalLiterals();
-  void foldIdenticalSections();
   void finalizeAddresses();
   void finalizeLinkEditSegment();
   void assignAddresses(OutputSegment *);
@@ -592,18 +589,9 @@ static void prepareSymbolRelocation(Symbol *sym, const InputSection *isec,
 
 void Writer::scanRelocations() {
   TimeTraceScope timeScope("Scan relocations");
-  for (InputSection *isec : inputSections) {
-    if (!isa<ConcatInputSection>(isec))
+  for (ConcatInputSection *isec : inputSections) {
+    if (isec->shouldOmitFromOutput())
       continue;
-    auto concatIsec = cast<ConcatInputSection>(isec);
-
-    if (concatIsec->shouldOmitFromOutput())
-      continue;
-
-    if (concatIsec->segname == segment_names::ld) {
-      in.unwindInfo->prepareRelocations(concatIsec);
-      continue;
-    }
 
     for (auto it = isec->relocs.begin(); it != isec->relocs.end(); ++it) {
       Reloc &r = *it;
@@ -621,12 +609,18 @@ void Writer::scanRelocations() {
         if (!isa<Undefined>(sym) && validateSymbolRelocation(sym, isec, r))
           prepareSymbolRelocation(sym, isec, r);
       } else {
-        assert(r.referent.is<InputSection *>());
+        // Canonicalize the referent so that later accesses in Writer won't
+        // have to worry about it. Perhaps we should do this for Defined::isec
+        // too...
+        auto *referentIsec = r.referent.get<InputSection *>();
+        r.referent = referentIsec->canonical();
         if (!r.pcrel)
           in.rebase->addEntry(isec, r.offset);
       }
     }
   }
+
+  in.unwindInfo->prepareRelocations();
 }
 
 void Writer::scanSymbols() {
@@ -892,28 +886,16 @@ template <class LP> void Writer::createOutputSections() {
   }
 
   // Then add input sections to output sections.
-  for (const auto &p : enumerate(inputSections)) {
-    InputSection *isec = p.value();
-    OutputSection *osec;
-    if (auto *concatIsec = dyn_cast<ConcatInputSection>(isec)) {
-      if (concatIsec->shouldOmitFromOutput())
-        continue;
-      NamePair names = maybeRenameSection({isec->segname, isec->name});
-      ConcatOutputSection *&concatOsec = concatOutputSections[names];
-      if (concatOsec == nullptr)
-        concatOsec = make<ConcatOutputSection>(names.second);
-      concatOsec->addInput(concatIsec);
-      osec = concatOsec;
-    } else if (auto *cStringIsec = dyn_cast<CStringInputSection>(isec)) {
-      in.cStringSection->addInput(cStringIsec);
-      osec = in.cStringSection;
-    } else if (auto *litIsec = dyn_cast<WordLiteralInputSection>(isec)) {
-      in.wordLiteralSection->addInput(litIsec);
-      osec = in.wordLiteralSection;
-    } else {
-      llvm_unreachable("unhandled InputSection type");
-    }
-    osec->inputOrder = std::min(osec->inputOrder, static_cast<int>(p.index()));
+  for (ConcatInputSection *isec : inputSections) {
+    if (isec->shouldOmitFromOutput())
+      continue;
+    NamePair names = maybeRenameSection({isec->segname, isec->name});
+    ConcatOutputSection *&osec = concatOutputSections[names];
+    if (!osec)
+      osec = make<ConcatOutputSection>(names.second);
+    osec->addInput(isec);
+    osec->inputOrder =
+        std::min(osec->inputOrder, static_cast<int>(isec->outSecOff));
   }
 
   // Once all the inputs are added, we can finalize the output section
@@ -921,12 +903,8 @@ template <class LP> void Writer::createOutputSections() {
   for (const auto &it : concatOutputSections) {
     StringRef segname = it.first.first;
     ConcatOutputSection *osec = it.second;
-    if (segname == segment_names::ld) {
-      assert(osec->name == section_names::compactUnwind);
-      in.unwindInfo->setCompactUnwindSection(osec);
-    } else {
-      getOrCreateOutputSegment(segname)->addOutputSection(osec);
-    }
+    assert(segname != segment_names::ld);
+    getOrCreateOutputSegment(segname)->addOutputSection(osec);
   }
 
   for (SyntheticSection *ssec : syntheticSections) {
@@ -946,57 +924,6 @@ template <class LP> void Writer::createOutputSections() {
   linkEditSegment = getOrCreateOutputSegment(segment_names::linkEdit);
 }
 
-void Writer::foldIdenticalLiterals() {
-  if (in.cStringSection)
-    in.cStringSection->finalizeContents();
-  // TODO: WordLiteralSection & CFStringSection should be finalized here too
-}
-
-void Writer::foldIdenticalSections() {
-  if (config->icfLevel == ICFLevel::none)
-    return;
-  ConcatOutputSection *textOutputSection = concatOutputSections.lookup(
-      maybeRenameSection({segment_names::text, section_names::text}));
-  if (textOutputSection == nullptr)
-    return;
-
-  TimeTraceScope timeScope("Fold Identical Code Sections");
-  // The ICF equivalence-class segregation algorithm relies on pre-computed
-  // hashes of InputSection::data for the ConcatOutputSection::inputs and all
-  // sections referenced by their relocs. We could recursively traverse the
-  // relocs to find every referenced InputSection, but that precludes easy
-  // parallelization. Therefore, we hash every InputSection here where we have
-  // them all accessible as a simple vector.
-  std::vector<ConcatInputSection *> hashable;
-  // If an InputSection is ineligible for ICF, we give it a unique ID to force
-  // it into an unfoldable singleton equivalence class.  Begin the unique-ID
-  // space at inputSections.size(), so that it will never intersect with
-  // equivalence-class IDs which begin at 0. Since hashes & unique IDs never
-  // coexist with equivalence-class IDs, this is not necessary, but might help
-  // someone keep the numbers straight in case we ever need to debug the
-  // ICF::segregate()
-  uint64_t icfUniqueID = inputSections.size();
-  for (InputSection *isec : inputSections) {
-    if (auto *concatIsec = dyn_cast<ConcatInputSection>(isec)) {
-      if (concatIsec->isHashableForICF(isec->parent == textOutputSection))
-        hashable.push_back(concatIsec);
-      else
-        concatIsec->icfEqClass[0] = ++icfUniqueID;
-    }
-  }
-  // FIXME: hash literal sections here too?
-  parallelForEach(hashable,
-                  [](ConcatInputSection *isec) { isec->hashForICF(); });
-  // Now that every input section is either hashed or marked as unique,
-  // run the segregation algorithm to detect foldable subsections
-  ICF(textOutputSection->inputs).run();
-  size_t oldSize = textOutputSection->inputs.size();
-  textOutputSection->eraseOmittedInputSections();
-  size_t newSize = textOutputSection->inputs.size();
-  log("ICF kept " + Twine(newSize) + " removed " + Twine(oldSize - newSize) +
-      " of " + Twine(oldSize));
-}
-
 void Writer::finalizeAddresses() {
   TimeTraceScope timeScope("Finalize addresses");
   uint64_t pageSize = target->getPageSize();
@@ -1128,10 +1055,6 @@ template <class LP> void Writer::run() {
     in.stubHelper->setup();
   scanSymbols();
   createOutputSections<LP>();
-  // ICF assumes that all literals have been folded already, so we must run
-  // foldIdenticalLiterals before foldIdenticalSections.
-  foldIdenticalLiterals();
-  foldIdenticalSections();
   // After this point, we create no new segments; HOWEVER, we might
   // yet create branch-range extension thunks for architectures whose
   // hardware call instructions have limited range, e.g., ARM(64).