[all-commits] [llvm/llvm-project] 8ec103: [lld-macho] Deduplicate CFStrings during ICF

Jez Ng via All-commits all-commits at lists.llvm.org
Tue Mar 8 05:34:30 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 8ec10339330011baf22551ee59f2397ff3f1f499
      https://github.com/llvm/llvm-project/commit/8ec10339330011baf22551ee59f2397ff3f1f499
  Author: Jez Ng <jezng at fb.com>
  Date:   2022-03-08 (Tue, 08 Mar 2022)

  Changed paths:
    M lld/MachO/ICF.cpp
    M lld/test/MachO/cfstring-dedup.s
    A lld/test/MachO/icf-undef.s
    M lld/test/MachO/icf.s

  Log Message:
  -----------
  [lld-macho] Deduplicate CFStrings during ICF

`__cfstring` has embedded addends that foil ICF's hashing / equality
checks. (We can ignore embedded addends when doing ICF because the same
information gets recorded in our Reloc structs.) Therefore, in order to
properly dedup CFStrings, we create a mutable copy of the CFString and
zero out the embedded addends before performing any hashing / equality
checks.
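
The normalization step described above can be sketched in isolation. This is a minimal illustration, not the code in `ICF.cpp`: the `Reloc` struct, `zeroEmbeddedAddends`, `makeCFString`, the 32-byte record layout, and the flags value are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-in for a relocation record. The addend is
// stored here, so the copy embedded in the section bytes is redundant.
struct Reloc {
  uint32_t offset; // byte offset of the relocated 8-byte field
  int64_t addend;  // addend, already extracted from the section data
};

// Return a mutable copy of `data` with the 8-byte field at each relocation
// offset zeroed, so that hashing and equality ignore the embedded addends.
std::vector<uint8_t> zeroEmbeddedAddends(const std::vector<uint8_t> &data,
                                         const std::vector<Reloc> &relocs) {
  std::vector<uint8_t> copy = data;
  for (const Reloc &r : relocs) {
    assert(r.offset + sizeof(uint64_t) <= copy.size() && "reloc out of range");
    std::memset(copy.data() + r.offset, 0, sizeof(uint64_t));
  }
  return copy;
}

// Build a 32-byte record resembling a 64-bit CFString (assumed layout: isa
// pointer, flags, data pointer, length). `embedded` is written into the
// data-pointer field at offset 16 to play the role of an embedded addend.
std::vector<uint8_t> makeCFString(uint64_t embedded, uint64_t length) {
  std::vector<uint8_t> bytes(32, 0);
  uint64_t flags = 0x7c8; // illustrative value only
  std::memcpy(bytes.data() + 8, &flags, sizeof flags);
  std::memcpy(bytes.data() + 16, &embedded, sizeof embedded);
  std::memcpy(bytes.data() + 24, &length, sizeof length);
  return bytes;
}
```

Two records built with different `embedded` values differ byte-for-byte, but compare equal once both have been passed through `zeroEmbeddedAddends`; whether their relocations refer to the same target is then a separate check.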

(We did in fact have a partial implementation of CFString deduplication
already. However, it only worked when the cstrings the CFStrings pointed
to were at identical offsets in their respective object files.)

I anticipate this approach can be extended to other similar
statically-allocated struct sections in the future.

In addition, we previously treated all references with differing addends
as unequal. That is incorrect when the references are to literals:
different addends may point to the same literal in the output binary. In
particular, `__cfstring` has such references to `__cstring`. I've
adjusted ICF's `equalsConstant` logic accordingly, and I've added a few
more tests to make sure the addend-comparison code path is adequately
covered.
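
The point about literal references can be shown with a small model. This is a sketch under stated assumptions rather than the actual `equalsConstant` logic: `OutputStrings`, `Section`, `Ref`, `makeCStringSection`, and `refsEqual` are hypothetical names, and addends are assumed to point at the start of a literal.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical shared output string table: identical strings from different
// inputs are given a single output offset.
struct OutputStrings {
  std::map<std::string, uint64_t> offsetOf;
  uint64_t size = 0;
  uint64_t intern(const std::string &s) {
    auto res = offsetOf.emplace(s, size);
    if (res.second)
      size += s.size() + 1; // account for the NUL terminator
    return res.first->second;
  }
};

// Hypothetical input section. For literal sections, each input offset maps
// to the output offset of the deduplicated literal.
struct Section {
  bool isLiteral = false;
  std::map<uint64_t, uint64_t> outputOffsetOf;
};

struct Ref {
  const Section *target;
  uint64_t addend; // offset into the target section
};

Section makeCStringSection(const std::vector<std::string> &strs,
                           OutputStrings &out) {
  Section sec;
  sec.isLiteral = true;
  uint64_t inputOffset = 0;
  for (const std::string &s : strs) {
    sec.outputOffsetOf[inputOffset] = out.intern(s);
    inputOffset += s.size() + 1;
  }
  return sec;
}

// References to literals are equal when they resolve to the same output
// literal, even if their addends differ. All other references must match
// in both target and addend.
bool refsEqual(const Ref &a, const Ref &b) {
  if (a.target->isLiteral && b.target->isLiteral)
    return a.target->outputOffsetOf.at(a.addend) ==
           b.target->outputOffsetOf.at(b.addend);
  return a.target == b.target && a.addend == b.addend;
}
```

In this model, if one object file stores `"foo"` at offset 0 and another stores it at offset 4, references with addends 0 and 4 respectively compare equal, while two references into a non-literal section with those addends do not.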

Fixes https://github.com/llvm/llvm-project/issues/51281.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D120137


  Commit: ce2ae381246df89e560c0dfd0a7fdf275f266d9e
      https://github.com/llvm/llvm-project/commit/ce2ae381246df89e560c0dfd0a7fdf275f266d9e
  Author: Jez Ng <jezng at fb.com>
  Date:   2022-03-08 (Tue, 08 Mar 2022)

  Changed paths:
    M lld/MachO/ICF.cpp
    M lld/MachO/InputFiles.cpp
    M lld/MachO/InputSection.cpp
    M lld/MachO/InputSection.h
    A lld/test/MachO/objc-classrefs-dedup.s

  Log Message:
  -----------
  [lld-macho] Deduplicate the `__objc_classrefs` section contents

ld64 splits `__objc_classrefs` into individual words and deduplicates
those words. This greatly reduces the number of bind entries emitted (and
therefore the amount of work `dyld` has to do at runtime). For
chromium_framework, this change to LLD cuts the number of (non-lazy)
binds from 912 to 190, bringing us to parity with ld64 in this respect.
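
The effect of per-word deduplication can be sketched as follows. This is an illustrative model only: each word is represented by the name of the class symbol it binds to, and `DedupedClassRefs` and `dedupClassRefs` are hypothetical names, not LLD's API.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct DedupedClassRefs {
  std::vector<std::string> slots;       // unique output words, one bind each
  std::vector<std::size_t> slotOfInput; // input word index -> output slot
};

// Collapse input words that bind to the same class symbol into one output
// slot, preserving first-seen order.
DedupedClassRefs dedupClassRefs(const std::vector<std::string> &inputTargets) {
  DedupedClassRefs result;
  std::unordered_map<std::string, std::size_t> seen;
  for (const std::string &target : inputTargets) {
    auto res = seen.emplace(target, result.slots.size());
    if (res.second)
      result.slots.push_back(target);
    result.slotOfInput.push_back(res.first->second);
  }
  return result;
}
```

With three input words of which two name the same class, the result has two slots, so only two binds would be needed instead of three.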

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D121053


Compare: https://github.com/llvm/llvm-project/compare/d0aa77440c46...ce2ae381246d

