[PATCH] D109946: [lld-macho] Teach ICF to dedup functions with identical unwind info

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 16 20:37:35 PDT 2021


int3 created this revision.
int3 added a reviewer: lld-macho.
Herald added a reviewer: gkm.
Herald added a project: lld-macho.
int3 requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Dedup'ing unwind info is tricky because each CUE contains a different
function address, if ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.

Here are the numbers before and after D109944 <https://reviews.llvm.org/D109944>, D109945 <https://reviews.llvm.org/D109945>, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:

Without any optimizations:

             base           diff           difference (95% CI)
  sys_time   0.849 ± 0.015  0.896 ± 0.012  [  +4.8% ..   +6.2%]
  user_time  3.357 ± 0.030  3.512 ± 0.023  [  +4.3% ..   +5.0%]
  wall_time  3.944 ± 0.039  4.032 ± 0.031  [  +1.8% ..   +2.6%]
  samples    40             38

With `-dead_strip`:

             base           diff           difference (95% CI)
  sys_time   0.847 ± 0.010  0.896 ± 0.012  [  +5.2% ..   +6.5%]
  user_time  3.377 ± 0.014  3.532 ± 0.015  [  +4.4% ..   +4.8%]
  wall_time  3.962 ± 0.024  4.060 ± 0.030  [  +2.1% ..   +2.8%]
  samples    47             30

With `-dead_strip` and `--icf=all`:

             base           diff           difference (95% CI)
  sys_time   0.935 ± 0.013  0.957 ± 0.018  [  +1.5% ..   +3.2%]
  user_time  3.472 ± 0.022  6.531 ± 0.046  [ +87.6% ..  +88.7%]
  wall_time  4.080 ± 0.040  5.329 ± 0.060  [ +30.0% ..  +31.2%]
  samples    37             30

Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.

Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109946

Files:
  lld/MachO/ICF.cpp
  lld/MachO/InputFiles.cpp
  lld/MachO/UnwindInfoSection.cpp
  lld/test/MachO/icf.s
  lld/test/MachO/invalid/compact-unwind-bad-reloc.s

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109946.373132.patch
Type: text/x-patch
Size: 9730 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210917/e92c7341/attachment-0001.bin>


More information about the llvm-commits mailing list