[PATCH] D102964: [lld-macho] Implement cstring deduplication

Nico Weber via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 3 12:00:38 PDT 2021


thakis added a comment.

A random review might not be the best place for this, but:

1. It's great we're looking at size of the output binary!

2. Maybe it makes sense to look for lower-hanging fruit before implementing somewhat expensive things?

Here's a bloaty (https://github.com/google/bloaty) diff between ld64-linked Chromium Framework (after `--`, that's where the old version goes) and lld-linked Chromium Framework:

  % ~/src/bloaty/bloaty 'Chromium Framework' --  'Chromium.app/Contents/Frameworks/Chromium Framework.framework/Versions/Current/Chromium Framework'
      FILE SIZE        VM SIZE
   --------------  --------------
    [NEW] +6.98Mi  [NEW] +6.98Mi    __DATA_CONST,__const
   +36e2% +6.42Mi +36e2% +6.42Mi    Rebase Info
     +62% +3.19Mi   +62% +3.19Mi    __TEXT,__cstring
    [NEW] +1.52Mi  [NEW] +1.52Mi    __TEXT,__literal16
    +139% +1.11Mi  +139% +1.11Mi    Function Start Addresses
    +1.5%  +940Ki  +1.5%  +940Ki    String Table
   +38e2%  +530Ki +38e2%  +530Ki    __TEXT,__eh_frame
     +27%  +124Ki   +27%  +124Ki    __TEXT,__objc_methtype
    +102%  +118Ki  +102%  +118Ki    __TEXT,__objc_methname
    +1.1%  +114Ki  +1.1%  +114Ki    Symbol Table
    [NEW] +88.0Ki  [NEW] +88.0Ki    __TEXT,__literal8
    +263% +80.9Ki  +263% +80.9Ki    Binding Info
    [NEW] +62.3Ki  [NEW] +62.3Ki    __TEXT,__literal4
    [NEW] +32.4Ki  [NEW] +32.4Ki    __DATA_CONST,__cfstring
    +136% +28.2Ki  +136% +28.2Ki    __DATA,__objc_selrefs
    +0.0% +8.38Ki  -0.0% -6.59Ki    [31 Others]
    [DEL] -26.8Ki  [DEL] -26.8Ki    __DATA,__cfstring
    [DEL] -53.4Ki  [DEL] -53.4Ki    Table of Non-instructions
   -30.7% -57.9Ki -30.7% -57.9Ki    __DATA,__objc_const
    -1.0% -83.3Ki  -1.0% -83.3Ki    __TEXT,__const
    [DEL] -6.98Mi  [DEL] -6.98Mi    __DATA,__const
    +6.0% +14.1Mi  +5.9% +14.1Mi    TOTAL

`__cstring` is indeed on the list, but there are other things before it. The _DATA_CONST looks like it's just in __DATA in ld64 and just moved around (see 2nd-to-last line), but our rebase info and LC_FUNCTION_STARTS sections are way larger and possibly easier to fix (LC_FUNCTION_STARTS is 2003880 vs 838208 -- that's 1.2 MB that are likely a cheap fix).

3% smaller is great, but 2% slower isn't exactly cheap. It's not super expensive either, but 10% here and 10% there and suddenly you take twice as long. Being much faster is one of the big selling points of lld so we should try hard not to regress on that. Several thoughts on that:

1. Maybe this should be opt in (only at -O2, and/or lto or what)? People who really want optimized binaries over link time probably do (thin) LTO.
2. If the main cost is the hash:
  1. That should parallelize well
  2. Is there some way we could compute the string hash at compile time and stash it somewhere? (Similar in idea to http://blog.llvm.org/2018/01/improving-link-time-on-windows-with.html)

(Some of this also applies to the ICF patch – looks like we're doing a more thorough job than ld64 with ICF but it's also more expensive.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102964/new/

https://reviews.llvm.org/D102964



More information about the llvm-commits mailing list