[llvm-dev] Proposal for address-significance tables for --icf=safe

Wed May 23 07:44:59 PDT 2018

On Wed, May 23, 2018 at 12:06 AM, Peter Collingbourne via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> Context: ld.gold has an --icf=safe flag which is intended to apply ICF only
> to sections which can be safely merged according to the guarantees provided
> by the language. It works using a set of heuristics (symbol name matching
> and relocation scanning). That approach is imprecise, works only with
> certain languages, and is slow because it has to demangle symbols and scan
> relocations. It's also redundant with the (local_)unnamed_addr analysis
> already performed by LLVM.
>
> I implemented an alternative to this approach in clang and lld. It works by
> adding a section to each object file containing the indexes of the symbols
> which are address-significant (i.e. not (local_)unnamed_addr in IR).
>
> I used this implementation to link a release+asserts build of clang with
> each of --icf={none,safe,all}. The binary sizes (in bytes) were:
>
> none: 109407184
> safe: 108534736 (-0.8%)
> all: 107281360 (-2%)
>
> I measured the object file overhead of these sections in my clang build at
> 0.08%. That's almost nothing, and I think it's small enough that we can turn
> it on by default.
>
> I've uploaded a patch series for this feature here:
> https://github.com/pcc/llvm-project/tree/llvm-addrsig
> I intend to start sending it for review soon.

Very nice!

I was going to ask how this plays with object files compiled with
something other than LLVM, but now I see you assume all symbols are
address-significant if there's no address-significance table in the
file. It all seems very sensible to me :-)
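
To spell out the fallback rule as I understand it, here is a rough sketch in
Python (for brevity only; the real implementation is in lld). Every name below
(ObjectFile, addrsig, icf_safe_candidates) is hypothetical and not taken from
the patch series, and the object-file model is heavily simplified. The point
it illustrates is that a missing table must be treated differently from an
empty one: a missing table means "assume everything is address-significant",
while an empty table would mean the compiler found no address-significant
symbols at all.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class ObjectFile:
    """Hypothetical, simplified model of an object file."""
    symbols: List[str]                   # symbol table, indexed by position
    addrsig: Optional[List[int]] = None  # symbol indexes; None = no table present


def address_significant(obj: ObjectFile) -> Set[str]:
    """Return the symbols whose addresses must stay unique."""
    if obj.addrsig is None:
        # No table (e.g. the object was not produced by LLVM):
        # conservatively treat every symbol as address-significant.
        return set(obj.symbols)
    return {obj.symbols[i] for i in obj.addrsig}


def icf_safe_candidates(obj: ObjectFile) -> List[str]:
    """Symbols that --icf=safe would be allowed to consider for folding."""
    significant = address_significant(obj)
    return [name for name in obj.symbols if name not in significant]


if __name__ == "__main__":
    with_table = ObjectFile(symbols=["f", "g", "h"], addrsig=[1])
    without_table = ObjectFile(symbols=["f", "g", "h"])
    print(icf_safe_candidates(with_table))
    print(icf_safe_candidates(without_table))
```

Under these assumptions, an object with a table listing only index 1 leaves
"f" and "h" eligible for folding, while an object with no table contributes
no candidates, so mixing in non-LLVM objects stays safe at the cost of fewer
merges.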