[llvm-dev] Proposal for address-significance tables for --icf=safe

Peter Collingbourne via llvm-dev llvm-dev at lists.llvm.org
Thu May 31 14:05:05 PDT 2018


The hash approach was suggested by others as well, but I think for now we
can use the sh_link field until the tools are updated -- that was the
recommended approach on the generic-abi thread as well. Keep in mind that
updating the gABI is really orthogonal to the compatibility issue: even
with an updated gABI we'd still have the practical problem of needing to
deal with old tools somehow.
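
To make the sh_link idea concrete, here is a minimal sketch of how a linker might decide whether to trust an address-significance table. It is not the lld code. The names (AddrsigSection, address_significant) are hypothetical, and the sketch assumes that a tool which rewrites the symbol table without understanding the section leaves sh_link at 0 or otherwise not pointing at the symbol table.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class AddrsigSection:
    # Section index of the symbol table the indexes refer to.
    # Assumption: tools unaware of the section leave this as 0.
    sh_link: int
    # Symbol table indexes of the address-significant symbols.
    symbol_indexes: List[int]


def address_significant(symbols: List[str],
                        symtab_index: int,
                        addrsig: Optional[AddrsigSection]) -> Set[str]:
    """Return the names of symbols that must keep a unique address."""
    # No table, or a table that no longer points at this symbol table:
    # be conservative and treat every symbol as address-significant.
    if addrsig is None or addrsig.sh_link != symtab_index:
        return set(symbols)
    # Otherwise only the listed symbols are significant.
    return {symbols[i] for i in addrsig.symbol_indexes}
```

With a valid table only the listed symbols block merging; with a stale or missing table the object simply gets no benefit from --icf=safe, which is slower to shrink but never unsafe.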

Agreed that we'd be fine pessimising ld -r -- the main thing that I'm
concerned about is avoiding miscompiles caused by shuffling symbols.
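
For comparison, a rough sketch of the name-hash alternative, showing where the pessimisation for local symbols comes from. CRC32 is only a stand-in (nobody on the thread picked a hash function), and the helper names and the two example objects are made up for illustration.

```python
import zlib
from typing import Iterable, List, Set


def name_hash(name: str) -> int:
    # Stand-in hash; the actual choice of hash is an open assumption.
    return zlib.crc32(name.encode("utf-8"))


def build_table(significant_names: Iterable[str]) -> Set[int]:
    """Per-object table: hashes of the address-significant symbol names."""
    return {name_hash(n) for n in significant_names}


def is_address_significant(name: str, table: Set[int]) -> bool:
    # Any symbol whose name hash appears in the table is significant.
    return name_hash(name) in table


# a.o and b.o each define a local symbol "helper"; only a.o takes its address.
table_a = build_table(["helper"])
table_b = build_table([])

# A relocatable link would simply concatenate the two tables.
combined = table_a | table_b

# Symbols of the combined object: a.o's helper, b.o's helper, and "other".
merged_symbols: List[str] = ["helper", "helper", "other"]
flags = [is_address_significant(s, combined) for s in merged_symbols]
```

Here flags marks both copies of "helper" as significant even though only one of them needed it. The result does not depend on symbol order, so reordering the symbol table cannot make it wrong, only less precise.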

Peter

On Thu, May 31, 2018 at 1:52 PM, bd1976 llvm <bd1976llvm at gmail.com> wrote:

> Hi Peter, this is a great proposal, thanks!
>
> If you are worried about making the ABI change, have you
> thought about just going for an array of symbol names,
> or hashes of symbol names, where any matching symbol is
> considered address-significant? This would sidestep the
> problem of keeping the symbol table indices in sync.
>
> It would be pessimistic for local symbols if the input
> SHT_ADDRSIG sections were combined by e.g. "ld -r" but
> in my experience this should not have too much of an
> impact on --icf.
>
> Might be worth considering in the short term whilst you
> work on getting gABI acceptance.
>
> On Tue, May 22, 2018 at 11:06 PM, Peter Collingbourne via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all,
>>
>> Context: ld.gold has an --icf=safe flag which is intended to apply ICF
>> only to sections which can be safely merged according to the guarantees
>> provided by the language. It works using a set of heuristics (symbol name
>> matching and relocation scanning). That approach is imprecise, works only
>> with certain languages, and is slow because it needs to demangle symbols
>> and scan relocations. It's also redundant with the
>> (local_)unnamed_addr analysis already performed by LLVM.
>>
>> I implemented an alternative to this approach in clang and lld. It works
>> by adding a section to each object file containing the indexes of the
>> symbols which are address-significant (i.e. not (local_)unnamed_addr in IR).
>>
>> I used this implementation to link a release+asserts build of clang with
>> each of --icf={none,safe,all}. The binary sizes were:
>>
>> none: 109407184
>> safe: 108534736 (-0.8%)
>> all: 107281360 (-2%)
>>
>> I measured the object file overhead of these sections in my clang build
>> at 0.08%. That's almost nothing, and I think it's small enough that we can
>> turn it on by default.
>>
>> I've uploaded a patch series for this feature here:
>> https://github.com/pcc/llvm-project/tree/llvm-addrsig
>> I intend to start sending it for review soon.
>>
>> Thanks,
>> --
>> Peter
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>


-- 
Peter