[llvm-dev] [RFC] Annotating global functions and variables to prevent ICF during linking
Reid Kleckner via llvm-dev
llvm-dev at lists.llvm.org
Thu Mar 25 10:55:23 PDT 2021
This is a bit of a hot take, but I'd rather make aggressive ICF (fold any
identical code) the norm and scrap the C++ language guarantee about
function pointer identity. I don't participate in C++ standardization, so
my opinion doesn't carry much weight, but I lean towards giving the
optimizer a free hand here. There are a number of other ways in which
function pointer identity breaks down. Consider
-fvisibility-inlines=hidden, which if we are being real, is the only way to
compile big C++ projects, even those with explicit visibility annotations.
Function pointer identity is just fragile and easy to break, and I don't
encourage people to rely on it. If you need some kind of globally unique
identifier, it's pretty straightforward to use data instead of code.
On Tue, Mar 23, 2021 at 2:55 PM James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In general, I don't like the "icf=all" mode. It's not nice to have the
> linker break semantics of the program, and I think we should not encourage
> its use. (And, in particular, adding new features just for it seems like a
> bad idea.) Instead, we should make sure that icf=safe works as well as it
> possibly can, and encourage its use.
>
> It's not possible to conclude anything based only on the size results, but
> it does indicate that some further investigation may be a good idea, to
> look into *why* icf=safe is apparently less-effective on the windows
> binary vs the linux one.
>
>
> On Tue, Mar 23, 2021 at 2:50 PM Fangrui Song via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>> On 2021-03-23, Zequan Wu wrote:
>> >The size increase of chrome on Linux by switching from all ICF to safe
>> ICF
>> >is small.
>> >All ICF:
>> > text data bss dec hex filename
>> >169314343 8472660 2368965 180155968 abcf640 chrome
>> >Safe ICF:
>> > text data bss dec hex filename
>> >174521550 8497604 2368965 185388119 b0ccc57 chrome
>> >
>> >On Windows, chrome.dll increases size by around 14 MB (12MB increases in
>> >.text section).
>> >All ICF:
>> >Size of out\Default\chrome.dll is 170.715648 MB
>> > name: mem size , disk size
>> > .text: 141.701417 MB
>> > .rdata: 22.458476 MB
>> > .data: 3.093948 MB, 0.523264 MB
>> > .pdata: 4.412364 MB
>> > .00cfg: 0.000040 MB
>> > .gehcont: 0.000132 MB
>> > .retplne: 0.000108 MB
>> > .rodata: 0.004544 MB
>> > .tls: 0.000561 MB
>> > CPADinfo: 0.000056 MB
>> > _RDATA: 0.000244 MB
>> > .rsrc: 0.285232 MB
>> > .reloc: 1.324196 MB
>> >Safe ICF:
>> >Size of out\icf-safe\chrome.dll is 184.499712 MB
>> > name: mem size , disk size
>> > .text: 153.809529 MB
>> > .rdata: 23.123628 MB
>> > .data: 3.093948 MB, 0.523264 MB
>> > .pdata: 5.367396 MB
>> > .00cfg: 0.000040 MB
>> > .gehcont: 0.000132 MB
>> > .retplne: 0.000108 MB
>> > .rodata: 0.004544 MB
>> > .tls: 0.000561 MB
>> > CPADinfo: 0.000056 MB
>> > _RDATA: 0.000244 MB
>> > .rsrc: 0.285232 MB
>> > .reloc: 1.379364 MB
>> >
>> >If an attribute is used and it affects unnamed_addr of a symbol, it
>> >determines whether the symbols should show up in the .addrsig table.
>> >All-ICF mode in ld.lld and lld-link ignore symbols in the .addrsig table,
>> >if they belong to code sections. So, it won't have an effect on disabling
>> >ICF.
>>
>> Is there something which can be improved on lld-link? Note also that
>> the size difference between gold --icf={safe,all} is smaller than the
>> difference between ld.lld --icf={safe,all}. Some may be due to gold
>> performing "unsafe" safe ICF, some may suggest places where clang fails
>> to add {,local_}unnamed_addr.
>>
>>
>> >On Mon, Mar 22, 2021 at 10:19 PM Fangrui Song <maskray at google.com>
>> wrote:
>> >
>> >>
>> >> On 2021-03-22, David Blaikie via llvm-dev wrote:
>> >> >ICF: Identical Code Folding
>> >> >
>> >> >Linker deduplicates functions by collapsing any identical functions
>> >> >together - with icf=safe, the linker looks at a .addressing section
>> in the
>> >> >object file and any functions listed in that section are not treated
>> as
>> >> >collapsible (eg: because they need to meet C++'s "distinct functions
>> have
>> >> >distinct addresses" guarantee)
>> >>
>> >> The name originated from MSVC link.exe where icf stands for "identical
>> >> COMDAT folding".
>> >> gold named it "identical code folding" - which makes some sense because
>> >> gold does not fold readonly data.
>> >>
>> >> In LLD, the name is not accurate for two reasons: (1) the feature can
>> >> apply to readonly data as well; (2) the folding is by section, not by
>> >> function.
>> >>
>> >> We define identical sections as they have identical content and their
>> >> outgoing relocation sets cannot be distinguished: they need to have the
>> >> same number of relocations, with the same relative locations, with the
>> >> referenced symbols indistinguishable.
>> >>
>> >> Then, ld.lld --icf={safe,all} works like this:
>> >>
>> >> For a set of identical sections, the linker picks one representative
>> and
>> >> drops the rest, then redirects references to the representative.
>> >>
>> >> Note: this can confuse debuggers/symbolizers/profilers easily.
>> >>
>> >> lld-link /opt:icf is different from ld.lld --icf but I haven't looked
>> >> into it closely.
>> >>
>> >>
>> >> I find that the feature's saving is small given its downside
>> >> (also increaded link time: the current LLD's implementation is
>> inferior:
>> >> it performs a quadratic number of comparisons among an equality class):
>> >>
>> >> This is the size differences for the 'lld' executable:
>> >>
>> >> % size lld.{none,safe,all}
>> >> text data bss dec hex filename
>> >> 96821040 7210504 550810 104582354 63bccd2 lld.none
>> >> 95217624 7167656 550810 102936090 622ae1a lld.safe
>> >> 94038808 7167144 550810 101756762 610af5a lld.all
>> >> % size gold.{none,safe,all}
>> >> text data bss dec hex filename
>> >> 96857302 7174792 550825 104582919 63bcf07 gold.none
>> >> 94469390 7174792 550825 102195007 6175f3f gold.safe
>> >> 94184430 7174792 550825 101910047 613061f gold.all
>> >>
>> >> Note that the --icf=all result caps the potential saving of the
>> proposed
>> >> annotation.
>> >>
>> >> Actually with some large internal targets I get even smaller savings.
>> >>
>> >>
>> >> ld.lld --icf=safe is safer than gold --icf=safe but probably misses
>> some
>> >> opportunities.
>> >> It can be that clang codegen/optimizer fail to mark some cases as
>> >> {,local_}unnamed_addr.
>> >>
>> >> I know Chromium and the Windows world can be different:) But I'd still
>> >> want to
>> >> get some numbers first.
>> >>
>> >>
>> >> Last, I have seen that Chromium has some code like
>> >>
>> >>
>> https://source.chromium.org/chromium/chromium/src/+/master:skia/ext/SkMemory_new_handler.cpp
>> >>
>> >> void sk_abort_no_print() {
>> >> // Linker's ICF feature may merge this function with other
>> >> functions with
>> >> // the same definition (e.g. any function whose sole job is to
>> call
>> >> abort())
>> >> // and it may confuse the crash report processing system.
>> >> // http://crbug.com/860850
>> >> static int static_variable_to_make_this_function_unique =
>> 0x736b;
>> >> // "sk"
>> >>
>> base::debug::Alias(&static_variable_to_make_this_function_unique);
>> >>
>> >> abort();
>> >> }
>> >>
>> >> If we want an approach to work with link.exe, I don't know what we can
>> >> do...
>> >> If no desire for link.exe compatibility, I can see that having a proper
>> >> way marking the function
>> >> can be useful... but in any case if an attribute is used, it probably
>> >> should affect
>> >> unnamed_addr directly instead of being called *icf*.
>> >>
>> >>
>> >>
>> >> >On Mon, Mar 22, 2021 at 6:16 PM Philip Reames via llvm-dev <
>> >> >llvm-dev at lists.llvm.org> wrote:
>> >> >
>> >> >> Can you define ICF please? And give a bit of context?
>> >> >>
>> >> >> Philip
>> >> >> On 3/22/21 5:27 PM, Zequan Wu via llvm-dev wrote:
>> >> >>
>> >> >> Hi all,
>> >> >>
>> >> >> Background:
>> >> >> It's been a longstanding difficulty of debugging with ICF.
>> Programmers
>> >> >> don't have control over which sections should be folded by ICF,
>> which
>> >> >> sections shouldn't. The existing address significant table won't
>> have
>> >> >> effect for code sections during all ICF mode in both ld.lld and
>> >> lld-link.
>> >> >> By switching to safe ICF could mark code sections as unique, but at
>> a
>> >> cost
>> >> >> of increasing binary size out of control. So, it would be good if
>> >> >> programmers could selectively disable ICF in source code by
>> annotating
>> >> >> global functions/variables with an attribute to improve debugging
>> >> >> experience and have the control on the binary size increase.
>> >> >>
>> >> >> My plan is to add a new section table(`.no_icf`) to object files.
>> >> Sections
>> >> >> of all symbols inside the table should not be folded by all ICF
>> mode.
>> >> And
>> >> >> symbols can only be added into the table by annotating global
>> >> >> functions/variables with a new attribute(`no_icf`) in source code.
>> >> >>
>> >> >> What do you think about this approach?
>> >> >>
>> >> >> Thanks,
>> >> >> Zequan
>> >> >>
>> >> >>
>> >> >> _______________________________________________
>> >> >> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://
>> >> lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >> >>
>> >> >> _______________________________________________
>> >> >> LLVM Developers mailing list
>> >> >> llvm-dev at lists.llvm.org
>> >> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >> >>
>> >>
>> >> >_______________________________________________
>> >> >LLVM Developers mailing list
>> >> >llvm-dev at lists.llvm.org
>> >> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>
>> >>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210325/0ff6ae1d/attachment.html>
More information about the llvm-dev
mailing list