[cfe-dev] __attribute__((retain)) && llvm.used/llvm.compiler.used
Fāng-ruì Sòng via cfe-dev
cfe-dev at lists.llvm.org
Wed Feb 24 23:56:43 PST 2021
On Wed, Feb 24, 2021 at 11:03 AM Fāng-ruì Sòng <maskray at google.com> wrote:
>
> On Wed, Feb 24, 2021 at 10:40 AM David Blaikie <dblaikie at gmail.com> wrote:
> >
> > (best to include folks from previous conversations in threads - sometimes we can't all keep up to date with all the threads happening - so I've added John McCall here, and echristo since he might have some thoughts on this too)
> >
> > I'd lean towards (1) too myself - give the LLVM constructs consistent semantics, and deal with the platform differences in the frontend during the mapping down to LLVM.
>
> I chatted with Saleem Abdulrasool, who is in favor of (1), too.
>
> I am going to send these patches:
Implemented the idea. Sent:
> (a) Add CodeGenModule::addUsedOrCompilerUsedGlobal (which uses
> llvm.compiler.used for ELF and llvm.used for the others). Migrate some
> addUsedGlobal call sites to use addUsedOrCompilerUsedGlobal.
https://reviews.llvm.org/D97446
> (b) Add __attribute__((retain))
https://reviews.llvm.org/D97447
> (c) Change llvm.used to use SHF_GNU_RETAIN if integrated assembler or
> binutils>=2.36
https://reviews.llvm.org/D97448
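
For readers skimming the thread, the selection rule described in (a) can be sketched in plain C as below. This only illustrates the rule as stated in this message and is not the Clang code from D97446; the enums and the function name `used_list_for` are made up for the example.

```c
/* Hypothetical stand-in for the target object file format. */
enum object_format { FORMAT_ELF, FORMAT_MACHO, FORMAT_COFF };

/* Hypothetical tag for the two LLVM lists discussed in this thread. */
enum used_list { LIST_LLVM_USED, LIST_LLVM_COMPILER_USED };

/* Rule from (a): llvm.compiler.used for ELF, llvm.used for the others. */
static enum used_list used_list_for(enum object_format fmt) {
  return fmt == FORMAT_ELF ? LIST_LLVM_COMPILER_USED : LIST_LLVM_USED;
}
```
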
> Currently llvm.used/llvm.compiler.used should have no difference on
> ELF, so (a) & (b) do not affect users who don't use 'retain'.
>
> (c) will change the binary format representation of llvm.used, so
> there is some risk if the consumer is not prepared for multiple
> sections of the same name (which means they already break with
> -fno-unique-section-names, but the option is rare).
> On very large C/C++ projects, llvm.used usually has 0 or 1 elements.
> ObjC can have multiple llvm.used elements, but that should work. So if there
> is risk, the risk is for other frontends.
> I don't see a way to avoid that, but they can switch to llvm.compiler.used.
>
> Non-ELF users should not observe anything different.
> > On Wed, Feb 24, 2021 at 1:09 AM Fāng-ruì Sòng via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> >>
> >> On 2021-02-24, Fāng-ruì Sòng wrote:
> >> >Currently __attribute__((used)) lowers to llvm.used.
> >> >
> >> >* On Mach-O, a GlobalObject in llvm.used gets the S_ATTR_NO_DEAD_STRIP
> >> >attribute, which prevents linker GC (dead stripping).
> >> >* On COFF, a non-local-linkage GlobalObject[1] in llvm.used gets the
> >> >/INCLUDE: linker option (similar to ELF `ld -u`), which prevents
> >> >linker GC.
> >> > It should be possible to work with local linkage GlobalObjects as
> >> >well, but that will require a complex COMDAT dance.
> >> >* On ELF, a GlobalObject in llvm.used can be discarded by
> >> >ld.bfd/gold/ld.lld --gc-sections.
> >> > (If the section name is a valid C identifier, __start_/__stop_ relocations
> >> >from a live input section can retain the section, even if its defined
> >> >symbols are not referenced [2].
> >> > I understand that some folks use `__attribute__((used,
> >> >section("C_ident")))` and expect the sections to be similar to GC
> >> >roots, however,
> >> > non-C-identifier cases are very common, too. They don't get
> >> >__start_/__stop_ linker magic and the sections can always be GCed.
> >> > )
> >> >
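
A minimal sketch of the C-identifier-section case mentioned above, assuming an ELF target linked with ld.bfd, gold, or ld.lld. The section name `my_meta` and all symbol names are invented for illustration. The references to `__start_my_meta`/`__stop_my_meta` are what keep the section alive under --gc-sections; a section whose name is not a valid C identifier gets no such symbols.

```c
/* Two entries placed in a section whose name is a valid C identifier. */
__attribute__((used, section("my_meta"))) static const int entry_a = 10;
__attribute__((used, section("my_meta"))) static const int entry_b = 32;

/* Defined by the linker on ELF because "my_meta" is a C identifier. */
extern const int __start_my_meta[];
extern const int __stop_my_meta[];

/* Walks every int the linker collected into the my_meta output section. */
int sum_my_meta(void) {
  int sum = 0;
  for (const int *p = __start_my_meta; p != __stop_my_meta; ++p)
    sum += *p;
  return sum;
}
```
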
> >> >In LangRef, the description of llvm.used contains:
> >> >
> >> >> If a symbol appears in the @llvm.used list, then the compiler, assembler, and **linker** are required to treat the symbol as if there is a reference to the symbol that it cannot see (which is why they have to be named). For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.
> >> >
> >> >Note that the "linker" part does not match the reality on ELF targets.
> >> >It does match the reality on Mach-O and partially on COFF.
> >> >
> >> >llvm.compiler.used:
> >> >
> >> >> The @llvm.compiler.used directive is the same as the @llvm.used directive, except that it only prevents the compiler from touching the symbol. On targets that support it, this allows an **intelligent linker to optimize references to the symbol without being impeded** as it would be by @llvm.used.
> >> >
> >> >Note that this explicitly mentions linker GC, so this appears to be
> >> >the closest thing to __attribute__((used)) on ELF.
> >> >However, LangRef also says:
> >> >
> >> >> This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.
> >> >
> >> >
> >> >
> >> >My goal is to implement __attribute__((retain)) (which will be in GCC
> >> >11) on ELF. GCC folks think that 'used' and 'retain' are orthogonal.
> >> >(see https://reviews.llvm.org/D96838#2578127)
> >> >
> >> >Shall we
> >> >
> >> >1. Lift the source language restriction on llvm.compiler.used and
> >> >change __attribute__((used)) to use llvm.compiler.used on ELF.
> >>
> >> It is late here and I did not think this through clearly ;-)
> >>
> >> To clarify:
> >>
> >> 1. Lift the source language restriction on llvm.compiler.used, let llvm.used use SHF_GNU_RETAIN on ELF, and change __attribute__((used)) to use llvm.compiler.used on ELF.
> >>
> >>
> >> __attribute__((retain)) has semantics which are not described by
> >> llvm.used/llvm.compiler.used. To facilitate linker GC, __attribute__((retain))
> >> causes the function/variable to be placed in a unique section. The separate section
> >> behavior can be undesirable in some cases (e.g. poorly written Linux kernel linker
> >> scripts which expect one section per name).
> >>
> >> So in the -fno-function-sections -fno-data-sections case, a retained
> >> function/variable does not cause the whole .text/.data/.rodata to be retained.
> >>
> >> The test llvm/test/CodeGen/X86/elf-retain.ll in https://reviews.llvm.org/D96837
> >> demonstrates the behavior. So I am not sure that we should use
> >> llvm.compiler.used/llvm.used to describe __attribute__((retain)).
> >>
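
As a source-level illustration of the two attributes discussed above, here is a small sketch. The macro `EXAMPLE_RETAIN` and the function names are invented for the example; the `__has_attribute` guard is there only so the snippet still compiles on toolchains that do not know 'retain'.

```c
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Expand to the attribute only when the compiler recognizes it. */
#if __has_attribute(retain)
#define EXAMPLE_RETAIN __attribute__((retain))
#else
#define EXAMPLE_RETAIN
#endif

/* 'used' alone: the compiler must emit the definition even though nothing
   references it, but on ELF the linker may still discard it. */
__attribute__((used)) static int used_only(void) { return 1; }

/* 'used' plus 'retain': additionally asks the linker to keep the section
   holding this function under --gc-sections. */
__attribute__((used)) EXAMPLE_RETAIN static int used_and_retained(void) {
  return 2;
}
```

With a toolchain that supports SHF_GNU_RETAIN, compiling this to an object file and inspecting it with `readelf -S` should show the retained function in its own section carrying the `R` flag, even with -fno-function-sections.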
> >> >2. Or add new metadata (like https://reviews.llvm.org/D96837)?
> >> >
> >> >
> >> >I lean toward option 1 to leverage the existing mechanism.
> >> >The downside is that clang codegen will have some target inconsistency
> >> >(llvm.compiler.used on ELF while llvm.used on others).
> >> >
> >> >
> >> >
> >> >[1]: The implementation additionally allows GlobalAlias.
> >> >[2]: See https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
> >> >"C identifier name sections" for details.
--
宋方睿