[llvm-dev] __attribute__((retain)) && llvm.used/llvm.compiler.used

Fāng-ruì Sòng via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 24 01:09:09 PST 2021


On 2021-02-24, Fāng-ruì Sòng wrote:
>Currently __attribute__((used)) lowers to llvm.used.
>
>* On Mach-O, a GlobalObject in llvm.used gets the S_ATTR_NO_DEAD_STRIP
>attribute, which prevents linker GC (dead stripping).
>* On COFF, a non-local-linkage GlobalObject[1] in llvm.used gets the
>/INCLUDE: linker option (similar to ELF `ld -u`), which prevents
>linker GC.
>  It should be possible to support local-linkage GlobalObjects as
>well, but that would require a complex COMDAT dance.
>* On ELF, a global object in llvm.used can be discarded by
>ld.bfd/gold/ld.lld --gc-sections.
>  (If the section name is a C identifier, __start_/__stop_ relocations
>from a live input section can retain the section, even if its defined
>symbols are not referenced [2].
>  I understand that some folks use `__attribute__((used,
>section("C_ident")))` and expect the sections to be similar to GC
>roots; however,
>  non-C-identifier cases are very common, too. They don't get
>__start_/__stop_ linker magic and the sections can always be GCed.
>  )
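
For illustration, here is a minimal C sketch of the C-identifier case described above. The section name `my_vals` and every variable/function name are made up for this example, and it assumes an ELF target with GCC or Clang:

```c
#include <stddef.h>

/* Hypothetical entries; 'used' stops the compiler from deleting these
 * otherwise-unreferenced internal-linkage variables. The section name
 * "my_vals" is an assumption; any valid C identifier would do. */
static const int val_a __attribute__((used, section("my_vals"))) = 1;
static const int val_b __attribute__((used, section("my_vals"))) = 2;

/* ELF linkers define these bounds for sections named like C identifiers. */
extern const int __start_my_vals[];
extern const int __stop_my_vals[];

/* Walk the section. A reference like this from live code is what can
 * keep the section alive under --gc-sections. */
int sum_my_vals(size_t *count) {
    int sum = 0;
    size_t n = 0;
    for (const int *p = __start_my_vals; p != __stop_my_vals; ++p) {
        sum += *p;
        ++n;
    }
    if (count)
        *count = n;
    return sum;
}
```

If the section were instead named something like `.my.vals` (not a C identifier), no __start_/__stop_ symbols would be available and nothing here would keep the section from being GCed.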
>
>In LangRef, the description of llvm.used contains:
>
>> If a symbol appears in the @llvm.used list, then the compiler, assembler, and **linker** are required to treat the symbol as if there is a reference to the symbol that it cannot see (which is why they have to be named). For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.
>
>Note that the "linker" part does not match the reality on ELF targets.
>It does match the reality on Mach-O and partially on COFF.
>
>llvm.compiler.used:
>
>> The @llvm.compiler.used directive is the same as the @llvm.used directive, except that it only prevents the compiler from touching the symbol. On targets that support it, this allows an **intelligent linker to optimize references to the symbol without being impeded** as it would be by @llvm.used.
>
>Note that this explicitly mentions linker GC, so this appears to be
>the closest thing to __attribute__((used)) on ELF.
>However, LangRef also says:
>
>> This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.
>
>
>
>My goal is to implement __attribute__((retain)) (which will be in GCC
>11) on ELF. GCC folks think that 'used' and 'retain' are orthogonal
>(see https://reviews.llvm.org/D96838#2578127).
>
>Shall we
>
>1. Lift the source language restriction on llvm.compiler.used and
>change __attribute__((used)) to use llvm.compiler.used on ELF.

It was too late here and I did not think this through clearly ;-)

To clarify:

1. Lift the source language restriction on llvm.compiler.used, let llvm.used use SHF_GNU_RETAIN on ELF, and change __attribute__((used)) to use llvm.compiler.used on ELF.


__attribute__((retain)) has semantics which are not described by
llvm.used/llvm.compiler.used.  To facilitate linker GC, __attribute__((retain))
causes the definition to be placed in a unique section. The separate-section
behavior can be undesirable in some cases (e.g. poorly written Linux kernel linker
scripts which expect one section per name).

So in the -fno-function-sections -fno-data-sections case, a retained
function/variable does not cause the whole .text/.data/.rodata to be retained.

The test llvm/test/CodeGen/X86/elf-retain.ll in https://reviews.llvm.org/D96837
demonstrates the behavior.  So I am not sure that we should use
llvm.compiler.used/llvm.used to describe __attribute__((retain)).
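
As a rough source-level sketch of what this looks like to a user (not taken from the patch; the macro name KEEP_IN_LINK and both definitions are hypothetical), one could guard the new attribute so that compilers without it fall back to plain 'used':

```c
/* Use 'retain' only when the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(retain)
#    define KEEP_IN_LINK __attribute__((used, retain))
#  endif
#endif
#ifndef KEEP_IN_LINK
#  define KEEP_IN_LINK __attribute__((used)) /* fallback: compiler-level only */
#endif

/* Hypothetical definitions with no references from other code.
 * 'used' keeps the compiler from deleting them; 'retain', where
 * available, additionally asks the linker not to GC their sections. */
KEEP_IN_LINK static int unreferenced_counter = 42;

KEEP_IN_LINK static int unreferenced_helper(int x) {
    return x + 1;
}
```

Assuming a toolchain recent enough to support SHF_GNU_RETAIN, inspecting the object file's section headers should show the retained definitions in their own sections even with -fno-function-sections -fno-data-sections, which is the behavior described above.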

>2. Or add metadata (like https://reviews.llvm.org/D96837)?
>
>
>I lean toward option 1 to leverage the existing mechanism.
>The downside is that clang codegen will have some target inconsistency
>(llvm.compiler.used on ELF while llvm.used on others).
>
>
>
>[1]: The implementation additionally allows GlobalAlias.
>[2]: See https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
>"C identifier name sections" for details.

