[PATCH] D97446: Change some addUsedGlobal to addUsedOrCompilerUsedGlobal

Fangrui Song via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 13 12:39:28 PST 2022


MaskRay added a comment.

In D97446#3241416 <https://reviews.llvm.org/D97446#3241416>, @JonChesterfield wrote:

> It looks deeply wrong to be marking globals as either llvm.used or llvm.compiler.used based on the output file format. It should be based on the (purpose of) the global.
>
> In D97446#3241142 <https://reviews.llvm.org/D97446#3241142>, @rnk wrote:
>
>> This change was implemented so that llvm.used could prevent section GC on ELF, to make its semantics consistent with llvm.used on COFF and MachO. Essentially, llvm.used behaves differently on ELF now so to prevent behavior changes for users of `__attribute__((used))`, it was migrated to `llvm.compiler.used` on ELF targets. This is consistent with GCC's behavior.
>
> Is this sentence inverted? llvm.used should prevent sections from being discarded. If it doesn't at present, that's a bug in the linker.

I agree with rnk.
The bug can be on the compiler side which does not give enough information to the linker.
Before `retain` was added to GCC and binutils supported SHF_GNU_RETAIN, the ELF world was in that state.

> llvm.compiler.used should generally be discarded by whatever part of the compiler wanted the variable

The compiler cannot discard it per LangRef https://llvm.org/docs/LangRef.html#the-llvm-compiler-used-global-variable

> but if it makes it to the linker, the linker should throw it away. Because it was only used by the compiler. It's possible some users marked things as 'used' but wanted them thrown away, but it seems more likely that users weren't using gc-sections if it broke their application by throwing away things they asked to keep.

Correct.

>> This should only change behavior for you if you depend on the details of the LLVM IR being emitted, or perhaps if you use LTO, where GlobalOpt will behave differently. I don't think these use cases are as important as consistent ELF section GC behavior.
>>
>> So, apologies for changing the LLVM IR emitted for this attribute, but I think it's unlikely we will change our minds and revert this a year later.
>
> I'm hopeful I can change your minds now. The current modelling in the compiler doesn't match the stated intent.
>
> In D97446#3241195 <https://reviews.llvm.org/D97446#3241195>, @MaskRay wrote:
>
>> llvm.compiler.used hasn't been changed.
>
> True, but attribute((used)) has. I only noticed this patch because it broke some test cases in rocm, but it'll break some of my own code too. Moving a variable from llvm.used to llvm.compiler.used changes whether internalise skips the variable.

Modulo optimizer bugs, `__attribute__((used))` hasn't changed semantics.
If your downstream project does not handle llvm.compiler.used, maybe handle it now :)

I apologize that previous Clang probably very rarely emitted llvm.compiler.used and it started to do it more often now.

>> The text focuses on the semantics, and for practical reasons refers to the toolchain support.
>> Before GCC 11/binutils 2.36 there was just no portable way making a definition not discarded by ld --gc-sections.
>
> Sure there was. Don't pass gc-sections to the linker, or don't compile with ffunction-sections to get a close approximation.
>
>>> edit: Would making attribute((used)) imply attribute((retain)) on elf targets achieve the objective of this patch without breaking code that expects 'used' to mean "don't throw this away"?
>>
>> This would make semantics less orthogonal and incompatible with GCC.
>> On COFF and Mach-O, there have been Clang-specific (not GCC) use cases relying on attribute((used)) implying GC roots.
>> On ELF, there was none before the toolchain support.
>>
>> Every llvm.used usage I can find in the wild does intend to have the GC semantics, and not having it on ELF was actually a bug and has been fixed by the patch series.
>
> I'm absolutely sure that people mark things as attribute((used)) to stop the toolchain discarding them. I think we're in agreement there, but differ in our assessment of popularity of gc-sections.
>
> Are we missing a category here?
>
> llvm.compiler.used <- the compiler uses the global, and may discard it. If it doesn't, the linker should discard it
> llvm.linker.used <- the linker uses this global, and may discard it. The compiler should leave it alone aside from passing it to the linker
> llvm.used <- some unspecified thing uses the global, the compiler and linker should leave it alone aside from embedding it in the linked output

I think the llvm.compiler.used interpretation diverges from LangRef and how we teach optimizers to respect llvm.compiler.used.
Then llvm.linker.used shall not be needed.

> If we have to map 'attribute((used))' onto the new llvm.linker.used and 'attribute((retain))' onto llvm.used that's a shame, but at least it keeps the naming weirdness localised to the language front end.

I have checked the documentation of many GCC releases (including very ancient ones).
Its `__attribute__((used))` never suggests that it has the linker GC semantics.
In LLVM/Clang, Mach-O somehow chose to overload `used` with the additional linker dead stripping semantics.

The ideal semantics is that COFF/Mach-O uses llvm.compiler.used for `__attribute__((used))` as well.
But downgrading llvm.used to llvm.compiler.used may be regression for some code, so I chose not to do that.

I agree that there is unfortunate binary format inconsistency, but the ELF semantics as implemented in the patch series are ideal.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97446/new/

https://reviews.llvm.org/D97446



More information about the cfe-commits mailing list