[PATCH] D53234: Don't remove COMDATs when internalizing global objects

ben via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 31 10:35:19 PDT 2018


bd1976llvm added a comment.

In https://reviews.llvm.org/D53234#1266444, @Aaron1011 wrote:

> > So perhaps this code needs to handle comdat containing an object with associated metadata specially (e.g. could just include in the ExternalComdats set)?
>
> My concern is that the current behavior would still be incorrect for other uses of COMDATS - or at the very least, quite counterintuitive.
>  If I specify a COMDAT for two global objects, I would expect one of the following things to happen:
>
> 1. Both objects appear in the final object, both still in the COMDAT.
> 2. Neither object appears in the final object, and the COMDAT section does not exist (both objects were optimized out).
>
>   Having only one of the two objects be present, and having it //lack// a COMDAT, seems very unexpected to me - especially considering that the COMDAT section can be read using external tools (e.g. `readelf`).


I think that there is a little confusion here. My understanding is:

1. The internalize pass is only used for LTO - where the compiler has a global view of the whole program.
2. The comdat mechanism (in ELF at least) is designed to handle C++ templates and had the explicit design goal of avoiding requiring a global view of the whole program.

Intuitively, we would not expect comdats in LTO object files as they are not required. LTO can remove the comdats (similarly, so can the system linker). In fact, lld actually ignores comdat groups in the LTO object files (see, the rather misnamed, addCombinedLTOObject in lld/ELF/SymbolTable.cpp), as the decisions about comdats should have been made during scan and then honored by LTO.

In https://reviews.llvm.org/D53234#1266418, @tejohnson wrote:

> In https://reviews.llvm.org/D53234#1265891, @Aaron1011 wrote:
>
> > > Normally once they are both internalized and removed from the comdat, it shouldn't matter if only one is kept by GlobalDCE. The other is simply an internal function and no longer part of a comdat (so it's not like the comdat is incomplete).
> >
> > My understanding is that deleting the COMDAT is invalid.
>
>
> That's not my understanding (at least in general for comdats, this may be a special case, see below), but hopefully Peter will clarify, as he's more familiar with the linker aspects.
>
> > The LLVM Language Reference <https://llvm.org/docs/LangRef.html#comdats> doesn't mention anything about removing COMDATS during LTO.
>
> True, although that section relates more to the semantics of comdat selection for things that stay comdats. I'm not sure the language ref needs to cover allowable optimizations.
>
> > From that perspective, the dangling !associated metadata is simply a consequence of the underlying issue of the COMDAT removal.
>
> Reading through https://llvm.org/docs/LangRef.html#associated-metadata makes me think that comdats involving associated metadata need to be handled a little differently, since the object with the metadata should not be removed unless the object it references is removed, and the language reference specifically says they should be in the same comdat.
>
> So perhaps this code needs to handle comdat containing an object with associated metadata specially (e.g. could just include in the ExternalComdats set)?


I don't think that this should be necessary. My understanding is that GlobalDCE does not understand (by design) the gc semantics implied by !associated therefore globals with such metadata are required to be added to llvm.compiler.used to anchor them. Presumably, such globals will contain references to the "associated" global which will also anchor those globals. Eventually these will be gc'ed by the system linker. Given that, I can't see the problem that you are trying to solve. Can you post the output of the bfd linker, and a testcase which reproduces the failure with bfd?


https://reviews.llvm.org/D53234





More information about the llvm-commits mailing list