[PATCH] D68101: [MC][ELF] Prevent globals with an explicit section from being mergeable

John McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 3 14:27:07 PST 2019


rjmccall added a comment.

In D68101#1767732 <https://reviews.llvm.org/D68101#1767732>, @nickdesaulniers wrote:

> > But ELF allows object files to contain an arbitrary number of what I've been calling "section units" that will be assembled into a single section in the image.
>
> More precisely, in assembler, you can specify sections dis-jointly, but they will be rejoined when assembled into an ELF object, as ELF section names are unique and cannot be discontinuous.


The ELF spec is explicit that files can contain multiple sections with the same name.  This is necessary when working with groups but isn't restricted to that; for example, LLVM will also emit multiple sections for a single name+group pair when it has an associated section.  The linker may join these in the image after it's done all the necessary processing, but I don't think it's actually required to.

As for what ELF assemblers actually support, that's of course a separate story.  They presumably produce unique sections by at least name+group pairs, or else COMDATs will be totally broken.  I don't know how the handling for associated sections works when LLVM emits to assembly vs. just producing the object file directly.
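To make the name+group point concrete, here is a minimal sketch in Python of uniquing section units by a (name, group) key. Everything in it is hypothetical and for illustration only: `SectionKey` and `assign_sections` are made-up names, not MC's actual data structures, and the key deliberately ignores associated sections, mergeability, and entry size.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SectionKey:
    """Hypothetical uniquing key; not MC's actual data structure."""
    name: str
    group: Optional[str] = None  # COMDAT group signature, if any


def assign_sections(
    globals_: List[Tuple[str, str, Optional[str]]]
) -> Dict[SectionKey, List[str]]:
    """Place each (symbol, section name, group) triple into a section unit."""
    units: Dict[SectionKey, List[str]] = {}
    for symbol, name, group in globals_:
        units.setdefault(SectionKey(name, group), []).append(symbol)
    return units


# Four globals that all name the same section, two of them in COMDAT groups.
units = assign_sections([
    ("a", ".rodata.foo", None),
    ("b", ".rodata.foo", "grp1"),
    ("c", ".rodata.foo", "grp2"),
    ("d", ".rodata.foo", None),
])

for key, symbols in units.items():
    print(key.name, key.group, symbols)
```

With this key, the four globals land in three distinct section units that all share one name, which is the situation COMDATs depend on.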

It's not abstractly unreasonable to also emit different sections based on mergeability and entry size.  However, if doing so breaks current tools, that's not a reasonable option.  The next best option would be to emit a single section per name+group, but only flag it mergeable if all the objects are identically mergeable.  Unfortunately, I think MC isn't architected for that; the assembly writer wants to process globals one-by-one and can't retroactively change flags.  So the final option is to stop trying to use ELF mergeability for unnamed_addr globals entirely, which I think we all agree is undesirable.
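The "single section per name+group, mergeable only if everything agrees" option can be sketched as below. This is a rough Python model under stated assumptions, not how MC works: `Global`, `plan_sections`, and the `is_string` field (a stand-in for a strings-style flag) are hypothetical names, and "identically mergeable" is taken to mean the same entry size and the same string-ness.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

SectionId = Tuple[str, Optional[str]]  # (section name, group)
MergeKind = Tuple[int, bool]           # (entry size, is_string)


@dataclass(frozen=True)
class Global:
    """Hypothetical description of one global with an explicit section."""
    symbol: str
    section: str
    group: Optional[str]
    entsize: Optional[int]   # entry size if mergeable, None otherwise
    is_string: bool = False  # hypothetical stand-in for a strings flag


def plan_sections(globals_: List[Global]) -> Dict[SectionId, Optional[MergeKind]]:
    """Decide, per name+group, whether the one section may be flagged mergeable."""
    # Pass 1: collect every global that lands in each name+group section.
    members: Dict[SectionId, List[Global]] = {}
    for g in globals_:
        members.setdefault((g.section, g.group), []).append(g)

    # Pass 2: mergeable only if all members agree and are themselves mergeable.
    plan: Dict[SectionId, Optional[MergeKind]] = {}
    for key, gs in members.items():
        kinds = {(g.entsize, g.is_string) for g in gs}
        if len(kinds) == 1 and gs[0].entsize is not None:
            plan[key] = (gs[0].entsize, gs[0].is_string)
        else:
            plan[key] = None  # mixed or non-mergeable: plain section
    return plan


plan = plan_sections([
    Global("a", ".mysec", None, 4),
    Global("b", ".mysec", None, 4),
    Global("c", ".other", None, 4),
    Global("d", ".other", None, 8),
    Global("e", ".plain", None, None),
])
```

Note that the decision needs two passes: the flags for a section are only known after every global assigned to it has been seen. That is the part that conflicts with a writer that emits globals one-by-one and cannot revise flags afterwards.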

I don't think any sort of frontend-based solution is reasonable.  IR should be able to freely mark globals with sections and unnamed_addr, and it's the backend's responsibility to emit the best code it can for what's been written.  If the current state of the rest of the toolchain means we can't pursue some optimization, so be it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68101/new/

https://reviews.llvm.org/D68101




