[PATCH] D120626: [ELF] Move section assignment from initializeSymbols to postParse

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 2 02:30:33 PST 2022


peter.smith added a comment.

In D120626#3353151 <https://reviews.llvm.org/D120626#3353151>, @MaskRay wrote:

> In D120626#3352236 <https://reviews.llvm.org/D120626#3352236>, @peter.smith wrote:
>
>> Not supporting `comdat-binding: %t/w.o %t/g.o leads to an undesired undefined symbol. This is not ideal but acceptable because valid input cannot form this case.` makes me a bit nervous. In a clang/gcc world this is not a problem, but there may be third party compilers that differ. For example arm's legacy compiler armcc from Arm Compiler 5 uses STB_GLOBAL for definitions in COMDAT groups, whereas GCC, at least used to, uses STB_WEAK. It is possible that this is illegal, but I can't find in the spec where it says that.
>>
>> Perhaps it could be accounted for in resolve as I think we'd be passing a global with the discarded section. Perhaps we could prefer the existing symbol in that case?
>
> Avoiding accessing `sections[*]` is what the patch strives to do: untangle section initialization and symbol initialization/resolution as much as possible.
> I have a plan to merge section initialization and `initializeLocalSymbols` into `initSectionsAndLocalSyms`, which parallelizes section initialization.
> In the long term, this is a necessary step to achieve parallel input file parsing (https://discourse.llvm.org/t/parallel-input-file-parsing/60164) (and perhaps important to keep lld relevant with the emerging of mold.)
>
> Keeping `if (sec == &InputSection::discarded) ` will address the `%t/w.o %t/g.o` dilemma, but that probably will not fly.
> Personally I'd like to say such a dilemma is unsupported. If we have to, it doesn't seem like there is a good way supporting it...

Thanks for the updates so far.

My thoughts so far:

- I agree that mixing symbol bindings in the same link is a niche use case, likely for compilers not derived from LLVM or GCC.
- I expect that parallel reading will be in development for some time. It probably isn't worth working extra hard right now.
- If it isn't possible to cleanly support it, then can we diagnose it with a specific error message? For example `LLD does not support COMDAT groups where different groups use different symbol bindings for the same symbol name.` is better than an undefined symbol error.
- We'll need to document and release note the restriction.



================
Comment at: lld/ELF/InputFiles.cpp:1162
+      if (sym.traced) {
+        printTraceSymbol(Undefined{this, sym.getName(), sym.binding,
+                                   sym.stOther, sym.type, secIdx},
----------------
MaskRay wrote:
> peter.smith wrote:
> > If this is happening in parallel, is there a chance of interleaved output? For example a traced symbol in multiple groups? I guess that would need something like driver->duplicates?
> > 
> Yes. If the determinism is desired, `elf::printTraceSymbol(` needs to some refactoring: the `message` call needs to be replaced with something like `driver->duplicates`.
> 
> Placing `duplicates` in `LinkerDriver` is a bit arbitrary. I wonder whether we should use `ctx->duplicates`.
By interleaved output I was thinking of something along the lines of:
```
Some people, when confronted with a problem, think, “I know, I’ll use threads,” and then two they hav erpoblesms.
```
from https://nedbatchelder.com/blog/201204/two_problems.html 

Perhaps another mutex for the block if it is susceptible. It is possible that the stream is protected with a lock though.

ctx->duplicates is probably more friendly to the people that want to use LLD in the context of a library.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120626/new/

https://reviews.llvm.org/D120626



More information about the llvm-commits mailing list