<div dir="ltr"><div dir="ltr">On Tue, Jun 18, 2019 at 7:46 PM Peter Smith <<a href="mailto:peter.smith@linaro.org">peter.smith@linaro.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, 17 Jun 2019 at 20:44, Eli Friedman <<a href="mailto:efriedma@quicinc.com" target="_blank">efriedma@quicinc.com</a>> wrote:<br>
><br>
> > -----Original Message-----<br>
> > From: Peter Smith <<a href="mailto:peter.smith@linaro.org" target="_blank">peter.smith@linaro.org</a>><br>
> > Sent: Monday, June 17, 2019 3:33 AM<br>
> > To: Eli Friedman <<a href="mailto:efriedma@qualcomm.com" target="_blank">efriedma@qualcomm.com</a>><br>
> > Cc: llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
> > Subject: [EXT] Re: [llvm-dev] lld symbol choice for symbol present in both a<br>
> > shared and a static library, with and without LTO<br>
> ><br>
> > On Fri, 14 Jun 2019 at 20:58, Eli Friedman via llvm-dev<br>
> > <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
> > ><br>
> > ><br>
> > ><br>
> > > If “obj.o” is built with LTO enabled, and the function is specifically a runtime<br>
> > > function, the behavior is different. For example, suppose the IR contains a call<br>
> > > to “llvm.memcpy”, and the generated code eventually calls “memcpy”. Or<br>
> > > suppose the IR contains a “resume” instruction, and the generated code<br>
> > > eventually calls “_Unwind_Resume”. In this case, the choice is different: lld<br>
> > > always chooses the “memcpy” or “_Unwind_Resume” from the shared library,<br>
> > > ignoring the order the files are specified on the command-line. Is this the<br>
> > > expected behavior?<br>
> ><br>
> > As I understand it, there is no more selection of members from static<br>
> > libraries after the LTO code-generator has run. In the example from<br>
> > the PR there is no other object with a reference to memcpy so the<br>
> > member containing the static definition is not loaded, leaving only<br>
> > the shared library to match against. I would expect if there were<br>
> > another reference to memcpy from a bitcode file or another ELF file<br>
> > and the static library was before the shared then it would match<br>
> > against that.<br>
> ><br>
> > As to whether this is expected or not, I don't know for certain. One<br>
> > desirable property of not selecting more objects from static libraries<br>
> > is that you are guaranteed not to load any more bitcode files from<br>
> > static libraries, which would either need compiling separately from<br>
> > the other bitcode files, or have the whole compilation done again with<br>
> > the new objects, which could cause more bitcode files to be loaded<br>
> > etc.<br>
><br>
> For runtime functions defined in bitcode, we avoid the "double-LTO" scenario you describe by including them in the LTO link even if we can't prove they will be used. This is the handleLibcall code you pointed out (<a href="https://github.com/llvm-mirror/lld/blob/master/ELF/Driver.cpp#L1733" rel="noreferrer" target="_blank">https://github.com/llvm-mirror/lld/blob/master/ELF/Driver.cpp#L1733</a>). As the comment there describes, we don't do this for runtime functions which are not defined in bitcode, to avoid other side-effects; instead we resolve those symbols after LTO.<br>
><br>
> For the scenario I'm describing, though, it looks like the key decision here is made in SymbolTable::addShared, before handleLibcall and LTO. If a symbol is defined in both a static library and a shared library, and we haven't seen a reference to the static library's symbol at that point, we throw away the record of the symbol defined in the static library.<br>
><br>
> Ultimately, I guess the question is what alternatives are possible, without breaking the scenarios handleLibcall is supposed to handle. I see a few possibilities here:<br>
><br>
> 1. Whenever we see any bitcode file, treat it as referencing every possible runtime function, even those defined in non-bitcode static libraries. Then we try to resolve the __sync_val_compare_and_swap_8 issue from <a href="https://reviews.llvm.org/D50475" rel="noreferrer" target="_blank">https://reviews.llvm.org/D50475</a> some other way.<br>
<br>
Is it out of the question for the bitcode files to add the set of<br>
libcalls they may potentially call in the bitcode symbol table? If<br>
this were possible then handleLibcall wouldn't be necessary as all the<br>
dependencies would be explicit in the bitcode file symbol table. I can<br>
see this working if LTO only eliminates the libcall, but not if<br>
the decision between incompatible libcalls was made at LTO time.<br></blockquote><div><br></div><div>I haven't thought of that before, but this might be a good idea. At least it is not out of the question.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> 2. Change the symbol resolution that runs after LTO to use different rules from the normal non-LTO/before-LTO symbol resolution, so it finds the function from the static library instead of the shared library.<br>
<br>
I think this would be tricky with LLD's current implementation, as a<br>
lot of the information about which candidate symbols are in which<br>
library is lost as part of the merging process. I think it would<br>
essentially be another implementation of symbol resolution.<br>
<br>
> 3. Change symbol resolution in general to prefer "lazy" symbols from static libraries over symbols from shared libraries, even outside LTO. So "static.a shared.so object.o" picks the symbol from static.a, instead of shared.so like it does now.<br>
<br>
While there aren't any requirements or a specification for how a linker<br>
should do symbol resolution, LLD does seem to match ld.bfd: with<br>
--start-group memcpy.a memcpy.so input.o --end-group, the symbol from<br>
memcpy.so is preferred. As Rui points out, this would be risky as it<br>
could open up projects to code-size increases or to more multiply<br>
defined symbol errors.<br>
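<br>
To make the ordering concrete, here is a toy Python model of the lazy-archive<br>
versus shared-library rule as I understand it from this thread. It is only a<br>
sketch, not LLD's code: the state names and the add_file/resolve helpers are<br>
made up for illustration, each archive is treated as a single member, and<br>
--start-group handling is ignored.<br>

```python
# Toy model only: states and helper names are illustrative, not LLD's.
LAZY, UNDEFINED, SHARED, DEFINED = "lazy", "undefined", "shared", "defined"

def add_file(table, kind, name, defines=(), references=()):
    """Merge one input file into table, a dict of symbol to (state, file)."""
    for sym in defines:
        state = table.get(sym, (None, None))[0]
        if kind == "archive":
            if state is None:
                table[sym] = (LAZY, name)      # remembered, member not extracted
            elif state == UNDEFINED:
                table[sym] = (DEFINED, name)   # pending reference extracts the member
        elif kind == "shared":
            if state in (None, UNDEFINED, LAZY):
                table[sym] = (SHARED, name)    # an unreferenced lazy record is dropped
        else:
            table[sym] = (DEFINED, name)       # regular object file definition
    for sym in references:
        state, owner = table.get(sym, (None, None))
        if state is None:
            table[sym] = (UNDEFINED, name)
        elif state == LAZY:
            table[sym] = (DEFINED, owner)      # reference extracts the lazy member

def resolve(files):
    """Process files in command-line order and return the final table."""
    table = {}
    for f in files:
        add_file(table, **f)
    return table

archive = dict(kind="archive", name="memcpy.a", defines=["memcpy"])
shared = dict(kind="shared", name="memcpy.so", defines=["memcpy"])
obj = dict(kind="object", name="input.o", references=["memcpy"])

print(resolve([archive, shared, obj])["memcpy"])  # ('shared', 'memcpy.so')
print(resolve([obj, archive, shared])["memcpy"])  # ('defined', 'memcpy.a')
```

The second order is the usual non-LTO case, where the reference has already<br>
been seen by the time the archive is scanned.<br>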
<br>
> 4. We WONTFIX <a href="https://bugs.llvm.org/show_bug.cgi?id=42273" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=42273</a> .<br>
><br>
<br>
I guess this depends on how much of a problem this is. If only a<br>
small number of programs are affected then it can probably be resolved by<br>
adding an ELF file placed at the start of the command line with<br>
undefined references to the specific ELF libcall symbols. If it is a<br>
serious problem for almost everyone using LTO then it might be worth<br>
adding alternative library-scanning code to LLD to handle it.<br>
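<br>
As a rough sketch of why such a stub object would help, here is a toy<br>
single-symbol model in Python. It follows Eli's description of addShared above<br>
and is not taken from LLD; the function name winner and the file names stub.o,<br>
obj.bc and lto.o are hypothetical.<br>

```python
def winner(files, late_reference=True):
    """Toy single-symbol resolver: return the file that ends up providing memcpy.

    files holds (kind, name, references) tuples in command-line order, where
    kind is "object", "archive" or "shared". Every archive and shared library
    is assumed to define memcpy; objects only reference it (or not).
    late_reference models the object produced by LTO, which calls memcpy even
    though the bitcode symbol table never mentioned it.
    """
    state, owner = "absent", None
    for kind, name, references in list(files) + [("object", "lto.o", late_reference)]:
        if kind == "object":
            if not references:
                continue
            if state == "absent":
                state = "undefined"
            elif state == "lazy":
                state = "defined"              # reference extracts the archive member
        elif kind == "archive":
            if state == "absent":
                state, owner = "lazy", name    # remembered, not extracted yet
            elif state == "undefined":
                state, owner = "defined", name
        elif kind == "shared":
            if state in ("absent", "undefined", "lazy"):
                state, owner = "shared", name  # lazy record is discarded
    return owner

lto_link = [("object", "obj.bc", False),
            ("archive", "static.a", False),
            ("shared", "shared.so", False)]
stub = ("object", "stub.o", True)  # hypothetical ELF file with an undefined memcpy

print(winner(lto_link))            # shared.so
print(winner([stub] + lto_link))   # static.a
```

In this model the stub only helps when the static library comes before the<br>
shared library on the command line.<br>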
<br>
Peter<br>
<br>
> -Eli<br>
</blockquote></div></div>