[llvm-dev] lld symbol choice for symbol present in both a shared and a static library, with and without LTO

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 18 06:01:11 PDT 2019


On Tue, Jun 18, 2019 at 7:46 PM Peter Smith <peter.smith at linaro.org> wrote:

> On Mon, 17 Jun 2019 at 20:44, Eli Friedman <efriedma at quicinc.com> wrote:
> >
> > > -----Original Message-----
> > > From: Peter Smith <peter.smith at linaro.org>
> > > Sent: Monday, June 17, 2019 3:33 AM
> > > To: Eli Friedman <efriedma at quicinc.com>
> > > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> > > Subject: [EXT] Re: [llvm-dev] lld symbol choice for symbol present in
> > > both a shared and a static library, with and without LTO
> > >
> > > On Fri, 14 Jun 2019 at 20:58, Eli Friedman via llvm-dev
> > > <llvm-dev at lists.llvm.org> wrote:
> > > >
> > > >
> > > >
> > > > If “obj.o” is built with LTO enabled, and the function is specifically
> > > > a runtime function, the behavior is different.  For example, suppose
> > > > the IR contains a call to “llvm.memcpy”, and the generated code
> > > > eventually calls “memcpy”.  Or suppose the IR contains a “resume”
> > > > instruction, and the generated code eventually calls
> > > > “_Unwind_Resume”.  In this case, the choice is different: lld always
> > > > chooses the “memcpy” or “_Unwind_Resume” from the shared library,
> > > > ignoring the order the files are specified on the command-line.  Is
> > > > this the expected behavior?
> > >
> > > As I understand it, there is no more selection of members from static
> > > libraries after the LTO code-generator has run. In the example from
> > > the PR there is no other object with a reference to memcpy so the
> > > member containing the static definition is not loaded, leaving only
> > > the shared library to match against. I would expect if there were
> > > another reference to memcpy from a bitcode file or another ELF file
> > > and the static library was before the shared then it would match
> > > against that.
> > >
> > > As to whether this is expected or not, I don't know for certain. One
> > > desirable property of not selecting more objects from static libraries
> > > is that you are guaranteed not to load any more bitcode files from
> > > static libraries, which would either need compiling separately from
> > > the other bitcode files, or have the whole compilation done again with
> > > the new objects, which could cause more bitcode files to be loaded
> > > etc.
> >
> > For runtime functions defined in bitcode, we avoid the "double-LTO"
> > scenario you describe by including them in the LTO link even if we can't
> > prove they will be used.  This is the handleLibcall code you pointed out
> > (https://github.com/llvm-mirror/lld/blob/master/ELF/Driver.cpp#L1733).  As
> > the comment there describes, we don't do this for runtime functions which
> > are not defined in bitcode, to avoid other side-effects; instead we resolve
> > those symbols after LTO.
> >
> > For the scenario I'm describing, though, it looks like the key decision
> > here is made in SymbolTable::addShared, before handleLibcall and LTO.  If a
> > symbol is defined in both a static library and a shared library, and we
> > haven't seen a reference to the static library's symbol at that point, we
> > throw away the record of the symbol defined in the static library.
> >
> > Ultimately, I guess the question is what alternatives are possible,
> > without breaking the scenarios handleLibcall is supposed to handle.  I see
> > a few possibilities here:
> >
> > 1. Whenever we see any bitcode file, treat it as referencing every
> > possible runtime function, even those defined in non-bitcode static
> > libraries.  Then we try to resolve the __sync_val_compare_and_swap_8 issue
> > from https://reviews.llvm.org/D50475 some other way.
>
> Is it out of the question for the bitcode files to add the set of
> libcalls they may potentially call in the bitcode symbol table? If
> this were possible then handleLibcall wouldn't be necessary as all the
> dependencies would be explicit in the bitcode file symbol table. I can
> see this working if LTO only eliminates the libcall, but it would not if
> the decision between incompatible libcalls were made at LTO time.
>

I haven't thought of that before, but this might be a good idea. At least
it is not out of the question.

> > 2. Change the symbol resolution that runs after LTO to use different
> > symbol resolution rules from normal non-LTO/before-LTO symbol resolution,
> > so it finds the function from the static library instead of the shared
> > library.
>
> I think this would be tricky with LLD's current implementation, as a
> lot of the information about which candidate symbols are in which library
> is lost as part of the merging process. I think it would essentially
> be another implementation.
>
> > 3. Change symbol resolution in general to prefer "lazy" symbols from
> > static libraries over symbols from shared libraries, even outside LTO.  So
> > "static.a shared.so object.o" picks the symbol from static.a, instead of
> > shared.so like it does now.
>
> While there aren't any requirements or a specification for how a linker
> should do symbol resolution, LLD does seem to match ld.bfd: with
> --start-group memcpy.a memcpy.so input.o --end-group, the symbol from
> memcpy.so is preferred. As Rui points out, this would be risky, as it
> could open up projects to code-size increases or to more multiply defined
> symbol errors.
>
> > 4. We WONTFIX https://bugs.llvm.org/show_bug.cgi?id=42273 .
> >
>
> I guess this depends on to what extent this is a problem. If only a
> small number of programs are affected, then it can probably be resolved by
> adding an ELF file placed at the start of the command line with
> undefined references to the specific ELF libcall symbols. If it is a
> serious problem for almost everyone using LTO, then it might be worth
> adding alternative library-scanning code to LLD to handle it.
>
> Peter
>
> > -Eli
>
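
To make the behavior discussed above easier to follow, here is a rough toy
model of left-to-right symbol resolution. It is only a sketch of what is
described in this thread, not lld's actual code: the InputFile class, the
resolve function, the state names, and the file name libcall_refs.o are all
hypothetical, and the real linker tracks much more (weak symbols, visibility,
versions, and so on). The "late_refs" argument stands in for references such
as memcpy that only appear once LTO code generation has run.

```python
from dataclasses import dataclass, field


@dataclass
class InputFile:
    name: str
    kind: str  # "object", "archive" or "shared"
    defines: set = field(default_factory=set)
    references: set = field(default_factory=set)


def resolve(files, late_refs=()):
    """Toy model: return {symbol: (state, file_name)} after scanning files."""
    table = {}
    for f in files:
        if f.kind == "object":
            for sym in f.references:
                state, owner = table.get(sym, (None, None))
                if state is None:
                    table[sym] = ("undefined", f.name)
                elif state == "lazy":
                    # A reference to a lazy symbol extracts the archive member.
                    table[sym] = ("defined", owner)
        elif f.kind == "archive":
            for sym in f.defines:
                state, _ = table.get(sym, (None, None))
                if state is None:
                    table[sym] = ("lazy", f.name)
                elif state == "undefined":
                    table[sym] = ("defined", f.name)
        elif f.kind == "shared":
            for sym in f.defines:
                state, _ = table.get(sym, (None, None))
                # Assumption taken from the thread: a shared definition
                # replaces a lazy (not yet referenced) archive symbol.
                if state in (None, "undefined", "lazy"):
                    table[sym] = ("shared", f.name)
    # References created by LTO arrive after the scan; lazy records are gone.
    for sym in late_refs:
        table.setdefault(sym, ("undefined", "<lto>"))
    return table


static_a = InputFile("static.a", "archive", defines={"memcpy"})
shared_so = InputFile("shared.so", "shared", defines={"memcpy"})
obj_elf = InputFile("obj.o", "object", references={"memcpy"})
obj_lto = InputFile("obj.o (bitcode)", "object")  # memcpy not visible yet
# Hypothetical ELF object holding only undefined libcall references.
libcall_refs = InputFile("libcall_refs.o", "object", references={"memcpy"})

if __name__ == "__main__":
    cases = {
        "non-LTO, static first": ([obj_elf, static_a, shared_so], ()),
        "non-LTO, shared first": ([obj_elf, shared_so, static_a], ()),
        "LTO, static first": ([obj_lto, static_a, shared_so], ("memcpy",)),
        "LTO + libcall_refs.o": (
            [libcall_refs, obj_lto, static_a, shared_so], ("memcpy",)),
    }
    for label, (files, late) in cases.items():
        print(label, "->", resolve(files, late)["memcpy"])
```

In this model the non-LTO link follows command-line order, the LTO link ends
up with the shared definition either way, and putting the extra object with
undefined references first (the workaround Peter suggests) pulls the archive
member in before the shared library is seen.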