[llvm-dev] lld symbol choice for symbol present in both a shared and a static library, with and without LTO

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Mon Jun 17 06:36:38 PDT 2019


On Mon, Jun 17, 2019 at 4:15 AM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Sat, Jun 15, 2019 at 4:58 AM Eli Friedman via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I filed https://bugs.llvm.org/show_bug.cgi?id=42273 last night, about an
>> inconsistency between LTO and non-LTO workflows.
>>
>>
>>
>> The basic scenario is that we have an object file which calls a function
>> “foo”, a static library that provides an implementation of “foo”, and a
>> shared library that also provides an implementation of “foo”.  Currently,
>> whether lld chooses the symbol from the static library or the shared
>> library depends on the order the files are specified on the command-line.
>> For “obj.o static.a shared.so”, or “static.a obj.o shared.so”, lld chooses
>> the symbol from the static library. For any other order, it chooses the
>> symbol from the shared library.  Is this the expected behavior?  (As far as
>> I can tell, this matches binutils ld except for the “static.a obj.o
>> shared.so” case.)
>>
>
> This is what I expected. When lld visits an object file A and finds an
> undefined symbol, and there's a file B that appears before the object file
> in the command line that defines the symbol, then B gets linked. If more
> than one file defines the symbol, the leftmost one is chosen.
>
>> If “obj.o” is built with LTO enabled, and the function is specifically a
>> runtime function, the behavior is different.  For example, suppose the IR
>> contains a call to “llvm.memcpy”, and the generated code eventually calls
>> “memcpy”.  Or suppose the IR contains a “resume” instruction, and the
>> generated code eventually calls “_Unwind_Resume”.  In this case, the choice
>> is different: lld always chooses the “memcpy” or “_Unwind_Resume” from the
>> shared library, ignoring the order the files are specified on the
>> command-line.  Is this the expected behavior?
>>
>
> That's not expected, but I suspect that that only occurs when you use a
> builtin function like memcpy. Does this happen when you define some random
> function like "foo"?
>

I believe this is going to be specific to builtin functions. The reason is
that the LTO link is fed by bitcode files, which at this point contain
references to the llvm intrinsic, not the library call. So the linker,
which invokes the LTO compilation and provides the symbol resolutions, does
not see any call to e.g. "memcpy". Later, in the LTO backends, the
intrinsic is lowered according to the compiler's heuristics, either to an
inline expansion of memcpy or to a regular call to memcpy. By that point
symbol resolution has already run, so presumably the static library's
definition was never selected for it.
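
As a rough illustration (not taken from the bug report; the struct and
function names below are hypothetical), here is C source that contains no
textual call to memcpy. Clang typically represents the struct assignment as
the llvm.memcpy intrinsic in IR, and whether the final code calls memcpy is
decided only later, during code generation:

```c
#include <stddef.h>

/* Hypothetical type, made large so that a compiler is likely to prefer a
   library call over a sequence of inline moves. */
struct Packet {
    unsigned char payload[4096];
    size_t length;
};

/* No memcpy call is written here. The whole-struct assignment is the kind
   of construct that may become llvm.memcpy in bitcode, so a linker looking
   only at the bitcode symbol table would not see a reference to memcpy. */
void copy_packet(struct Packet *dst, const struct Packet *src) {
    *dst = *src;
}
```

Compiling this with something like "clang -O2 -S -emit-llvm" and then to
assembly should show whether the intrinsic and the eventual call appear, though
the outcome depends on the target and compiler version.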

For these libcalls, you can avoid this behavior by building with
-fno-builtin-memcpy (substituting the relevant libcall name), or more
generally with -fno-builtin or -ffreestanding to block them all.

Teresa

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |