[llvm-dev] [LLD] Linking static library does not resolve symbols as gold/ld

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 24 21:40:25 PDT 2017


Hi Martin,

Thank you for sending the script. I can reproduce the issue with it. It
looks like the program crashes when it tries to call
std::vector<sometype>'s ctor from a static initializer. I don't fully
understand what is causing the issue yet, but here are my observations.

  - Since you are creating a temporary object file using `ld.gold -r`, your
object file contains multiple weak definitions with the same name, as two
or more input files for `ld.gold -r` contain the same template
instantiations. This is not immediately an error, and LLD should pick one
of them for each unique name, but this might not be working well.

  - If you create a temporary object file using `ld.lld -r`, it should work.
I don't know why, though.
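
To make the first observation concrete, here is a minimal sketch of what
"pick one of them for each unique name" means. It is a simplified model and
not LLD's actual code: the `Binding`, `Definition` and `resolve` names are
hypothetical, and the "first weak definition wins" rule is an assumption that
mirrors common ELF linker behavior.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified view of a symbol definition seen by a linker.
enum class Binding { Weak, Global };

struct Definition {
    std::string name;    // symbol name (demangled here for readability)
    Binding binding;     // weak vs. global (strong) binding
    std::string origin;  // input file that provided the definition
};

// Keep exactly one definition per unique name:
//  - the first definition seen for a name is recorded,
//  - a global definition replaces a previously recorded weak one,
//  - a later weak definition never replaces anything,
//  - two global definitions of the same name are an error.
std::map<std::string, Definition> resolve(const std::vector<Definition>& defs) {
    std::map<std::string, Definition> table;
    for (const Definition& d : defs) {
        assert(!d.name.empty());  // this model has no unnamed symbols
        auto it = table.find(d.name);
        if (it == table.end()) {
            table.emplace(d.name, d);
        } else if (it->second.binding == Binding::Weak &&
                   d.binding == Binding::Global) {
            it->second = d;
        } else if (it->second.binding == Binding::Global &&
                   d.binding == Binding::Global) {
            throw std::runtime_error("duplicate symbol: " + d.name);
        }
    }
    return table;
}
```

With two weak copies of the same constructor coming from different inputs,
`resolve` keeps a single entry, and every call site is expected to be
relocated against that one survivor.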

I'll continue investigating.

On Sat, Apr 15, 2017 at 3:10 PM, Martin Richtarsky <s at martinien.de> wrote:

> Hi Rui,
>
> I finally managed to come up with a reduced example, please find it
> attached. You need to have GOLDPATH and LLDPATH set to point to the
> respective linkers.
>
> What happens in build.sh is that an object file is partially linked ("-r")
> with gold first, then this is linked with lld to another object file for
> the final executable. The resulting executable 'repro' then crashes during
> static initialization.
>
> The following changes make it work:
> 1) Using ld instead of gold for the first step
> 2) Using ld or gold for the second step
>
> 2) makes me think there must be something those linkers are doing, but lld
> is not, that makes the whole thing work. But note that the crash happens
> in a constructor. I found this for the "-r" option in the ld manpage here:
>
> https://linux.die.net/man/1/ld
>
> "When linking C++ programs, this option will not resolve references to
> constructors; to do that, use -Ur."
>
> However, gold does not know that option (and ld already works without it).
>
> Any idea what is going wrong here?
>
> Thanks and best regards
> Martin
>
> > Hi Martin,
> >
> > It's hard to tell what is wrong from that information alone. If that is an
> > open-source program, can you give me a link to it so that I can try? If
> > that's proprietary software you cannot share with me, you might want to
> > produce a small reproducible test case.
> >
> > On Thu, Mar 23, 2017 at 1:10 AM, Martin Richtarsky <s at martinien.de> wrote:
> >
> >> Hi Rui,
> >>
> >> fyi I'm still working on a reproducer I can share.
> >>
> >> >> Here is the relevant output:
> >> >>
> >> >> 0000000000013832 <func()>:
> >> >>    13832:       55                      push   %rbp
> >> >>    13833:       48 89 e5                mov    %rsp,%rbp
> >> >>    13836:       53                      push   %rbx
> >> >>    13837:       48 83 ec 18             sub    $0x18,%rsp
> >> >>    1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
> >> >>    1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
> >> >>    13843:       48 89 c7                mov    %rax,%rdi
> >> >>    13846:       e8 00 00 00 00          callq  1384b <func()+0x19>
> >> >>                         13847: R_X86_64_PLT32   std::vector<record, std::allocator<record> >::vector()-0x4
> >> >>    ....
> >> >>
> >> >
> >> > This seems a bit odd. You have type `record` and instantiate std::vector
> >> > with `record`. Usually the instantiated template function is in the same
> >> > compilation unit, and the relocation type is R_X86_64_PC32, not
> >> > R_X86_64_PLT32.
> >>
> >> It seems to me R_X86_64_PLT32 is not so unusual in this case, e.g. -fPIC
> >> already produces this relocation:
> >>
> >> $ cat example.cpp
> >> #include <vector>
> >> #include <string>
> >>
> >> class PropertyReader
> >> {
> >> public:
> >>     struct record
> >>     {
> >>       std::string a;
> >>       std::string b;
> >>     };
> >>     PropertyReader();
> >> private:
> >>     std::vector<record> records;
> >> };
> >>
> >> PropertyReader::PropertyReader() : records()
> >> {
> >> }
> >>
> >> $ g++ -fPIC -c example.cpp -o example.o
> >> $ objdump -d -r -C example.o
> >> ...
> >> 0000000000000000 <PropertyReader::PropertyReader()>:
> >>    0:   55                      push   %rbp
> >>    1:   48 89 e5                mov    %rsp,%rbp
> >>    4:   48 83 ec 10             sub    $0x10,%rsp
> >>    8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
> >>    c:   48 8b 45 f8             mov    -0x8(%rbp),%rax
> >>   10:   48 89 c7                mov    %rax,%rdi
> >>   13:   e8 00 00 00 00          callq  18 <PropertyReader::PropertyReader()+0x18>
> >>                         14: R_X86_64_PLT32   std::vector<PropertyReader::record, std::allocator<PropertyReader::record> >::vector()-0x4
> >>   18:   90                      nop
> >>   19:   c9                      leaveq
> >>   1a:   c3                      retq
> >> ...
> >>
> >> But linking such an object file with lld does not produce the original
> >> error so something else is going on.
> >>
> >> >> Let me know if more is needed.
> >> >>
> >> >> I recall that this object file is created in a bit unusual way, something
> >> >> like partially linking several other object files together into this one,
> >> >> but I will have to dig deeper to say for sure.
> >> >>
> >> >
> >> > Yes, it looks like the object file is created in an unusual way, and that
> >> > revealed a subtle difference between ld.gold and ld.lld. I want to know
> >> > more about that.
> >> >
> >> >
> >> >> Best regards
> >> >> Martin
> >> >>
> >> >> Rui Ueyama wrote:
> >> >> > Compilers don't know about functions that are not defined in the same
> >> >> > compilation unit, so they leave call instruction operands as zero (because
> >> >> > they can't compute either an absolute or a relative address of the
> >> >> > destinations), and let linkers fix the address by binary patching.
> >> >> >
> >> >> > So, what you are seeing is likely a bug in LLD where it fails to fix the
> >> >> > address for some reason.
> >> >> >
> >> >> > Can you dump that function with `objdump -d -r that-file.o`? With the -r
> >> >> > option, objdump prints out relocation records. Relocation records are the
> >> >> > information that linkers use to fix addresses.
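
The "binary patching" described above can be sketched as follows. This is an
illustration rather than any linker's real implementation: `pcrel32` and
`patch_le32` are hypothetical helper names, and the formula is the S + A - P
computation that the x86-64 psABI defines for R_X86_64_PC32. It is assumed
here that R_X86_64_PLT32 is computed the same way once the linker binds the
call directly to the function instead of a PLT entry.

```cpp
#include <cassert>
#include <cstdint>

// Value stored in a 32-bit PC-relative relocation field:
//   S = address of the target symbol
//   A = addend from the relocation record (the "-0x4" shown by objdump)
//   P = address of the field being patched
// The arithmetic wraps modulo 2^64 and is then truncated to 32 bits.
std::uint32_t pcrel32(std::uint64_t S, std::int64_t A, std::uint64_t P) {
    return static_cast<std::uint32_t>(S + static_cast<std::uint64_t>(A) - P);
}

// Write a 32-bit value into the instruction stream in little-endian order,
// which is how x86-64 encodes the call operand.
void patch_le32(std::uint8_t* loc, std::uint32_t value) {
    assert(loc != nullptr);
    for (int i = 0; i < 4; ++i) {
        loc[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}
```

Using the addresses from the gold-linked output quoted below (call opcode at
0x18b462, so the operand field is at 0x18b463, target 0x68568c, addend -4),
`pcrel32(0x68568c, -4, 0x18b463)` gives 0x4fa225, so the patched instruction
bytes would be e8 25 a2 4f 00. The CPU adds that operand to the address of
the next instruction (0x18b467), which lands on 0x68568c.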
> >> >> >
> >> >> > On Wed, Mar 15, 2017 at 9:25 AM, Martin Richtarsky <s at martinien.de> wrote:
> >> >> >
> >> >> >> Hi all,
> >> >> >>
> >> >> >> I'm currently trying out lld on a large project. We are currently using
> >> >> >> gold (and used GNU ld before that).
> >> >> >>
> >> >> >> I have come across a few minor issues but could work around them:
> >> >> >> - Missing support for --defsym=symbol1=symbol2,
> >> >> >> --warn-unknown-eh-frame-section, --exclude-libs
> >> >> >>
> >> >> >> There are two other issues which are more critical, one of which is
> >> >> >> currently blocking me, so I would like to find a solution for this one
> >> >> >> first.
> >> >> >>
> >> >> >> I have a static library that is linked into an executable. The binary
> >> >> >> produced by lld crashes, while the gold version runs fine.
> >> >> >>
> >> >> >> The difference is in the call instructions below. The original object file
> >> >> >> from the archive has an address of zero in the call instruction:
> >> >> >>
> >> >> >> 0000000000013832 <func>:
> >> >> >>    13832:       55                      push   %rbp
> >> >> >>    13833:       48 89 e5                mov    %rsp,%rbp
> >> >> >>    13836:       53                      push   %rbx
> >> >> >>    13837:       48 83 ec 18             sub    $0x18,%rsp
> >> >> >>    1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
> >> >> >>    1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
> >> >> >>    13843:       48 89 c7                mov    %rax,%rdi
> >> >> >> -> 13846:       e8 00 00 00 00          callq  1384b <func+0x19>
> >> >> >>    1384b:       48 8b 45 e8             mov    -0x18(%rbp),%rax
> >> >> >>
> >> >> >> gdb displays this as a jump to the next instruction:
> >> >> >>
> >> >> >>    0x0000000000013832 <+0>:     push   %rbp
> >> >> >>    0x0000000000013833 <+1>:     mov    %rsp,%rbp
> >> >> >>    0x0000000000013836 <+4>:     push   %rbx
> >> >> >>    0x0000000000013837 <+5>:     sub    $0x18,%rsp
> >> >> >>    0x000000000001383b <+9>:     mov    %rdi,-0x18(%rbp)
> >> >> >>    0x000000000001383f <+13>:    mov    -0x18(%rbp),%rax
> >> >> >>    0x0000000000013843 <+17>:    mov    %rax,%rdi
> >> >> >>    0x0000000000013846 <+20>:    callq  0x1384b <func()+25>
> >> >> >>    0x000000000001384b <+25>:    mov    -0x18(%rbp),%rax
> >> >> >>
> >> >> >> However, in the executable linked by gold, the calls are magically
> >> >> >> resolved:
> >> >> >>
> >> >> >>    0x000000000018b44e <+0>:     push   %rbp
> >> >> >>    0x000000000018b44f <+1>:     mov    %rsp,%rbp
> >> >> >>    0x000000000018b452 <+4>:     push   %rbx
> >> >> >>    0x000000000018b453 <+5>:     sub    $0x18,%rsp
> >> >> >>    0x000000000018b457 <+9>:     mov    %rdi,-0x18(%rbp)
> >> >> >>    0x000000000018b45b <+13>:    mov    -0x18(%rbp),%rax
> >> >> >>    0x000000000018b45f <+17>:    mov    %rax,%rdi
> >> >> >>    0x000000000018b462 <+20>:    callq  0x68568c <std::vector<record, std::allocator<record> >::vector()>
> >> >> >>    0x000000000018b467 <+25>:    mov    -0x18(%rbp),%rax
> >> >> >>
> >> >> >> Even more interesting, several such call instructions with argument 0 are
> >> >> >> resolved to different functions. So somewhere there must be information
> >> >> >> stored about which functions they resolve to.
> >> >> >>
> >> >> >> lld produces this code:
> >> >> >>
> >> >> >>    0x00005555559f304e <+0>:     push   %rbp
> >> >> >>    0x00005555559f304f <+1>:     mov    %rsp,%rbp
> >> >> >>    0x00005555559f3052 <+4>:     push   %rbx
> >> >> >>    0x00005555559f3053 <+5>:     sub    $0x18,%rsp
> >> >> >>    0x00005555559f3057 <+9>:     mov    %rdi,-0x18(%rbp)
> >> >> >>    0x00005555559f305b <+13>:    mov    -0x18(%rbp),%rax
> >> >> >>    0x00005555559f305f <+17>:    mov    %rax,%rdi
> >> >> >>    0x00005555559f3062 <+20>:    callq  0x555555554000
> >> >> >>    0x00005555559f3067 <+25>:    mov    -0x18(%rbp),%rax
> >> >> >>
> >> >> >> 0x555555554000 is the start of the mapped region of the executable, so it
> >> >> >> seems lld just adds the argument 0 to that without doing any relocation
> >> >> >> processing.
> >> >> >>
> >> >> >> Is this a known limitation of lld?
> >> >> >>
> >> >> >> Thanks and best regards,
> >> >> >> Martin
> >> >> >>
> >> >> > _______________________________________________
> >> >> > LLVM Developers mailing list
> >> >> > llvm-dev at lists.llvm.org
> >> >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >>
> >> >>
> >> >
> >>
> >>
> >> --
> >> http://www.martinien.de/
> >>
> >>
> >>
>