[llvm-dev] [LLD] Linking static library does not resolve symbols as gold/ld

Martin Richtarsky via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 23 01:10:29 PDT 2017


Hi Rui,

FYI, I'm still working on a reproducer that I can share.

>> Here is the relevant output:
>>
>> 0000000000013832 <func()>:
>>    13832:       55                      push   %rbp
>>    13833:       48 89 e5                mov    %rsp,%rbp
>>    13836:       53                      push   %rbx
>>    13837:       48 83 ec 18             sub    $0x18,%rsp
>>    1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
>>    1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>>    13843:       48 89 c7                mov    %rax,%rdi
>>    13846:       e8 00 00 00 00          callq  1384b <func()+0x19>
>>                         13847: R_X86_64_PLT32   std::vector<record, std::allocator<record> >::vector()-0x4
>>    ....
>>
>
> This seems a bit odd. You have type `record` and instantiate std::vector
> with `record`. Usually the instantiated template function is in the same
> compilation unit, and the relocation type is R_X86_64_PC32, not
> R_X86_64_PLT32.

R_X86_64_PLT32 does not seem that unusual to me in this case. For example,
compiling with -fPIC alone already produces this relocation:

$ cat example.cpp
#include <vector>
#include <string>

class PropertyReader
{
public:
    struct record
    {
      std::string a;
      std::string b;
    };
    PropertyReader();
private:
    std::vector<record> records;
};

PropertyReader::PropertyReader() : records()
{
}

$ g++ -fPIC -c example.cpp -o example.o
$ objdump -d -r -C example.o
...
0000000000000000 <PropertyReader::PropertyReader()>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  10:   48 89 c7                mov    %rax,%rdi
  13:   e8 00 00 00 00          callq  18 <PropertyReader::PropertyReader()+0x18>
                        14: R_X86_64_PLT32      std::vector<PropertyReader::record, std::allocator<PropertyReader::record> >::vector()-0x4
  18:   90                      nop
  19:   c9                      leaveq
  1a:   c3                      retq
...

However, linking this object file with lld does not reproduce the original
error, so something else must be going on.
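
As a sanity check on what a linker is expected to do with this relocation,
here is a minimal sketch of the arithmetic. It assumes the formulas from the
System V x86-64 psABI (S + A - P for R_X86_64_PC32, L + A - P for
R_X86_64_PLT32, where L equals S when the call is bound directly to the
definition and no PLT entry is used). The helper names pcrel_field and
call_target are hypothetical. The addresses come from the objdump and gdb
listings quoted below, and the link-time address 0x49f063 assumes the image
was loaded at 0x555555554000.

```cpp
#include <cstdint>

// Value stored in the 32-bit field of a PC-relative call: S + A - P,
// truncated to 32 bits (relies on two's-complement wraparound).
//   S = address of the target symbol
//   A = addend from the relocation record (-4 in the listings)
//   P = address of the field being patched (call opcode address + 1)
constexpr std::int32_t pcrel_field(std::uint64_t S, std::int64_t A,
                                   std::uint64_t P) {
    return static_cast<std::int32_t>(S + A - P);
}

// Address the CPU transfers to: end of the 4-byte field plus the
// sign-extended displacement stored in it.
constexpr std::uint64_t call_target(std::uint64_t P, std::int32_t field) {
    return P + 4 + field;
}

// Unpatched object file: the field at 0x13847 is zero, so the call
// appears to go to the next instruction, as objdump and gdb show.
static_assert(call_target(0x13847, 0) == 0x1384b, "unpatched call");

// gold output: call at 0x18b462, so P = 0x18b463; target is 0x68568c.
static_assert(pcrel_field(0x68568c, -4, 0x18b463) == 0x4fa225, "gold field");
static_assert(call_target(0x18b463, 0x4fa225) == 0x68568c, "gold target");

// lld output: call at 0x5555559f3062 jumps to 0x555555554000. Relative to
// the assumed load base, P = 0x49f063, and S = 0 gives the same field.
static_assert(pcrel_field(0, -4, 0x49f063) == -0x49f067, "lld field");
static_assert(call_target(0x5555559f3063ULL, -0x49f067) == 0x555555554000ULL,
              "lld target");
```

Compiling this with `g++ -std=c++11 -fsyntax-only` checks the static_asserts.
Note that a field left at zero would make the call land on the following
instruction (0x5555559f3067), not on 0x555555554000. The lld output is
therefore what a symbol value of 0 would produce. That is only one possible
reading of the listing and is not confirmed anywhere in this thread.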

>> Let me know if more is needed.
>>
>> I recall that this object file is created in a somewhat unusual way,
>> something like partially linking several other object files together
>> into this one, but I will have to dig deeper to say for sure.
>>
>
> Yes, it looks like the object file is created in an unusual way, and that
> revealed a subtle difference between ld.gold and ld.lld. I want to know
> more about that.
>
>
>> Best regards
>> Martin
>>
>> Rui Ueyama wrote:
>> > Compilers don't know about functions that are not defined in the same
>> > compilation unit, so they leave call instruction operands as zero
>> > (because they can't compute either an absolute or a relative address of
>> > the destinations), and let linkers fix the address by binary patching.
>> >
>> > So, what you are seeing is likely a bug in LLD: it fails to fix the
>> > address for some reason.
>> >
>> > Can you dump that function with `objdump -d -r that-file.o`? With the -r
>> > option, objdump prints out relocation records. Relocation records are
>> > the information that linkers use to fix addresses.
>> >
>> > On Wed, Mar 15, 2017 at 9:25 AM, Martin Richtarsky <s at martinien.de> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I'm currently trying out lld on a large project. We are currently using
>> >> gold (and used GNU ld before that).
>> >>
>> >> I have come across a few minor issues but could work around them:
>> >> - Missing support for --defsym=symbol1=symbol2,
>> >> --warn-unknown-eh-frame-section, --exclude-libs
>> >>
>> >> There are two other issues which are more critical, one of which is
>> >> currently blocking me, so I would like to find a solution for this one
>> >> first.
>> >>
>> >> I have a static library that is linked into an executable. The binary
>> >> produced by lld crashes, while the gold version runs fine.
>> >>
>> >> The difference is in the call instructions below. The original object
>> >> file from the archive has an address of zero in the call instruction:
>> >>
>> >> 0000000000013832 <func>:
>> >>    13832:       55                      push   %rbp
>> >>    13833:       48 89 e5                mov    %rsp,%rbp
>> >>    13836:       53                      push   %rbx
>> >>    13837:       48 83 ec 18             sub    $0x18,%rsp
>> >>    1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
>> >>    1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>> >>    13843:       48 89 c7                mov    %rax,%rdi
>> >> -> 13846:       e8 00 00 00 00          callq  1384b <func+0x19>
>> >>    1384b:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>> >>
>> >> gdb displays this as a jump to the next instruction:
>> >>
>> >>    0x0000000000013832 <+0>:     push   %rbp
>> >>    0x0000000000013833 <+1>:     mov    %rsp,%rbp
>> >>    0x0000000000013836 <+4>:     push   %rbx
>> >>    0x0000000000013837 <+5>:     sub    $0x18,%rsp
>> >>    0x000000000001383b <+9>:     mov    %rdi,-0x18(%rbp)
>> >>    0x000000000001383f <+13>:    mov    -0x18(%rbp),%rax
>> >>    0x0000000000013843 <+17>:    mov    %rax,%rdi
>> >>    0x0000000000013846 <+20>:    callq  0x1384b <func()+25>
>> >>    0x000000000001384b <+25>:    mov    -0x18(%rbp),%rax
>> >>
>> >> However, in the executable linked by gold, the calls are magically
>> >> resolved:
>> >>
>> >>    0x000000000018b44e <+0>:     push   %rbp
>> >>    0x000000000018b44f <+1>:     mov    %rsp,%rbp
>> >>    0x000000000018b452 <+4>:     push   %rbx
>> >>    0x000000000018b453 <+5>:     sub    $0x18,%rsp
>> >>    0x000000000018b457 <+9>:     mov    %rdi,-0x18(%rbp)
>> >>    0x000000000018b45b <+13>:    mov    -0x18(%rbp),%rax
>> >>    0x000000000018b45f <+17>:    mov    %rax,%rdi
>> >>    0x000000000018b462 <+20>:    callq  0x68568c <std::vector<record, std::allocator<record> >::vector()>
>> >>    0x000000000018b467 <+25>:    mov    -0x18(%rbp),%rax
>> >>
>> >> Even more interesting, several such call instructions with argument 0
>> >> are resolved to different functions. So somewhere there must be
>> >> information stored about which functions they resolve to.
>> >>
>> >> lld produces this code:
>> >>
>> >>    0x00005555559f304e <+0>:     push   %rbp
>> >>    0x00005555559f304f <+1>:     mov    %rsp,%rbp
>> >>    0x00005555559f3052 <+4>:     push   %rbx
>> >>    0x00005555559f3053 <+5>:     sub    $0x18,%rsp
>> >>    0x00005555559f3057 <+9>:     mov    %rdi,-0x18(%rbp)
>> >>    0x00005555559f305b <+13>:    mov    -0x18(%rbp),%rax
>> >>    0x00005555559f305f <+17>:    mov    %rax,%rdi
>> >>    0x00005555559f3062 <+20>:    callq  0x555555554000
>> >>    0x00005555559f3067 <+25>:    mov    -0x18(%rbp),%rax
>> >>
>> >> 0x555555554000 is the start of the mapped region of the executable, so
>> >> it seems lld just adds the argument 0 to that without doing any
>> >> relocation processing.
>> >>
>> >> Is this a known limitation of lld?
>> >>
>> >> Thanks and best regards,
>> >> Martin
>> >>
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>


-- 
http://www.martinien.de/



