[llvm-dev] [LLD] Linking static library does not resolve symbols as gold/ld

Martin Richtarsky via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 15 14:22:28 PDT 2017


Here is the relevant output:

0000000000013832 <func()>:
   13832:       55                      push   %rbp
   13833:       48 89 e5                mov    %rsp,%rbp
   13836:       53                      push   %rbx
   13837:       48 83 ec 18             sub    $0x18,%rsp
   1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
   1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
   13843:       48 89 c7                mov    %rax,%rdi
   13846:       e8 00 00 00 00          callq  1384b <func()+0x19>
                        13847: R_X86_64_PLT32   std::vector<record, std::allocator<record> >::vector()-0x4
   ....
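
For reference, here is a rough sketch of what a linker is expected to do
with a record like this one. It assumes the usual x86-64 calculation for a
PC-relative field, S + A - P (symbol address plus addend minus the address
of the patched field), and treats R_X86_64_PLT32 like a plain PC-relative
relocation, which is only valid when the call is bound directly to the
function rather than to a PLT entry. The helper names apply_pc32 and
call_target are made up for illustration; the addresses are the ones from
the gold-linked executable quoted below.

```python
import struct


def apply_pc32(section: bytearray, section_addr: int, offset: int,
               symbol_addr: int, addend: int) -> None:
    """Patch a 32-bit PC-relative field in place (assumed formula: S + A - P)."""
    place = section_addr + offset           # P: address of the field being patched
    value = symbol_addr + addend - place    # S + A - P
    struct.pack_into("<i", section, offset, value)  # signed 32-bit, little-endian


def call_target(section: bytes, section_addr: int, call_offset: int) -> int:
    """Decode the destination of an `e8 rel32` call located at call_offset."""
    assert section[call_offset] == 0xE8, "not a near call"
    (rel32,) = struct.unpack_from("<i", section, call_offset + 1)
    next_insn = section_addr + call_offset + 5  # the call is 5 bytes long
    return next_insn + rel32


# The call as it sits in the object file: opcode e8, operand still zero.
code = bytearray.fromhex("e800000000")
call_addr = 0x18B462        # address of the call in the gold output
symbol_addr = 0x68568C      # where gold placed vector<record>::vector()

# Unpatched, the call decodes to the instruction right after it.
print(hex(call_target(code, call_addr, 0)))

# The relocation sits one byte into the instruction and has addend -4.
apply_pc32(code, call_addr, 1, symbol_addr, -4)
print(code.hex(), hex(call_target(code, call_addr, 0)))
```

With a zero operand the decoded target is simply the next instruction,
which matches what gdb shows for the object file. After patching, the
target is the address gold resolved the call to.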

Let me know if more is needed.

I recall that this object file is created in a somewhat unusual way,
something like partially linking several other object files together into
this one, but I will have to dig deeper to say for sure.

Best regards
Martin

Rui Ueyama wrote:
> Compilers don't know the addresses of functions that are not defined in
> the same compilation unit, so they leave the operands of call instructions
> as zero (because they can't compute either an absolute or a relative
> address for the destination) and let the linker fix up the address by
> binary patching.
>
> So what you are seeing is likely a bug in LLD: it fails to fix up the
> address for some reason.
>
> Can you dump that function with `objdump -d -r that-file.o`? With the -r
> option, objdump prints out relocation records. Relocation records are the
> information that linkers use to fix addresses.
>
> On Wed, Mar 15, 2017 at 9:25 AM, Martin Richtarsky <s at martinien.de> wrote:
>
>> Hi all,
>>
>> I'm currently trying out lld on a large project. We are currently using
>> gold (and used GNU ld before that).
>>
>> I have come across a few minor issues but was able to work around them:
>> - Missing support for --defsym=symbol1=symbol2,
>> --warn-unknown-eh-frame-section, --exclude-libs
>>
>> There are two other issues which are more critical, one of which is
>> currently blocking me, so I would like to find a solution for this one
>> first.
>>
>> I have a static library that is linked into an executable. The binary
>> produced by lld crashes, while the gold version runs fine.
>>
>> The difference is in the call instructions below. The original object
>> file from the archive has an address of zero in the call instruction:
>>
>> 0000000000013832 <func>:
>>    13832:       55                      push   %rbp
>>    13833:       48 89 e5                mov    %rsp,%rbp
>>    13836:       53                      push   %rbx
>>    13837:       48 83 ec 18             sub    $0x18,%rsp
>>    1383b:       48 89 7d e8             mov    %rdi,-0x18(%rbp)
>>    1383f:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>>    13843:       48 89 c7                mov    %rax,%rdi
>> -> 13846:       e8 00 00 00 00          callq  1384b <func+0x19>
>>    1384b:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>>
>> gdb displays this as a jump to the next instruction:
>>
>>    0x0000000000013832 <+0>:     push   %rbp
>>    0x0000000000013833 <+1>:     mov    %rsp,%rbp
>>    0x0000000000013836 <+4>:     push   %rbx
>>    0x0000000000013837 <+5>:     sub    $0x18,%rsp
>>    0x000000000001383b <+9>:     mov    %rdi,-0x18(%rbp)
>>    0x000000000001383f <+13>:    mov    -0x18(%rbp),%rax
>>    0x0000000000013843 <+17>:    mov    %rax,%rdi
>>    0x0000000000013846 <+20>:    callq  0x1384b <func()+25>
>>    0x000000000001384b <+25>:    mov    -0x18(%rbp),%rax
>>
>> However, in the executable linked by gold, the calls are magically
>> resolved:
>>
>>    0x000000000018b44e <+0>:     push   %rbp
>>    0x000000000018b44f <+1>:     mov    %rsp,%rbp
>>    0x000000000018b452 <+4>:     push   %rbx
>>    0x000000000018b453 <+5>:     sub    $0x18,%rsp
>>    0x000000000018b457 <+9>:     mov    %rdi,-0x18(%rbp)
>>    0x000000000018b45b <+13>:    mov    -0x18(%rbp),%rax
>>    0x000000000018b45f <+17>:    mov    %rax,%rdi
>>    0x000000000018b462 <+20>:    callq  0x68568c <std::vector<record, std::allocator<record> >::vector()>
>>    0x000000000018b467 <+25>:    mov    -0x18(%rbp),%rax
>>
>> Even more interesting, several such call instructions with argument 0 are
>> resolved to different functions. So there must be information stored
>> somewhere about which functions they resolve to.
>>
>> lld produces this code:
>>
>>    0x00005555559f304e <+0>:     push   %rbp
>>    0x00005555559f304f <+1>:     mov    %rsp,%rbp
>>    0x00005555559f3052 <+4>:     push   %rbx
>>    0x00005555559f3053 <+5>:     sub    $0x18,%rsp
>>    0x00005555559f3057 <+9>:     mov    %rdi,-0x18(%rbp)
>>    0x00005555559f305b <+13>:    mov    -0x18(%rbp),%rax
>>    0x00005555559f305f <+17>:    mov    %rax,%rdi
>>    0x00005555559f3062 <+20>:    callq  0x555555554000
>>    0x00005555559f3067 <+25>:    mov    -0x18(%rbp),%rax
>>
>> 0x555555554000 is the start of the mapped region of the executable, so it
>> seems lld just adds the argument 0 to that without doing any relocation
>> processing.
>>
>> Is this a known limitation of lld?
>>
>> Thanks and best regards,
>> Martin
>>