<div dir="ltr">Hi Martin,<div><br></div><div>Thank you for sending the script. I can reproduce the issue with it. It looks like the program crashes when it tries to call std::vector&lt;sometype&gt;'s ctor from a static initializer. I don't fully understand what is causing the issue yet, but here are my observations.</div><div><br></div><div> - Since you are creating a temporary object file using `ld.gold -r`, your object file contains multiple weak definitions with the same name, because two or more input files for `ld.gold -r` contain the same template instantiations. This is not in itself an error, and LLD should pick one of them for each unique name, but this might not be working well.</div><div><br></div><div> - If you create a temporary object file using `ld.lld -r`, it should work. I don't know why, though.</div><div><br></div><div>I'll continue investigating.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Apr 15, 2017 at 3:10 PM, Martin Richtarsky <span dir="ltr">&lt;<a href="mailto:s@martinien.de" target="_blank">s@martinien.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Rui,<br>
<br>
I finally managed to come up with a reduced example, please find it<br>
attached. You need to have GOLDPATH and LLDPATH set to point to the<br>
respective linkers.<br>
<br>
What happens in build.sh is that an object file is partially linked ("-r")<br>
with gold first, then this is linked with lld to another object file for<br>
the final executable. The resulting executable 'repro' then crashes during<br>
static initialization.<br>
<br>
The following changes make it work:<br>
1) Using ld instead of gold for the first step<br>
2) Using ld or gold for the second step<br>
<br>
2) makes me think there must be something those linkers are doing, but lld<br>
is not, that makes the whole thing work. But note that the crash happens<br>
in a constructor. I found this for the "-r" option in the ld manpage here:<br>
<br>
<a href="https://linux.die.net/man/1/ld" rel="noreferrer" target="_blank">https://linux.die.net/man/1/ld</a><br>
<br>
"When linking C++ programs, this option will not resolve references to<br>
constructors; to do that, use -Ur."<br>
<br>
However, gold does not know that option (and ld already works without it).<br>
<br>
Any idea what is going wrong here?<br>
<br>
Thanks and best regards<br>
<span class="HOEnZb"><font color="#888888">Martin<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> Hi Martin,<br>
><br>
> It's hard to tell what is wrong from this information alone. If that is an<br>
> open-source program, can you give me a link to it so that I can try? If<br>
> it's proprietary software that you cannot share with me, you might want to<br>
> produce a small reproducible test case.<br>
><br>
> On Thu, Mar 23, 2017 at 1:10 AM, Martin Richtarsky <<a href="mailto:s@martinien.de">s@martinien.de</a>> wrote:<br>
><br>
>> Hi Rui,<br>
>><br>
>> fyi I'm still working on a reproducer I can share.<br>
>><br>
>> >> Here is the relevant output:<br>
>> >><br>
>> >> 0000000000013832 <func()>:<br>
>> >> 13832: 55 push %rbp<br>
>> >> 13833: 48 89 e5 mov %rsp,%rbp<br>
>> >> 13836: 53 push %rbx<br>
>> >> 13837: 48 83 ec 18 sub $0x18,%rsp<br>
>> >> 1383b: 48 89 7d e8 mov %rdi,-0x18(%rbp)<br>
>> >> 1383f: 48 8b 45 e8 mov -0x18(%rbp),%rax<br>
>> >> 13843: 48 89 c7 mov %rax,%rdi<br>
>> >> 13846: e8 00 00 00 00 callq 1384b <func()+0x19><br>
>> >> 13847: R_X86_64_PLT32 std::vector<record,<br>
>> >> std::allocator<record> >::vector()-0x4<br>
>> >> ....<br>
>> >><br>
>> ><br>
>> > This seems a bit odd. You have type `record` and instantiate<br>
>> std::vector<br>
>> > with `record`. Usually the instantiated template function is in the<br>
>> same<br>
>> > compilation unit, and the relocation type is R_X86_64_PC32, not<br>
>> > R_X86_64_PLT32.<br>
>><br>
>> It seems to me R_X86_64_PLT32 is not so unusual in this case, e.g. -fPIC<br>
>> already produces this relocation:<br>
>><br>
>> $ cat example.cpp<br>
>> #include <vector><br>
>> #include <string><br>
>><br>
>> class PropertyReader<br>
>> {<br>
>> public:<br>
>> struct record<br>
>> {<br>
>> std::string a;<br>
>> std::string b;<br>
>> };<br>
>> PropertyReader();<br>
>> private:<br>
>> std::vector<record> records;<br>
>> };<br>
>><br>
>> PropertyReader::PropertyReader() : records()<br>
>> {<br>
>> }<br>
>><br>
>> $ g++ -fPIC -c example.cpp -o example.o<br>
>> $ objdump -d -r -C example.o<br>
>> ...<br>
>> 0000000000000000 <PropertyReader::PropertyReader()>:<br>
>> 0: 55 push %rbp<br>
>> 1: 48 89 e5 mov %rsp,%rbp<br>
>> 4: 48 83 ec 10 sub $0x10,%rsp<br>
>> 8: 48 89 7d f8 mov %rdi,-0x8(%rbp)<br>
>> c: 48 8b 45 f8 mov -0x8(%rbp),%rax<br>
>> 10: 48 89 c7 mov %rax,%rdi<br>
>> 13: e8 00 00 00 00 callq 18 <PropertyReader::PropertyReader()+0x18><br>
>> 14: R_X86_64_PLT32 std::vector<PropertyReader::record, std::allocator<PropertyReader::record> >::vector()-0x4<br>
>> 18: 90 nop<br>
>> 19: c9 leaveq<br>
>> 1a: c3 retq<br>
>> ...<br>
>><br>
>> But linking such an object file with lld does not produce the original<br>
>> error so something else is going on.<br>
>><br>
>> > Let me know if more is needed.<br>
>> >><br>
>> >> I recall that this object file is created in a somewhat unusual way,<br>
>> >> something<br>
>> >> like partially linking several other object files together into this<br>
>> >> one,<br>
>> >> but I will have to dig deeper to say for sure.<br>
>> >><br>
>> ><br>
>> > Yes, it looks like the object file is created in an unusual way, and<br>
>> that<br>
>> > revealed a subtle difference between ld.gold and ld.lld. I want to<br>
>> know<br>
>> > more about that.<br>
>> ><br>
>> ><br>
>> >> Best regards<br>
>> >> Martin<br>
>> >><br>
>> >> Rui Ueyama wrote:<br>
>> >> > Compilers don't know about functions that are not defined in the<br>
>> same<br>
>> >> > compilation unit, so they leave call instruction operands as zero<br>
>> >> (because<br>
>> >> > they can't compute either the absolute or the relative address of the<br>
>> >> destinations),<br>
>> >> > and let linkers fix the address by binary patching.<br>
>> >> ><br>
>> >> > So, what you are seeing is likely a bug in LLD, in that it fails to fix<br>
>> >> the<br>
>> >> > address for some reason.<br>
>> >> ><br>
>> >> > Can you dump that function with `objdump -d -r that-file.o`? With<br>
>> the<br>
>> >> -r<br>
>> >> > option, objdump prints out relocation records. Relocation records<br>
>> are<br>
>> >> the<br>
>> >> > information that linkers use to fix addresses.<br>
>> >> ><br>
>> >> > On Wed, Mar 15, 2017 at 9:25 AM, Martin Richtarsky <<a href="mailto:s@martinien.de">s@martinien.de</a>><br>
>> >> wrote:<br>
>> >> ><br>
>> >> >> Hi all,<br>
>> >> >><br>
>> >> >> I'm currently trying out lld on a large project. We are currently<br>
>> >> using<br>
>> >> >> gold (and used GNU ld before that).<br>
>> >> >><br>
>> >> >> I have come across a few minor issues but could work around them:<br>
>> >> >> - Missing support for --defsym=symbol1=symbol2,<br>
>> >> >> --warn-unknown-eh-frame-section, --exclude-libs<br>
>> >> >><br>
>> >> >> There are two other issues which are more critical, one of which<br>
>> is<br>
>> >> >> currently blocking me, so I would like to find a solution for this<br>
>> >> one<br>
>> >> >> first.<br>
>> >> >><br>
>> >> >> I have a static library that is linked into an executable. The<br>
>> binary<br>
>> >> >> produced by lld crashes, while the gold version runs fine.<br>
>> >> >><br>
>> >> >> The difference is in the call instructions below. The original<br>
>> object<br>
>> >> >> file<br>
>> >> >> from the archive has an address of zero in the call instruction:<br>
>> >> >><br>
>> >> >> 0000000000013832 <func>:<br>
>> >> >> 13832: 55 push %rbp<br>
>> >> >> 13833: 48 89 e5 mov %rsp,%rbp<br>
>> >> >> 13836: 53 push %rbx<br>
>> >> >> 13837: 48 83 ec 18 sub $0x18,%rsp<br>
>> >> >> 1383b: 48 89 7d e8 mov %rdi,-0x18(%rbp)<br>
>> >> >> 1383f: 48 8b 45 e8 mov -0x18(%rbp),%rax<br>
>> >> >> 13843: 48 89 c7 mov %rax,%rdi<br>
>> >> >> -> 13846: e8 00 00 00 00 callq 1384b <func+0x19><br>
>> >> >> 1384b: 48 8b 45 e8 mov -0x18(%rbp),%rax<br>
>> >> >><br>
>> >> >> gdb displays this as a jump to the next instruction:<br>
>> >> >><br>
>> >> >> 0x0000000000013832 <+0>: push %rbp<br>
>> >> >> 0x0000000000013833 <+1>: mov %rsp,%rbp<br>
>> >> >> 0x0000000000013836 <+4>: push %rbx<br>
>> >> >> 0x0000000000013837 <+5>: sub $0x18,%rsp<br>
>> >> >> 0x000000000001383b <+9>: mov %rdi,-0x18(%rbp)<br>
>> >> >> 0x000000000001383f <+13>: mov -0x18(%rbp),%rax<br>
>> >> >> 0x0000000000013843 <+17>: mov %rax,%rdi<br>
>> >> >> 0x0000000000013846 <+20>: callq 0x1384b <func()+25><br>
>> >> >> 0x000000000001384b <+25>: mov -0x18(%rbp),%rax<br>
>> >> >><br>
>> >> >> However, in the executable linked by gold, the calls are magically<br>
>> >> >> resolved:<br>
>> >> >><br>
>> >> >> 0x000000000018b44e <+0>: push %rbp<br>
>> >> >> 0x000000000018b44f <+1>: mov %rsp,%rbp<br>
>> >> >> 0x000000000018b452 <+4>: push %rbx<br>
>> >> >> 0x000000000018b453 <+5>: sub $0x18,%rsp<br>
>> >> >> 0x000000000018b457 <+9>: mov %rdi,-0x18(%rbp)<br>
>> >> >> 0x000000000018b45b <+13>: mov -0x18(%rbp),%rax<br>
>> >> >> 0x000000000018b45f <+17>: mov %rax,%rdi<br>
>> >> >> 0x000000000018b462 <+20>: callq 0x68568c<br>
>> <std::vector<record,<br>
>> >> >> std::allocator<record> >::vector()><br>
>> >> >> 0x000000000018b467 <+25>: mov -0x18(%rbp),%rax<br>
>> >> >><br>
>> >> >> Even more interesting, several such call instructions with<br>
>> argument 0<br>
>> >> >> are<br>
>> >> >> resolved to different functions. So somewhere there must be<br>
>> >> information<br>
>> >> >> stored about which functions they resolve to.<br>
>> >> >><br>
>> >> >> lld produces this code:<br>
>> >> >><br>
>> >> >> 0x00005555559f304e <+0>: push %rbp<br>
>> >> >> 0x00005555559f304f <+1>: mov %rsp,%rbp<br>
>> >> >> 0x00005555559f3052 <+4>: push %rbx<br>
>> >> >> 0x00005555559f3053 <+5>: sub $0x18,%rsp<br>
>> >> >> 0x00005555559f3057 <+9>: mov %rdi,-0x18(%rbp)<br>
>> >> >> 0x00005555559f305b <+13>: mov -0x18(%rbp),%rax<br>
>> >> >> 0x00005555559f305f <+17>: mov %rax,%rdi<br>
>> >> >> 0x00005555559f3062 <+20>: callq 0x555555554000<br>
>> >> >> 0x00005555559f3067 <+25>: mov -0x18(%rbp),%rax<br>
>> >> >><br>
>> >> >> 0x555555554000 is the start of the mapped region of the<br>
>> executable,<br>
>> >> so<br>
>> >> >> it<br>
>> >> >> seems lld just adds the argument 0 to that without doing any<br>
>> >> relocation<br>
>> >> >> processing.<br>
>> >> >><br>
>> >> >> Is this a known limitation of lld?<br>
>> >> >><br>
>> >> >> Thanks and best regards,<br>
>> >> >> Martin<br>
>> >> >><br>
>> >> > _______________________________________________<br>
>> >> > LLVM Developers mailing list<br>
>> >> > <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
>> >> > <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
>> >><br>
>> >><br>
>> ><br>
>><br>
>><br>
>> --<br>
>> <a href="http://www.martinien.de/" rel="noreferrer" target="_blank">http://www.martinien.de/</a><br>
>><br>
>><br>
>><br>
</div></div></blockquote></div><br></div>
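<div>P.S. For anyone following the thread, here is a minimal sketch of the arithmetic behind the disassembly quoted above. It assumes the usual x86-64 psABI formula for PC-relative relocations (S + A - P), and that R_X86_64_PLT32 is resolved the same way when the callee ends up defined in the output file. The helper names `pcrel_operand` and `call_target` are made up for illustration and are not LLD or gold APIs.</div>

```cpp
#include <cstdint>

// Value a linker would store in the 4-byte operand for a PC-relative
// relocation, using the psABI formula S + A - P:
//   symbol = S, address of the callee
//   addend = A, -0x4 in the objdump output quoted in this thread
//   place  = P, address of the operand (call instruction address + 1)
// Hypothetical helper for illustration only.
constexpr std::int32_t pcrel_operand(std::uint64_t symbol,
                                     std::int64_t addend,
                                     std::uint64_t place) {
    return static_cast<std::int32_t>(symbol + addend - place);
}

// Address that an x86-64 `call rel32` (opcode e8, 5 bytes long) transfers
// control to: the address of the next instruction plus the signed operand.
// Hypothetical helper for illustration only.
constexpr std::uint64_t call_target(std::uint64_t insn_addr,
                                    std::int32_t rel32) {
    return insn_addr + 5 + static_cast<std::int64_t>(rel32);
}
```

<div>With the addresses from the gold-linked executable (call at 0x18b462, callee at 0x68568c), `pcrel_operand(0x68568c, -4, 0x18b463)` gives the operand that makes `call_target` land on the callee. With an operand of zero, as in the unlinked object file, `call_target(0x13846, 0)` is simply the next instruction, 0x1384b, which is why objdump and gdb show the call that way.</div>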