[PATCH] D59553: [LLD][ELF][DebugInfo] llvm-symbolizer shows incorrect source line info if --gc-sections used

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 17 09:25:12 PDT 2019


avl added a comment.

@MaskRay

>> actually other DWARF consumers are not happy.



> Example please.

Please check the following behavior of lldb, llvm-symbolizer, GNU addr2line, and GNU objdump:

$ cat main.cpp
void foo_used();

int main(void) {
  foo_used();
  return 0;
}
$ cat funcs.cpp
void foo_not_used () {
  __asm__(".rept 2105344; nop; .endr");
}

void foo_used () {
  __asm__(".rept 10000; nop; .endr");
}

$ clang++ -gdwarf-4 -O funcs.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O main.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq  %rax
res.out[0x203811] <+1>: callq  0x2010f0                  ; foo_not_used + 240
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl   %eax, %eax
res.out[0x203818] <+8>: popq   %rcx
res.out[0x203819] <+9>: retq

(lldb) b foo_used
Breakpoint 1: where = res.out`foo_not_used() + 240, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:2:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ objdump -d -S res.out
0000000000201000 <_start>:
void foo_not_used () {

  __asm__(".rept 2105344; nop; .endr");

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  201000:       31 ed                   xor    %ebp,%ebp
  201002:       49 89 d1                mov    %rdx,%r9
  201005:       5e                      pop    %rsi

And this patch makes them happy:

$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq  %rax
res.out[0x203811] <+1>: callq  0x2010f0                  ; foo_used at funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl   %eax, %eax
res.out[0x203818] <+8>: popq   %rcx
res.out[0x203819] <+9>: retq

(lldb) b foo_used
Breakpoint 1: where = res.out`foo_used() at funcs.cpp:6:5, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ objdump -d -S res.out
0000000000201000 <_start>:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  201000:       31 ed                   xor    %ebp,%ebp
  201002:       49 89 d1                mov    %rdx,%r9
  201005:       5e                      pop    %rsi
  201006:       48 89 e2                mov    %rsp,%rdx
  201009:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp

>> I checked behavior of addr2line and gdb -they are working correctly with this patch.



> Can you check if gdb works without this patch?

gdb works correctly without this patch as long as 0 is not a valid virtual address for code,
because it treats 0 as a special value indicating that an address range is invalid.
It would not work correctly on a platform where 0 is a valid VMA for code.
In general, it is wrong to use 0 as a special value.

Using 0 as the start address of a code section is not forbidden.
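
To illustrate the point about 0 as a special value, here is a minimal sketch of a consumer-side lookup. It is not gdb's actual code; FuncRange, lookup and the zeroMeansInvalid flag are hypothetical names, and the addresses are made up. It only shows that a "low_pc == 0 means discarded" heuristic hides a stale range on an ordinary target but drops a legitimate function on a target where code starts at 0.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one function's address range
// as read from debug info.
struct FuncRange {
  std::string name;
  uint64_t lowPc;   // start address (DW_AT_low_pc)
  uint64_t highPc;  // one past the last byte of the function
};

// Return the first range covering addr, or nullptr if there is none.
// With zeroMeansInvalid set, every range starting at 0 is assumed to belong
// to a discarded function and is skipped.
const FuncRange *lookup(const std::vector<FuncRange> &ranges, uint64_t addr,
                        bool zeroMeansInvalid) {
  for (const FuncRange &r : ranges) {
    assert(r.lowPc <= r.highPc && "malformed range");
    if (zeroMeansInvalid && r.lowPc == 0)
      continue;
    if (addr >= r.lowPc && addr < r.highPc)
      return &r;
  }
  return nullptr;
}
```

With the flag set, a stale range starting at 0 no longer shadows a live function, but a real function placed at address 0 (for example a reset handler on an embedded target) also becomes invisible.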

> I want to see a concrete example where 0 is used as a valid DW_AT_low_pc.



1. According to @jhenderson, Sony has, or will have in the future, such a use case.
2. According to http://lists.llvm.org/pipermail/lldb-dev/2017-March/012091.html, the Qualcomm Kalimba DSP and the XAP RISC CPU have this use case. They even use an approach similar to this patch to solve the problem.
3. Such a use case is common in embedded systems.

I personally do not have access to such systems to demonstrate a use case for you.

>> lld generated executable would have a problem. ld and gold generated would not.



> You may check how the R and RW PT_LOAD segments are laid out in ld.bfd and gold linked modules.

For the above example:

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=ld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000004004a0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=gold -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x0000000000400510
/home/avl/bugs/gc_debuginfo/funcs.cpp:6

The lld-generated binary has the problem; the ld- and gold-generated binaries do not.
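
One way to read the difference is as plain range arithmetic. The sketch below assumes that the discarded foo_not_used keeps a debug-info range starting at 0 (that start address is an assumption, it is not shown in the output above) and that the range is at least 2105344 bytes long, one byte per nop. inRange and kDeadSize are names made up for this illustration.

```cpp
#include <cstdint>

// Lower bound on the size of foo_not_used: 2105344 one-byte nops (0x202000).
constexpr uint64_t kDeadSize = 2105344;

// True if addr lies in the half-open range [low, low + size).
constexpr bool inRange(uint64_t addr, uint64_t low, uint64_t size) {
  return addr >= low && addr < low + size;
}

// lld layout: foo_used at 0x2010f0 falls inside the assumed stale range.
static_assert(inRange(0x2010f0, 0, kDeadSize), "lld address is covered");
// ld and gold layouts: foo_used sits above the assumed stale range.
static_assert(!inRange(0x4004a0, 0, kDeadSize), "ld address is not covered");
static_assert(!inRange(0x400510, 0, kDeadSize), "gold address is not covered");
```

Under that assumption, only the lld layout places live code below 0x202000, which matches the addr2line results: funcs.cpp:2 for lld and funcs.cpp:6 for ld and gold.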


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59553/new/

https://reviews.llvm.org/D59553




