[llvm-dev] lldb subprogram ranges support

Sriraman Tallam via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 30 18:52:36 PST 2020


On Tue, Dec 29, 2020 at 4:44 PM Sriraman Tallam <tmsriram at google.com> wrote:

>
>
> On Tue, Dec 29, 2020 at 2:06 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Wed, Dec 23, 2020 at 7:02 PM Sriraman Tallam <tmsriram at google.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Dec 23, 2020 at 4:46 PM David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> So I've been doing some more testing/implementation work on various
>>>> address pool reduction strategies previously discussed back in January (
>>>> http://lists.llvm.org/pipermail/llvm-dev/2020-January/thread.html#138029
>>>> ).
>>>>
>>>> I've committed a -mllvm flag to allow experimenting with the first of
>>>> these strategies: always using ranges in DWARFv5 (the flag has no effect
>>>> pre-v5). Since ranges can use address pool entries, this allows significant
>>>> address reuse. For an optimized, split-DWARF build of clang, object file
>>>> size drops by 13%; specifically, aggregate .rela.debug_addr size drops from
>>>> 78MB to 16MB. The lowest this could go is approximately 8MB, which is the
>>>> size of .rela.debug_line.
>>>>
>>>> It causes one lldb test to fail
>>>> (lldb/test/SymbolFile/DWARF/Output/debug-types-expressions.test), which
>>>> reveals that lldb has some trouble with ranges on DW_TAG_subprograms.
>>>>
>>>> Anyone happen to have ideas about what the problem might be? Anyone
>>>> interested in fixing this? (Jordan, maybe?)
>>>>
>>>> Sri: Sounded like you folks had done some testing of Propeller with
>>>> lldb - and I'd expect it to trip over this same problem, since it'll cause
>>>> ranges to be used for DW_TAG_subprograms to an even greater degree. Have
>>>> you come across anything like this?
>>>>
>>>
>>> Not sure, David. I think you tested basic block sections for v5 a while
>>> back.
>>>
>>
>> I'd checked that the DWARF is well-formed and, for the most part, as
>> efficient as it can be given the nature of Basic Block Sections, but I
>> haven't done any debugger testing with it.
>>
>> You mentioned gdb might already be pretty well set up for functions that
>> are split into multiple chunks, because GCC does this under some
>> circumstances?
>>
>> But it looks like lldb might not be so well situated.
>>
>>
>>>   How do I test if this breaks with bbsections?
>>>
>>
>> Test printing out the value of a variable in a function with more than
>> one section, eg:
>>
>> $ ~/dev/llvm/build/default/bin/lldb ./b
>> (lldb) target create "./b"
>> Current executable set to '/usr/local/google/home/blaikie/dev/scratch/b' (x86_64).
>> (lldb) b main
>> Breakpoint 1: where = b`main + 15, address = 0x000000000040112f
>> (lldb) start
>> error: 'start' is not a valid command.
>> (lldb) r
>> Process 1827628 launched: '/usr/local/google/home/blaikie/dev/scratch/b' (x86_64)
>> Process 1827628 stopped
>> * thread #1, name = 'b', stop reason = breakpoint 1.1
>>     frame #0: 0x000000000040112f b`main at test.cpp:5:7
>>    2      int j = 12;
>>    3    }
>>    4    int main() {
>> -> 5      int i = 7;
>>    6      if (i)
>>    7        f1();
>>    8    }
>> (lldb) p i
>> error: <user expression 0>:1:1: use of undeclared identifier 'i'
>> i
>> ^
>> (lldb) ^D
>>
>> $ clang++-tot test.cpp -g -o b
>> $ ~/dev/llvm/build/default/bin/lldb ./b
>> (lldb) target create "./b"
>> Current executable set to '/usr/local/google/home/blaikie/dev/scratch/b' (x86_64).
>> (lldb) b main
>> Breakpoint 1: where = b`main + 15 at test.cpp:5:7, address = 0x000000000040112f
>> (lldb) r
>> Process 1828108 launched: '/usr/local/google/home/blaikie/dev/scratch/b' (x86_64)
>> Process 1828108 stopped
>> * thread #1, name = 'b', stop reason = breakpoint 1.1
>>     frame #0: 0x000000000040112f b`main at test.cpp:5:7
>>    2      int j = 12;
>>    3    }
>>    4    int main() {
>> -> 5      int i = 7;
>>    6      if (i)
>>    7        f1();
>>    8    }
>> (lldb) p i
>> (int) $0 = 0
>> (lldb) ^D
>>
>> $ cat test.cpp
>> void f1() {
>>   int j = 12;
>> }
>> int main() {
>>   int i = 7;
>>   if (i)
>>     f1();
>> }
>>
>> So, yeah, seems like DW_AT_ranges on a DW_TAG_subprogram is a bit buggy
>> with lldb, & that'll need to be fixed for Propeller to be usable with lldb.
>> For my "ranges everywhere" feature it would be nice to fix, but our
>> (Google's) use case uses -ffunction-sections, where subprogram ranges never
>> actually get used: every function starts at a new relocated address, so
>> subprogram address ranges can't share address pool entries anyway and never
>> get DW_AT_ranges. So I could tweak ranges-everywhere to not apply to
>> subprogram ranges for now, to keep it more usable/unsurprising.
>>
>>
>>> I can give you a simple program with bb sections that would create a lot
>>> of ranges. Any pointers? My understanding of DWARF v5 is near zero so
>>> please bear with me. Thanks.
>>>
>>
>> This applies to DWARFv4 as well, as shown above - sorry for the confusion
>> there. I happened to be experimenting with DWARFv5 range features, but it
>> shows lldb has some problems with ranges on subprograms in general. It
>> breaks even if the range list contains only a single range (that is, one
>> range expressed with a range list rather than with low/high pc).
>>
>

Just one more data point regarding gcc: it uses DW_AT_ranges even on
DW_TAG_subprogram, since it independently implements the function splitting
feature we did in LLVM.  Here is how to generate this with gcc:

#include <stdio.h>
const int LOOP_BOUND = 200000000;
__attribute__((noinline))
static int work(bool b) {
  if (b) {
    for (int i=0; i<LOOP_BOUND; ++i) {
      printf("Hot\n");
    }
  } else {
    for (int i=0; i<LOOP_BOUND; ++i) {
      printf("Cold\n");
    }
  }
  return 0;
}
int main(int argc, char* argv[]) {
    int result = work((argc > 3));
    return result;
}

$ g++ -O2 gccsplit.cc -fprofile-generate
$ ./a.out > /dev/null
$ ls *.gcda
$ g++ -O2 gccsplit.cc -freorder-blocks-and-partition -g -fprofile-use -c
$ llvm-dwarfdump gccsplit.o

0x000000ff:   DW_TAG_subprogram
                DW_AT_name      ("work")
                DW_AT_decl_file ("gccsplit.cc")
                DW_AT_decl_line (5)
                DW_AT_decl_column       (0x0c)
                DW_AT_type      (0x00000053 "int")
                DW_AT_ranges    (0x00000000
                   [0x0000000000000000, 0x000000000000006b)
                   [0x0000000000000000, 0x000000000000001e))
                DW_AT_frame_base        (DW_OP_call_frame_cfa)
                DW_AT_GNU_all_call_sites        (true)
                DW_AT_sibling   (0x00000234)


This is also exactly how LLVM splits functions and generates debug info,
which probably explains why gdb is able to work with this example.

Thanks
Sri



>
> Thanks David. I tried this slightly more complicated program with
> bbsections on LLDB and GDB.  TL;DR: this seems like an LLDB bug, and it
> works fine with GDB.
>
> FWIW, I used a slightly modified example and split the bb sections to be
> discontiguous.
>
> inline __attribute__((noinline)) void f1(int j) {}
> int main() {
>   int i = 7;
>   if (i) {
>      int j = 8;
>      f1(j);
>   }
> }
>
> $ cat syms.txt
> main.__part.1
> _start
> main
> _Z2f1i
> main.__part.2
>
> $ /g/tmsriram/Projects_2019/llvm_trunk_upstream/release_build/bin/clang++
> -fbasic-block-sections=all -g test.cc -O0
> -Wl,--symbol-ordering-file=syms.txt -fuse-ld=lld
> $ nm -n a.out
> 0000000000201640 t main.__part.1
> 0000000000201660 T _start
> 0000000000201690 t _dl_relocate_static_pie
> 00000000002016a0 T main
> 00000000002016d0 W _Z2f1i
> 00000000002016d9 t main.__part.2
>
> $ gdb ./a.out
>
> (gdb) b main
> Breakpoint 1 at 0x201647: main. (2 locations)
> (gdb) r
> Starting program:
> /g/tmsriram/Projects_2019/github_repo/Examples/bbsections_bug/a.out
>
> Breakpoint 1, main () at test.cc:2
> 2 int main() {
> ...
> (gdb) n
> 4  if (i) {
> (gdb) p i
> $3 = 7
> (gdb) n
> 5     int j = 8;
> (gdb) p j
> $4 = 32767
> (gdb) n
> Breakpoint 1, main () at test.cc:6
> 6     f1(j);
> (gdb) p j
> $5 = 8
>
> lldb fails the way you described.
>
> Thanks
> Sri
>
>
>
>
>
>>
>>
>>>
>>>
>>>>
>>>> Here's a small example:
>>>>
>>>> (the test has an inline function to force the output file to have more
>>>> than one section; otherwise it'll all be in the text section, the CU's
>>>> low_pc will be relocatable and all the other addresses will be relative to
>>>> that, so there won't be any benefit to using ranges. 'main' is the
>>>> second function, so it starts at an offset relative to the address in the
>>>> address pool (which will be f2's starting address) and benefits from using
>>>> ranges to share that address)
>>>>
>>>> $ cat test.cpp
>>>> inline __attribute__((noinline)) void f1() { }
>>>> void f2() {
>>>> }
>>>> int main() {
>>>>   int i = 7;
>>>>   f1();
>>>> }
>>>> $ ~/dev/llvm/build/default/bin/lldb ./a
>>>> (lldb) target create "./a"
>>>> Current executable set to
>>>> '/usr/local/google/home/blaikie/dev/scratch/always_ranges/a' (x86_64).
>>>> (lldb) b main
>>>> Breakpoint 1: where = a`main + 8 at test.cpp:5:7, address = 0x0000000000401128
>>>> (lldb) r
>>>> Process 2271305 launched:
>>>> '/usr/local/google/home/blaikie/dev/scratch/always_ranges/a' (x86_64)
>>>> Process 2271305 stopped
>>>> * thread #1, name = 'a', stop reason = breakpoint 1.1
>>>>     frame #0: 0x0000000000401128 a`main at test.cpp:5:7
>>>>    2    void f2() {
>>>>    3    }
>>>>    4    int main() {
>>>> -> 5      int i = 7;
>>>>    6      f1();
>>>>    7    }
>>>> (lldb) p i
>>>> (int) $0 = 0
>>>>
>>>> $ ~/dev/llvm/build/default/bin/lldb ./b
>>>> (lldb) target create "./b"
>>>> Current executable set to
>>>> '/usr/local/google/home/blaikie/dev/scratch/always_ranges/b' (x86_64).
>>>> (lldb) b main
>>>> Breakpoint 1: where = b`main + 8, address = 0x0000000000401128
>>>> (lldb) r
>>>> Process 2271759 launched:
>>>> '/usr/local/google/home/blaikie/dev/scratch/always_ranges/b' (x86_64)
>>>> Process 2271759 stopped
>>>> * thread #1, name = 'b', stop reason = breakpoint 1.1
>>>>     frame #0: 0x0000000000401128 b`main at test.cpp:5:7
>>>>    2    void f2() {
>>>>    3    }
>>>>    4    int main() {
>>>> -> 5      int i = 7;
>>>>    6      f1();
>>>>    7    }
>>>> (lldb) p i
>>>> error: <user expression 0>:1:1: use of undeclared identifier 'i'
>>>> i
>>>> ^
>>>>
>>>> $ diff <(llvm-dwarfdump-tot a | sed -e "s/0x[0-9a-f]\{8\}//g") \
>>>>        <(llvm-dwarfdump-tot b | sed -e "s/0x[0-9a-f]\{8\}//g")
>>>> 1c1
>>>> < a:    file format elf64-x86-64
>>>> ---
>>>> > b:    file format elf64-x86-64
>>>> 14c14
>>>> <               DW_AT_ranges    (indexed (0x0) rangelist =
>>>> ---
>>>> >               DW_AT_ranges    (indexed (0x1) rangelist =
>>>> 31,32c31,32
>>>> <                 DW_AT_low_pc  (00401120)
>>>> <                 DW_AT_high_pc (0040113c)
>>>> ---
>>>> >                 DW_AT_ranges  (indexed (0x0) rangelist =
>>>> >                    [00401120, 0040113c))
>>>>
>>>