[lldb-dev] Question about building line tables

Zachary Turner via lldb-dev lldb-dev at lists.llvm.org
Tue Mar 8 12:56:06 PST 2016


Let's suppose I've got this function.  (Ignore the operands to branch
instructions: I disassembled a real function and manually adjusted only the
addresses on the left side to create a contrived example.)

infinite-dwarf.exe`main at infinite.cpp:5
   4
   5    int main(int argc, char **argv) {
   6        int n = 0;
infinite-dwarf.exe`main:
infinite-dwarf.exe[0x410000] <+0>:   55                    push   ebp
infinite-dwarf.exe[0x410001] <+1>:   89 e5                 mov    ebp, esp
infinite-dwarf.exe[0x410003] <+3>:   83 ec 18              sub    esp, 0x18
infinite-dwarf.exe[0x410006] <+6>:   8b 45 0c              mov    eax, dword ptr [ebp + 0xc]
infinite-dwarf.exe[0x410009] <+9>:   8b 4d 08              mov    ecx, dword ptr [ebp + 0x8]
infinite-dwarf.exe[0x41000c] <+12>:  c7 45 fc 00 00 00 00  mov    dword ptr [ebp - 0x4], 0x0
infinite-dwarf.exe[0x410013] <+19>:  89 45 f8              mov    dword ptr [ebp - 0x8], eax
infinite-dwarf.exe[0x410016] <+22>:  89 4d f4              mov    dword ptr [ebp - 0xc], ecx
infinite-dwarf.exe`main + 25 at infinite.cpp:6
   5    int main(int argc, char **argv) {
   6        int n = 0;
   7        while (n < 10) {
infinite-dwarf.exe[0x410019] <+25>:  c7 45 f0 00 00 00 00  mov    dword ptr [ebp - 0x10], 0x0
infinite-dwarf.exe`main + 32 at infinite.cpp:7
   6        int n = 0;
   7        while (n < 10) {
   8            std::cout << n << std::endl;
infinite-dwarf.exe[0x410020] <+32>:  83 7d f0 0a           cmp    dword ptr [ebp - 0x10], 0xa
infinite-dwarf.exe`main + 36 at infinite.cpp:7
   6        int n = 0;
   7        while (n < 10) {
   8            std::cout << n << std::endl;
infinite-dwarf.exe[0x410024] <+36>:  0f 8d 4a 00 00 00     jge    0x410074
infinite-dwarf.exe`main + 42 at infinite.cpp:8
   7        while (n < 10) {
   8            std::cout << n << std::endl;
   9            Sleep(1000);
infinite-dwarf.exe[0x41002a] <+42>:  8b 45 f0              mov    eax, dword ptr [ebp - 0x10]
infinite-dwarf.exe`main + 45 at infinite.cpp:8
   7        while (n < 10) {
   8            std::cout << n << std::endl;
   9            Sleep(1000);
infinite-dwarf.exe[0x41002d] <+45>:  89 e1                 mov    ecx, esp
infinite-dwarf.exe[0x41002f] <+47>:  89 01                 mov    dword ptr [ecx], eax
infinite-dwarf.exe[0x410031] <+49>:  b9 80 c1 40 00        mov    ecx, 0x40c180
infinite-dwarf.exe[0x410036] <+54>:  e8 55 0a 00 00        call   0x410a90
infinite-dwarf.exe[0x41003b] <+59>:  83 ec 04              sub    esp, 0x4
infinite-dwarf.exe`main + 62 at infinite.cpp:8
   7        while (n < 10) {
   8            std::cout << n << std::endl;
   9            Sleep(1000);
infinite-dwarf.exe[0x41003e] <+62>:  89 e1                 mov    ecx, esp
infinite-dwarf.exe[0x410040] <+64>:  c7 01 50 0d 41 00     mov    dword ptr [ecx], 0x410d50
infinite-dwarf.exe[0x410046] <+70>:  89 c1                 mov    ecx, eax
infinite-dwarf.exe[0x410048] <+72>:  e8 e3 0c 00 00        call   0x410d30
*infinite-dwarf.exe[0x41004d] <+77>:  83 ec 04              sub    esp, 0x4*


; function becomes discontiguous here



infinite-dwarf.exe`main + 80 at infinite.cpp:9
   8            std::cout << n << std::endl;
   9            Sleep(1000);
   10           n++;
infinite-dwarf.exe[0x510050] <+80>:  89 e1                 mov    ecx, esp
infinite-dwarf.exe[0x510052] <+82>:  c7 01 e8 03 00 00     mov    dword ptr [ecx], 0x3e8
infinite-dwarf.exe[0x510058] <+88>:  8b 0d 04 93 43 00     mov    ecx, dword ptr [0x439304]
infinite-dwarf.exe[0x51005e] <+94>:  89 45 ec              mov    dword ptr [ebp - 0x14], eax
infinite-dwarf.exe[0x510061] <+97>:  ff d1                 call   ecx
infinite-dwarf.exe[0x510063] <+99>:  83 ec 04              sub    esp, 0x4
infinite-dwarf.exe`main + 102 at infinite.cpp:10
   9            Sleep(1000);
   10           n++;
   11       }
infinite-dwarf.exe[0x510066] <+102>: 8b 45 f0              mov    eax, dword ptr [ebp - 0x10]
infinite-dwarf.exe[0x510069] <+105>: 83 c0 01              add    eax, 0x1
infinite-dwarf.exe[0x51006c] <+108>: 89 45 f0              mov    dword ptr [ebp - 0x10], eax
infinite-dwarf.exe`main + 111 at infinite.cpp:7
   6        int n = 0;
   7        while (n < 10) {
   8            std::cout << n << std::endl;
infinite-dwarf.exe[0x51006f] <+111>: e9 ac ff ff ff        jmp    0x410020
infinite-dwarf.exe[0x510074] <+116>: 31 c0                 xor    eax, eax
infinite-dwarf.exe`main + 118 at infinite.cpp:13
   12
   13       return 0;
   14   }
infinite-dwarf.exe[0x510076] <+118>: 83 c4 18              add    esp, 0x18
infinite-dwarf.exe[0x510079] <+121>: 5d                    pop    ebp
infinite-dwarf.exe[0x51007a] <+122>: c3                    ret


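About halfway down, the addresses suddenly increase by 0x100000 (the next
instruction after 0x41004d is at 0x510050).  So the compiler decided, for
some strange reason, that while unrolling the loop it was going to start
placing code somewhere else entirely.  The last instruction before the jump
is 3 bytes at 0x41004d, so the first contiguous range ends at 0x410050.  Am
I correct in saying that 0x410050 should be a terminal entry in this example?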
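
To make the question concrete, here is a minimal sketch of how I imagine the
breaks would be found when the entries arrive as (start address, byte length)
pairs.  The names PdbRange, TableRow and BuildRows are hypothetical and are not
LLDB API; the sketch assumes the input is already sorted by start address and
that ranges do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical input: one line entry given as a start address and a length.
struct PdbRange {
  std::uint64_t start;
  std::uint32_t length;
  std::uint32_t line;
};

// Hypothetical output row: either "this address begins this line", or a
// terminal row marking the first address past a contiguous run of code.
struct TableRow {
  std::uint64_t address;
  std::uint32_t line;
  bool is_terminal;
};

// Assumes `ranges` is sorted by start address and non-overlapping.
std::vector<TableRow> BuildRows(const std::vector<PdbRange> &ranges) {
  std::vector<TableRow> rows;
  for (std::size_t i = 0; i < ranges.size(); ++i) {
    const PdbRange &r = ranges[i];
    const std::uint64_t end = r.start + r.length;
    const bool last = (i + 1 == ranges.size());
    // Precondition check: the next range must not start inside this one.
    assert(last || ranges[i + 1].start >= end);
    rows.push_back({r.start, r.line, false});
    // A gap before the next range, or the end of the input, closes the run.
    if (last || ranges[i + 1].start != end)
      rows.push_back({end, 0, true});
  }
  return rows;
}
```

Fed the last line-8 range from the listing (0x41003e, 0x12 bytes) followed by
the line-9 range starting at 0x510050, this produces a terminal row at
0x410050, plus a second terminal row after the final range.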

On Mon, Mar 7, 2016 at 3:31 PM Greg Clayton <gclayton at apple.com> wrote:

>
> > On Mar 7, 2016, at 3:21 PM, Zachary Turner <zturner at google.com> wrote:
> >
> > Does DWARF not store this information?  Because it seems like it could
> > be efficiently stored in an interval tree, the question is just whether it
> > is efficient to convert what DWARF stores into that format.
>
> No it stores it just like we do, but in a compressed format that is
> useless for searching.
>
> > PDB returns line entries in the format I described, with a start address
> > and a byte length, so to determine whether something is a terminal entry I
> > have to add them to some kind of data structure that collapses ranges and
> > then manually scan through for breaks in the continuity of the range.
> >
> > Is there some way we can make this more generic so that it's efficient
> > for both DWARF and PDB?
>
> We need an efficient memory format that LLDB can use to search things,
> which is how things currently are done: all plug-ins are expected to parse
> debug info and make a series of lldb_private::LineTable::Entry structs.
>
> We could defer this functionality into the plug-ins directly where you
> always must say "hey SymbolFile, here is a section offset address, please
> get me the lldb_private::LineEntry":
>
> bool
> SymbolFile::GetLineEntryForAddress (const lldb_private::Address &addr,
> lldb_private::LineEntry &line_entry);
>
> The thing I don't like about this approach where we don't supply the
> format we want the line tables to be in is this does make it quite painful
> to iterate over all line table entries for a compile unit. You would need
> to get the address range for all functions in a compile unit, then make a
> loop that would iterate through all addresses and try to lookup each
> address to find the lldb_private::LineEntry for that address. Right now we
> just get the LineTable from the compile unit and say "bool
> LineTable::GetLineEntryAtIndex(uint32_t idx, LineEntry &line_entry);".
>
>
>
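
For comparison, a minimal sketch of the two access patterns discussed above,
assuming a table of rows sorted by strictly increasing address, with a terminal
row closing each contiguous run.  LineRow and FindRowIndex are hypothetical
stand-ins for illustration and are not the actual lldb_private::LineTable
interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row type: an address mapped to a line, or a terminal marker.
struct LineRow {
  std::uint64_t address;
  std::uint32_t line;
  bool is_terminal;
};

// Returns the index of the row covering `addr`, or -1 if `addr` lies before
// the first row or in a gap after a terminal row.
// Assumes `rows` is sorted by strictly increasing address.
std::ptrdiff_t FindRowIndex(const std::vector<LineRow> &rows,
                            std::uint64_t addr) {
  // First row whose address is greater than addr.
  auto it = std::upper_bound(
      rows.begin(), rows.end(), addr,
      [](std::uint64_t a, const LineRow &r) { return a < r.address; });
  if (it == rows.begin())
    return -1; // addr is below every row
  --it;        // last row with address <= addr
  if (it->is_terminal)
    return -1; // addr falls after the end of a contiguous run
  return it - rows.begin();
}
```

With this layout, lookup by address is a binary search, and iterating over
every entry in a compile unit is a plain walk over the vector by index, with no
need to probe address by address.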