[lldb-dev] Question about building line tables

Greg Clayton via lldb-dev lldb-dev at lists.llvm.org
Tue Mar 8 15:17:24 PST 2016


If the PDB line tables have sizes you could do:

PDBLineEntry curr = pdb->GetLineEntry(n);
PDBLineEntry next = pdb->GetLineEntry(n+1);

line_entries.insert(curr.addr, curr.line, curr.file, false /*is_terminal*/);

if (curr.addr + curr.size != next.addr)
{
    // Insert terminal entry for end of curr
    line_entries.insert(curr.addr + curr.size, curr.line, curr.file, true /*is_terminal*/);
}

Above is pseudo code, but you get the idea...

> On Mar 8, 2016, at 12:56 PM, Zachary Turner <zturner at google.com> wrote:
> 
> Let's suppose I've got this function (ignore the operands to branch instructions, I disassembled a real function and just manually adjusted addresses on the left side only just to create a contrived example).  
> 
> infinite-dwarf.exe`main at infinite.cpp:5
>    4
>    5    int main(int argc, char **argv) {
>    6        int n = 0;
> infinite-dwarf.exe`main:
> infinite-dwarf.exe[0x410000] <+0>:   55                    push   ebp
> infinite-dwarf.exe[0x410001] <+1>:   89 e5                 mov    ebp, esp
> infinite-dwarf.exe[0x410003] <+3>:   83 ec 18              sub    esp, 0x18
> infinite-dwarf.exe[0x410006] <+6>:   8b 45 0c              mov    eax, dword ptr [ebp + 0xc]
> infinite-dwarf.exe[0x410009] <+9>:   8b 4d 08              mov    ecx, dword ptr [ebp + 0x8]
> infinite-dwarf.exe[0x41000c] <+12>:  c7 45 fc 00 00 00 00  mov    dword ptr [ebp - 0x4], 0x0
> infinite-dwarf.exe[0x410013] <+19>:  89 45 f8              mov    dword ptr [ebp - 0x8], eax
> infinite-dwarf.exe[0x410016] <+22>:  89 4d f4              mov    dword ptr [ebp - 0xc], ecx
> infinite-dwarf.exe`main + 25 at infinite.cpp:6
>    5    int main(int argc, char **argv) {
>    6        int n = 0;
>    7        while (n < 10) {
> infinite-dwarf.exe[0x410019] <+25>:  c7 45 f0 00 00 00 00  mov    dword ptr [ebp - 0x10], 0x0
> infinite-dwarf.exe`main + 32 at infinite.cpp:7
>    6        int n = 0;
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
> infinite-dwarf.exe[0x410020] <+32>:  83 7d f0 0a           cmp    dword ptr [ebp - 0x10], 0xa
> infinite-dwarf.exe`main + 36 at infinite.cpp:7
>    6        int n = 0;
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
> infinite-dwarf.exe[0x410024] <+36>:  0f 8d 4a 00 00 00     jge    0x410074 
> infinite-dwarf.exe`main + 42 at infinite.cpp:8
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
>    9            Sleep(1000);
> infinite-dwarf.exe[0x41002a] <+42>:  8b 45 f0              mov    eax, dword ptr [ebp - 0x10]
> infinite-dwarf.exe`main + 45 at infinite.cpp:8
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
>    9            Sleep(1000);
> infinite-dwarf.exe[0x41002d] <+45>:  89 e1                 mov    ecx, esp
> infinite-dwarf.exe[0x41002f] <+47>:  89 01                 mov    dword ptr [ecx], eax
> infinite-dwarf.exe[0x410031] <+49>:  b9 80 c1 40 00        mov    ecx, 0x40c180
> infinite-dwarf.exe[0x410036] <+54>:  e8 55 0a 00 00        call   0x410a90
> infinite-dwarf.exe[0x41003b] <+59>:  83 ec 04              sub    esp, 0x4
> infinite-dwarf.exe`main + 62 at infinite.cpp:8
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
>    9            Sleep(1000);
> infinite-dwarf.exe[0x41003e] <+62>:  89 e1                 mov    ecx, esp
> infinite-dwarf.exe[0x410040] <+64>:  c7 01 50 0d 41 00     mov    dword ptr [ecx], 0x410d50
> infinite-dwarf.exe[0x410046] <+70>:  89 c1                 mov    ecx, eax
> infinite-dwarf.exe[0x410048] <+72>:  e8 e3 0c 00 00        call   0x410d30
> infinite-dwarf.exe[0x41004d] <+77>:  83 ec 04              sub    esp, 0x4
> 
> 
> ; function becomes discontiguous here
> 
> 
> 
> infinite-dwarf.exe`main + 80 at infinite.cpp:9
>    8            std::cout << n << std::endl;
>    9            Sleep(1000);
>    10           n++;
> infinite-dwarf.exe[0x510050] <+80>:  89 e1                 mov    ecx, esp
> infinite-dwarf.exe[0x510052] <+82>:  c7 01 e8 03 00 00     mov    dword ptr [ecx], 0x3e8
> infinite-dwarf.exe[0x510058] <+88>:  8b 0d 04 93 43 00     mov    ecx, dword ptr [0x439304]
> infinite-dwarf.exe[0x51005e] <+94>:  89 45 ec              mov    dword ptr [ebp - 0x14], eax
> infinite-dwarf.exe[0x510061] <+97>:  ff d1                 call   ecx
> infinite-dwarf.exe[0x510063] <+99>:  83 ec 04              sub    esp, 0x4
> infinite-dwarf.exe`main + 102 at infinite.cpp:10
>    9            Sleep(1000);
>    10           n++;
>    11       }
> infinite-dwarf.exe[0x510066] <+102>: 8b 45 f0              mov    eax, dword ptr [ebp - 0x10]
> infinite-dwarf.exe[0x510069] <+105>: 83 c0 01              add    eax, 0x1
> infinite-dwarf.exe[0x51006c] <+108>: 89 45 f0              mov    dword ptr [ebp - 0x10], eax
> infinite-dwarf.exe`main + 111 at infinite.cpp:7
>    6        int n = 0;
>    7        while (n < 10) {
>    8            std::cout << n << std::endl;
> infinite-dwarf.exe[0x51006f] <+111>: e9 ac ff ff ff        jmp    0x410020
> infinite-dwarf.exe[0x510074] <+116>: 31 c0                 xor    eax, eax
> infinite-dwarf.exe`main + 118 at infinite.cpp:13
>    12
>    13       return 0;
>    14   }
> infinite-dwarf.exe[0x510076] <+118>: 83 c4 18              add    esp, 0x18
> infinite-dwarf.exe[0x510079] <+121>: 5d                    pop    ebp
> infinite-dwarf.exe[0x51007a] <+122>: c3                    ret
> 
> 
> About halfway down, the addresses suddenly increase by 0x100000.  So the compiler decided that for some strange reason while unrolling the loop it was just going to start placing code somewhere else entirely.  Am I correct in saying that 0x410050 should be a terminal entry in this example?
> 
> On Mon, Mar 7, 2016 at 3:31 PM Greg Clayton <gclayton at apple.com> wrote:
> 
> > On Mar 7, 2016, at 3:21 PM, Zachary Turner <zturner at google.com> wrote:
> >
> > Does DWARF not store this information?  Because it seems like it could be efficiently stored in an interval tree, the question is just whether it is efficient to convert what DWARF stores into that format.
> 
> No it stores it just like we do, but in a compressed format that is useless for searching.
> 
> > PDB returns line entries in the format I described, with a start address and a byte length, so to determine whether something is a terminal entry I have to add them to some kind of data structure that collapses ranges and then manually scan through for breaks in the continuity of the range.
> >
> > Is there some way we can make this more generic so that it's efficient for both DWARF and PDB?
> 
> We need an efficient memory format that LLDB can use to search things, which is how things currently are done: all plug-ins are expected to parse debug info and make a series of lldb_private::LineTable::Entry structs.
> 
> We could defer this functionality into the plug-ins directly where you always must say "hey SymbolFile, here is a section offset address, please get me the lldb_private::LineEntry:
> 
> bool
> SymbolFile::GetLineEntryForAddress (const lldb_private::Address &addr, lldb_private::LineEntry &line_entry);
> 
> The thing I don't like about this approach where we don't supply the format we want the line tables to be in is this does make it quite painful to iterate over all line table entries for a compile unit. You would need to get the address range for all functions in a compile unit, then make a loop that would iterate through all addresses and try to lookup each address to find the lldb_private::LineEntry for that address. Right now we just get the LineTable from the compile unit and say "bool LineTable::GetLineEntryAtIndex(uint32_t idx, LineEntry &line_entry);".
> 
> 



More information about the lldb-dev mailing list