[lldb-dev] Making a new symbol provider

Tue Mar 1 10:33:19 PST 2016

> On Feb 29, 2016, at 5:51 PM, Zachary Turner <zturner at google.com> wrote:
> 
> 
> 
> On Mon, Feb 29, 2016 at 5:49 PM Zachary Turner <zturner at google.com> wrote:
> Those are addresses.  Here's the situation I was encountering this on:
> 
> // foo.h
> #include "bar.h"
> inline int f(int n)
> {
>     return g(n) + 1;
> }
> 
> // bar.h
> inline int g(int n)
> {
>     return n+1;
> }
> 
> // foo.cpp
> #include "foo.h"
> int main(int argc, char** argv)
> {
>     return f(argc);
> }
> 
> PDB gives me back line numbers and address range grouped by file.  So I get all of foo.h's lines, all of bar.h's lines, and all of foo.cpp's lines.  In sorted form, the lines for g will appear inside the sequence of lines for f.  So that's how the situation was arising.
> 
> Just to clarify here.  When I was encountering this problem, I would create one LineSequence for foo.h's lines, one LineSequence for bar.h's lines, and one for foo.cpp's.  And each one is monotonically increasing, but the ranges can overlap as per the previous explanation, which was causing InsertLineSequence to fail. 

I understand now. Yes, you will need to parse all line entries into one big buffer, sort them by address, and then figure out which sequences to submit.
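As a rough sketch of the "one big buffer" step (RawLine and MergeAndSort are made-up names for illustration, not LLDB or PDB API), assuming each per-file group is already sorted by address as described above, it could look something like this:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flat record for one PDB line entry; not an LLDB type.
struct RawLine {
    uint64_t address;
    std::string file;
    uint32_t line;
};

static bool ByAddress(const RawLine& a, const RawLine& b) {
    return a.address < b.address;
}

// Concatenate the per-file groups into one buffer and sort it by address.
std::vector<RawLine> MergeAndSort(
    const std::vector<std::vector<RawLine>>& per_file) {
    std::vector<RawLine> all;
    for (const auto& group : per_file) {
        // Assumption: each group is monotonically increasing on its own.
        assert(std::is_sorted(group.begin(), group.end(), ByAddress));
        all.insert(all.end(), group.begin(), group.end());
    }
    // stable_sort keeps the original group order for equal addresses.
    std::stable_sort(all.begin(), all.end(), ByAddress);
    return all;
}
```

After this, lines from bar.h that fall inside the address range of foo.h's lines end up interleaved in address order, which is the form you want before deciding where sequences begin and end.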

Is there a termination entry for the last line entry in a function? Let's say there were 4096-byte gaps between "f", "g" and "main". Are there termination entries for the last '}' in each function, so that when you put all of the line entries into one large collection and sort them by address, you know there is a gap between the line entries? This is very important to get right. If there aren't termination entries, you will need to add them manually: look up each line entry address, find the address range of the function that contains it (which you can cache while making the line sequences from the sorted PDB line entries), and add termination entries at the ends of functions. So let's say f starts at 0x1000 and the "inline int f" is on line 3, g starts at 0x2000 and main starts at 0x3000. You don't want your line table looking like a single sequence:

0x1000: foo.cpp line 4  // {
0x1010: foo.cpp line 5  //     return g(n) + 1;
0x1020: foo.cpp line 6  // }
0x2000: foo.cpp line 10 // {
0x2010: foo.cpp line 11 //     return n+1;
0x2020: foo.cpp line 12 // }
0x3000: foo.cpp line 17 // {
0x3010: foo.cpp line 18 //     return f(argc);
0x3020: foo.cpp line 19 // }

If you don't have termination entries, we will think foo.cpp:6 goes from [0x1020-0x2000), which is probably not what we want.
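To make that concrete, here is a small standalone sketch of an address lookup where each row's range runs up to the address of the next row. Row and LineForAddress are hypothetical names, and this is only an illustration of the idea, not how LLDB's line table is actually implemented:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical row: a start address, a line number, and a terminator flag.
struct Row {
    uint64_t address;
    uint32_t line;       // unused when is_terminator is true
    bool is_terminator;
};

// Returns the line whose implied range [row, next row) contains pc,
// or 0 if no line covers pc. Rows must be sorted by address.
uint32_t LineForAddress(const std::vector<Row>& rows, uint64_t pc) {
    assert(std::is_sorted(rows.begin(), rows.end(),
        [](const Row& a, const Row& b) { return a.address < b.address; }));
    // Find the first row that starts after pc.
    auto next = std::upper_bound(rows.begin(), rows.end(), pc,
        [](uint64_t value, const Row& r) { return value < r.address; });
    // No row at or before pc, or no following row to bound the range.
    if (next == rows.begin() || next == rows.end())
        return 0;
    const Row& candidate = *(next - 1);
    return candidate.is_terminator ? 0 : candidate.line;
}
```

With the table above, an address such as 0x1800 in the gap after f resolves to line 6. If a terminator row is present at 0x1030, the same lookup finds no line there.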

There should be termination entries between the functions so that no line entry's address range extends across the gap between two functions. So you should actually have 3 sequences in the line table:

0x1000: foo.cpp line 4  // {
0x1010: foo.cpp line 5  //     return g(n) + 1;
0x1020: foo.cpp line 6  // }
0x1030: END

0x2000: foo.cpp line 10 // {
0x2010: foo.cpp line 11 //     return n+1;
0x2020: foo.cpp line 12 // }
0x2030: END

0x3000: foo.cpp line 17 // {
0x3010: foo.cpp line 18 //     return f(argc);
0x3020: foo.cpp line 19 // }
0x3030: END

0x1030, 0x2030 and 0x3030 are the end addresses of the functions f, g and main respectively. So if your line table only contains start addresses, you will need to inject these termination entries correctly. Otherwise source-level single stepping can do the wrong thing, since it uses line entry address ranges to implement the steps.
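A minimal sketch of injecting the end addresses might look like the following. FuncRange, Row, Sequence and BuildSequences are names invented for this example and are not LLDB API. It assumes the line entries are already sorted by address, that the function ranges are sorted and do not overlap, and that every line entry falls inside some known function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct FuncRange { uint64_t start; uint64_t end; };  // covers [start, end)
struct Row { uint64_t address; uint32_t line; bool is_terminator; };
using Sequence = std::vector<Row>;

// entries: start-address-only rows sorted by address, no terminators.
// funcs:   non-overlapping function ranges sorted by start address.
std::vector<Sequence> BuildSequences(const std::vector<Row>& entries,
                                     const std::vector<FuncRange>& funcs) {
    std::vector<Sequence> out;
    Sequence current;
    uint64_t current_end = 0;
    std::size_t f = 0;
    for (const Row& e : entries) {
        // Close the open sequence once an entry lies past its function.
        if (!current.empty() && e.address >= current_end) {
            current.push_back(Row{current_end, 0, true});
            out.push_back(std::move(current));
            current.clear();
        }
        if (current.empty()) {
            // Advance to the function that contains this entry.
            while (f < funcs.size() && funcs[f].end <= e.address)
                ++f;
            // Assumption: every entry is inside a known function.
            assert(f < funcs.size() && funcs[f].start <= e.address);
            current_end = funcs[f].end;
        }
        current.push_back(e);
    }
    if (!current.empty()) {
        current.push_back(Row{current_end, 0, true});
        out.push_back(std::move(current));
    }
    return out;
}
```

Fed the nine start-address rows from the example and the ranges [0x1000-0x1030), [0x2000-0x2030) and [0x3000-0x3030), this produces three sequences ending in terminators at 0x1030, 0x2030 and 0x3030. Each resulting sequence could then be submitted to the line table on its own.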

Greg