[Lldb-commits] [PATCH] D12291: Add split dwarf support to SymbolFileDWARF

Tue Aug 25 10:27:10 PDT 2015

tberghammer added a comment.

Sorry, first I manage to submit my comments without finishing them.

In http://reviews.llvm.org/D12291#232191, @clayborg wrote:

> I also question why Symbo
>
> In http://reviews.llvm.org/D12291#231523, @tberghammer wrote:
>
> > In the current version of the patch the compile units in the main object file hands out only the compile unit DIE with the information what is available in the main object file. I considered the other approach (hand out all DIEs by the DWARF compile unit in the main object file) but I dropped it for the following reasons:
> >
> > - If we hand out DIEs from a dwo symbol file then each DIE have to store a pointer to the symbol file (or compile unit) it belongs to what is a significant memory overhead (we have more DIEs then Symbols) if we ever want to store all DIE in memory. Even worse is that we have to change all name to DIE index to contain a pointer (compile unit / dwo symbol file) and an offset to find the DIE belongs to (compared to just an offset now) what is more entry then the number DIEs.
>
>
> Can't we just store the SymbolFile inside the compile unit? We always need the compile unit to really do anything with a DIE anyway. We currently store the DWO file inside the compile unit so it seems that we could just store it once in the compile unit and avoid any extra cost.

In the name to DIE indexes (in SymbolFileDWARF) currently we store only a DIE offset and we find the compile unit based on the fact that the DIE should be inside the range of the compile unit. The compile units leave in the dwo symbol files all start at address 0 so just a DIE offset isn't enough to find the compile unit. We can store the offset in 4 byte (we already do it, but I am not sure it is a good idea) and the compile unit index in another 4 byte what isn't a major overhead, but it can matter for large inferiors. Storing the symbol file  in the DIE might be avoidable but then the DIE have to ask the compile unit for the correct symbol file when somebody queries it for an attribute (we don't want the caller of the GetAttribute* function to know about dwo files).

> 

> 

> > - In an average debug session run from an IDE the user usually sets the breakpoints based on file name + line number, display some stack traces, some variables and do some stepping. If we can index each dwo file separately then we can handle all of these features without parsing the full debug info what can give us some significant speed benefits at debugger startup time.

> 

> 

> I don't see how we can ever just index one DWO file? If we index one, we must index them all within an executable otherwise the index will be incomplete. If you set a file + line breakpoint, you can't rely on the file matching the compile unit because there could be inlined functions so you would always need to index all of them. Likewise with setting a breakpoint by function name, you will need to index all DWO files.

I made a few measurements a few weeks ago and to set a file + line breakpoint we only need to parse the line table what is reasonably fast while parsing all DIEs is significantly slower. Setting a breakpoint based on function name require a full dwarf parsing, but if you use an IDE you almost never want to do it.

> Maybe we can:

> 

> - Have a new class that we hand out for a DIE, maybe named DWARFDIE that contains:

> 

>   ``` class DWARFDIE { DWARFCompileUnit *m_cu; DWARFDebugInfoEntry *m_die; }; ```

> 

>   Then change all of the places we currently use a "DWARFCompileUnit *cu, DWARFDebugInfoEntry* die" (we always pass them around together) to use a DWARFDIE instead. This allows us to store the DIEs efficiently, yet pass them around in a slightly larger container for usage. This would allow our memory usage to stay very close to where it is (an extra pointer in the compile unit).

> 

>   Then we modify DWARFCompileUnit to store the "SymbolFileDWARF *" that the compile unit comes from. We can still store the DWO file in the compile unit as well as you are already doing, we would just need to add a "SymbolFileDWARF *m_dwarf;" member variable for the non DWO case (and also for digging up the DW_TAG_compile_unit attributes that aren't in the DWO DW_TAG_compile_unit).

> 

>   Then we just make sure that all code that hands out DIEs actually hands out DWARFDIE instances instead of returning a "DWARFDebugInfoEntry *" and also having an out parameter that fills in the compile unit.

> 

>   Thoughts?

I like the idea about passing them around together especially as we already do it in a lot of case and it will have only a small overhead, but it don't help on the fact that the name to DIE indexes have to store a compile unit pointer (or a compile unit index). I am not sure how much we want to worry about the memory usage increase because I estimate it to be under 10% increase (without any measurements to prove it), but it will be significantly bigger then the effect of some changes in the Symbol class where you were quite concerned about increasing the size of it.

http://reviews.llvm.org/D12291