[lldb-dev] RFC for DWZ = DW_TAG_imported_unit + DWARF-5 supplementary files

Jan Kratochvil via lldb-dev lldb-dev at lists.llvm.org
Thu Aug 24 08:38:43 PDT 2017


On Wed, 23 Aug 2017 23:55:00 +0200, Greg Clayton wrote:
> > On Aug 23, 2017, at 2:06 PM, Jan Kratochvil via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> >  Currently it always has to as on non-OSX platforms it is using
> > DWARFCompileUnit::Index(). But as I plan to implement DWARF-5 .debug_names
> > index (like __apple_* index) maybe LLDB then no longer needs to populate
> > m_die_array and so just expanding all DW_TAG_partial_unit into a single
> > m_die_array for each DW_TAG_compile_unit is fine?
> 
> So I glossed over the documentation and I gathered that DWARF type info
> might be stored in other DWARF files and referenced from the current file.

Yes. Arbitrary DIEs, not just type-defining DIEs. According to the DWARF
standard DW_TAG_imported_unit can appear anywhere, but the DWZ/Fedora case is
a bit simpler: one can limit support to DW_TAG_imported_unit whose parent is
DW_TAG_compile_unit (or DW_TAG_partial_unit for nested imports).
DWZ only ever uses DW_TAG_imported_unit at the CU/PU top level.
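
A minimal sketch of that restriction, assuming a simplified DIE tree. The Die
struct and ImportsOnlyAtTopLevel() are hypothetical names for illustration,
not LLDB classes; only the DW_TAG_* numeric values come from the DWARF
standard.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified DIE model (not LLDB's DWARFDebugInfoEntry).
enum class Tag : std::uint16_t {
  compile_unit = 0x11,   // DW_TAG_compile_unit
  subprogram = 0x2e,     // DW_TAG_subprogram
  partial_unit = 0x3c,   // DW_TAG_partial_unit
  imported_unit = 0x3d,  // DW_TAG_imported_unit
};

struct Die {
  Tag tag;
  std::vector<Die> children;
};

// Returns true if every DW_TAG_imported_unit in the tree rooted at `die`
// is a direct child of a DW_TAG_compile_unit or DW_TAG_partial_unit.
bool ImportsOnlyAtTopLevel(const Die &die, bool parent_is_unit = false) {
  if (die.tag == Tag::imported_unit && !parent_is_unit)
    return false;
  const bool is_unit =
      die.tag == Tag::compile_unit || die.tag == Tag::partial_unit;
  for (const Die &child : die.children)
    if (!ImportsOnlyAtTopLevel(child, is_unit))
      return false;
  return true;
}
```

A reader that supports only the DWZ layout could run such a check once per
unit and fall back (or warn) when it fails.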


> SymbolFileDWARFDebugMap is an example of how we do things on MacOS. We have
> one clang::ASTContext in the SymbolFileDWARFDebugMap, and multiple external
> .o files (where each is a SymbolFileDWARF instance) that contain unlinked
> DWARF. Each SymbolFileDWARF instance will have:
> 
>   void SymbolFileDWARF::SetDebugMapModule(const lldb::ModuleSP &module_sp);
> 
> called to indicate it is actually part of the SymbolFileDWARFDebugMap.
> Then there are functions that check the debug map file and return the
> UniqueDWARFASTTypeMap or the TypeSystem from the SymbolFileDWARFDebugMap if
> we have one:

So, IIUC, you make a single context for all types from a compiled
program?  Primitive (non-class) types can differ across CUs:
	CU1: typedef int  foo_t;
	CU2: typedef long foo_t;

My problem is that one DW_TAG_partial_unit can be imported by multiple
DW_TAG_compile_units.  Therefore the DWARFCompileUnit for a DW_TAG_partial_unit
cannot map itself via SymbolFileDWARFDebugMap to its parent
DWARFCompileUnit for a DW_TAG_compile_unit (as it has multiple parents).
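
To illustrate the multiple-parents point, here is a rough sketch of a reverse
import index. ImportIndex, UnitOffset and the offsets in the usage note are
all hypothetical; nothing here is existing LLDB API. The only point is that
the lookup from an imported unit to its importers returns a set rather than
a single parent.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical: a section offset identifying a CU or PU.
using UnitOffset = std::uint64_t;

// Hypothetical reverse index: imported unit -> all units importing it.
class ImportIndex {
public:
  // Record that `importer` has a DW_TAG_imported_unit pointing at `imported`.
  void AddImport(UnitOffset importer, UnitOffset imported) {
    importers_[imported].insert(importer);
  }

  // All units that import `unit`; empty if nothing imports it.
  std::set<UnitOffset> ImportersOf(UnitOffset unit) const {
    auto it = importers_.find(unit);
    return it == importers_.end() ? std::set<UnitOffset>{} : it->second;
  }

private:
  std::map<UnitOffset, std::set<UnitOffset>> importers_;
};
```

With two CUs at (made-up) offsets 0x0b and 0x4c both importing a PU at 0x100,
ImportersOf(0x100) yields both CUs, so a single "parent" pointer stored in the
PU cannot represent the relationship.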

I expect you cannot link a single MacOS object file into two different
programs/libraries which are debugged at once by LLDB.


> One other idea is to keep all DWARF files separate and stand alone. Your
> main DWARF file with one or more DW_TAG_imported_unit and all
> DW_TAG_imported_unit referenced files, each as its own SymbolFileDWARF. Any
> reference to a DW_FORM_ref_alt would turn into a forward declaration in the
> current SymbolFileDWARF, so the ASTContext in each SymbolFileDWARF wouldn't
> know anything about the types,

Is it applicable even if DW_TAG_imported_unit points to DW_TAG_partial_units
containing DW_TAG_variable, DW_TAG_subprogram and arbitrary other DIEs, not
just types?


> The other approach I might suggest is to write a DWARF linker, maybe using
> LLVM's DWARF classes (see llvm-dsymutil sources) that takes the top level
> DWARF and all DW_TAG_imported_unit files and combines them all back into one
> large DWARF file.

That would defeat the advantage of DWZ-optimized debug info files.  DWZ
reduces their size by approx. 30%.  If the relinking were done on demand, it
would defeat the LLDB performance advantage over GDB - GDB can already read
DWZ natively (GDB does duplicate the internal representation when reading DWZ
CUs/PUs).  In that case I could just as well code such a reconstruction of
non-DWZ debug info when reading m_die_array - but that also seems needlessly
slow to me.  DWZ should even bring a performance advantage, as the DWZ common
file (= imported file) needs to be parsed only once.
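
The parse-once idea could look roughly like the cache below. It is only
a sketch under my own assumptions: CommonFileCache and ParsedUnits are
hypothetical names, the "parsing" is a placeholder, and the parse counter
exists only to make the sharing visible.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical placeholder for the parsed contents of a DWZ common file.
struct ParsedUnits {
  std::string path;
};

// Hypothetical cache: each common file is parsed at most once and the
// result is shared by every main file that imports from it.
class CommonFileCache {
public:
  std::shared_ptr<const ParsedUnits> Get(const std::string &path) {
    auto it = cache_.find(path);
    if (it != cache_.end())
      return it->second;  // already parsed, reuse the shared result
    ++parse_count_;       // the real DWARF parsing would happen here
    std::shared_ptr<const ParsedUnits> parsed =
        std::make_shared<ParsedUnits>(ParsedUnits{path});
    cache_.emplace(path, parsed);
    return parsed;
  }

  int parse_count() const { return parse_count_; }

private:
  std::map<std::string, std::shared_ptr<const ParsedUnits>> cache_;
  int parse_count_ = 0;
};
```

Two requests for the same path return the same object, and parse_count()
stays at 1.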


> One other idea is to let each DWARF file be separate, and when you need
> a type from a DW_TAG_imported_unit you load that file as stand alone and copy
> the type from its clang::ASTContext into the main SymbolFileDWARF's AST
> context. We copy types all the time in expressions as each one has its own
> AST context.

Does this work even for non-type DIEs?


> So there are many solutions. I would vote for linking the DWARF into
> a single file much like we do with llvm-dsymutil on Mac, but that really
> depends if the type uniquing is desired within a single DWARF file and not
> across many shared libraries that all reference common DW_TAG_imported_unit
> files.

I agree the DWZ optimization does primarily what -fdebug-types-section does.

I do not see why one would really relink the files on disk; a debugger should
not need that to read the debug info.

I do not fully understand why you do the llvm-dsymutil relinking when you
already have SymbolFileDWARFDebugMap in LLDB.  But I do not know OSX/Mac.


Thanks,
Jan

