[lldb-dev] RFC: DWZ = DW_TAG_imported_unit + separate DWARF common file

Jan Kratochvil via lldb-dev lldb-dev at lists.llvm.org
Wed Oct 30 04:33:44 PDT 2019


I would like to get a design approval before I do all remaining technical
cleanups of (3) below.

It was discussed before at:
	[lldb-dev] RFC for DWZ = DW_TAG_imported_unit + DWARF-5 supplementary files
	Message-ID: <20170823210611.GA8704 at host1.jankratochvil.net>

DWZ principle: a DW_TAG_compile_unit can use DW_TAG_imported_unit to "include"
another DW_TAG_partial_unit. This can happen transitively (recursively).
Unfortunately DW_TAG_partial_unit does not contain DW_AT_language, so for
GetLanguageType() LLDB needs to find the DW_TAG_compile_unit which included it
and read its DW_AT_language there. Two DW_TAG_compile_unit's with different
DW_AT_language can include (=DW_TAG_imported_unit) the same DW_TAG_partial_unit.
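
To make the language problem concrete, here is a minimal sketch of the
situation. It is a simplified model only: Unit, Language, CollectImported()
and LanguageFor() are hypothetical names for illustration and are not LLDB's
real classes or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified model; these are not LLDB's real classes.
enum class Language { C99, CPlusPlus };

struct Unit {
  std::optional<Language> language;   // DW_AT_language; absent on partial units
  std::vector<const Unit *> imports;  // targets of DW_TAG_imported_unit DIEs
};

// Collect every unit reachable from `unit` through DW_TAG_imported_unit,
// following imports transitively and visiting each unit only once.
void CollectImported(const Unit &unit, std::vector<const Unit *> &out) {
  for (const Unit *imported : unit.imports) {
    if (std::find(out.begin(), out.end(), imported) != out.end())
      continue;
    out.push_back(imported);
    CollectImported(*imported, out);
  }
}

// Language of DIEs in `unit` when they are reached through `main_cu`.
// Falls back to the including compile unit if `unit` has no DW_AT_language.
std::optional<Language> LanguageFor(const Unit &unit, const Unit &main_cu) {
  return unit.language ? unit.language : main_cu.language;
}
```

The same partial unit yields a different language depending on which compile
unit it was reached from, which is why the unit alone is not enough context.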

(1) Originally I tried to create 1:1 DWARFCompileUnit mapping to
    DW_TAG_partial_unit. This did not work as each DWARFCompileUnit gets
    mapped to CompileUnit and multiple CompileUnit's in the same block lead to:
        Assertion `!Parent || &Parent->getParentASTContext() == &Ctx' failed.

(2) Then I tried creating N:1 DWARFCompileUnit's for each DW_TAG_partial_unit,
    that is a 1:1 DWARFCompileUnit for each DW_TAG_imported_unit "inclusion".
    That worked perfectly but it was considered needlessly inefficient:
      [Lldb-commits] [PATCH] D45170: Cleanup DWARFCompileUnit and DWARFUnit in preparation for adding DWARFTypeUnit
    It was mapping each DW_TAG_partial_unit from its DWARFFileOffset to
    a new virtual dw_offset_t by MainCU_FileOffsetToCU():
    It needed to remap dw_offset_t back to DWARFFileOffset to read
    DWARFUnit::m_die_array without a data copy for each DW_TAG_imported_unit.
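
    The remapping idea of (2) can be sketched roughly as below. This is not
    the code from that patch; InclusionMap, AddInclusion() and ToFileOffset()
    are hypothetical names, and the layout of the virtual offsets is an
    assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using FileOffset = std::uint64_t;     // real offset in the DWARF file
using VirtualOffset = std::uint64_t;  // per-inclusion alias (hypothetical)

class InclusionMap {
public:
  // Virtual offsets are handed out starting at `first_free`, which is assumed
  // to lie past every real file offset.
  explicit InclusionMap(VirtualOffset first_free) : m_next(first_free) {}

  // Reserve a fresh virtual range for one inclusion of the partial unit
  // occupying [file_begin, file_begin + size) and return the range start.
  VirtualOffset AddInclusion(FileOffset file_begin, std::uint64_t size) {
    VirtualOffset begin = m_next;
    m_ranges[begin] = Range{file_begin, size};
    m_next += size;
    return begin;
  }

  // Translate a virtual offset back to the file offset it aliases, so the
  // DIE data can be shared instead of copied. Offsets outside every
  // reserved range are returned unchanged.
  FileOffset ToFileOffset(VirtualOffset v) const {
    auto it = m_ranges.upper_bound(v);
    if (it == m_ranges.begin())
      return v;
    --it;
    if (v - it->first >= it->second.size)
      return v;
    return it->second.file_begin + (v - it->first);
  }

private:
  struct Range {
    FileOffset file_begin;
    std::uint64_t size;
  };
  std::map<VirtualOffset, Range> m_ranges;
  VirtualOffset m_next;
};
```

    Two inclusions of the same partial unit get distinct virtual offsets but
    resolve to the same file data, at the cost of a lookup on every access.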

(3) My current patch proposal is to create a 1:1 DWARFPartialUnit (*1) mapping
    to DW_TAG_partial_unit. To support GetLanguageType() for DWARFPartialUnit
    one must track the original DWARFCompileUnit * (DW_TAG_compile_unit) in any
    DIERef or DWARFBaseDIE. DWARFPartialUnit itself can be used by multiple
    DWARFCompileUnit's. I have replaced 'DWARFUnit *' in many places by:
        class DWARFUnitPair { DWARFUnit *m_cu; DWARFCompileUnit *m_main_cu; };
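
A slightly expanded sketch of how such a pair could answer the language
question follows. Only the two member names come from the line above; the
stand-in DWARFUnit and DWARFCompileUnit structs, the LanguageType enum, the
accessors and the fallback order are assumptions for illustration, not the
code in the patch.

```cpp
#include <cassert>
#include <optional>

enum class LanguageType { Unknown, C99, CPlusPlus };

// Stand-ins for the LLDB classes of the same name, heavily simplified.
struct DWARFUnit {
  std::optional<LanguageType> language;  // DW_AT_language of the unit DIE
};
struct DWARFCompileUnit : DWARFUnit {};

class DWARFUnitPair {
public:
  DWARFUnitPair(DWARFUnit *cu, DWARFCompileUnit *main_cu)
      : m_cu(cu), m_main_cu(main_cu) {}

  DWARFUnit *GetCU() const { return m_cu; }
  DWARFCompileUnit *GetMainCU() const { return m_main_cu; }

  // Assumed lookup order: the unit's own DW_AT_language if present,
  // otherwise the language of the compile unit that included it.
  LanguageType GetLanguageType() const {
    if (m_cu && m_cu->language)
      return *m_cu->language;
    if (m_main_cu && m_main_cu->language)
      return *m_main_cu->language;
    return LanguageType::Unknown;
  }

private:
  DWARFUnit *m_cu;              // unit that physically contains the DIE
  DWARFCompileUnit *m_main_cu;  // DW_TAG_compile_unit it was reached from
};
```

With this shape one partial unit object can be shared, while each
(unit, main CU) pair still resolves to the right language.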

(*1) There is no DWARFPartialUnit class; DW_TAG_partial_unit is represented by
DWARFCompileUnit. During scanning of CUs it is too expensive to read the
first DIE to find out whether it is DW_TAG_compile_unit or DW_TAG_partial_unit.
DWARFPartialUnit would make sense with DWARF-5 DWZ DW_UT_partial but the
only existing DWZ tool does not support DWARF-5 yet.

Therefore AST contexts are made only for DW_TAG_compile_unit;
DW_TAG_partial_unit never has its own AST context. Also, any
DW_TAG_partial_unit DIEs are duplicated in the AST for each use of
DW_TAG_imported_unit. I hope I did not miss a more optimal possibility,
but I prefer to decompress DWZ at the DWARF level rather than at the AST level,
as the DWZ tool also does not interpret the DWARF data much when finding
common DWARF

Stays the same, as one instance per DW_TAG_partial_unit:
  DWARFDebugInfoEntry: no DWARFUnit reference, kept intact
    sizeof(): stays 16
Gets duplicated for each DW_TAG_imported_unit:
  DWARFBaseDIE: DWARFUnit *m_cu, DWARFDebugInfoEntry *m_die,
                now added also: DWARFCompileUnit *m_main_cu (in DWARFUnitPair)
    sizeof(): 16 -> 24 (new DWARFCompileUnit *m_main_cu)
    Only DWARFTypeUnit DIEs must have DWARFCompileUnit *m_main_cu as nullptr;
      they contain DW_AT_language so m_main_cu is not really needed for them.
  DIERef sizeof(): stays 8 (it now remembers parent DWARFCompileUnit's index)
Only for temporary variables:
  new DWARFSimpleDIE: DWARFUnit *m_cu, DWARFDebugInfoEntry *m_die
   - it is like original DWARFBaseDIE for a few cases (like Reference())
     where I need DWARFUnit *m_cu but DWARFCompileUnit *m_main_cu is not known.
   - DWARFSimpleDIE is now standalone, maybe DWARFBaseDIE could inherit it?
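
The 16 -> 24 growth of DWARFBaseDIE is just one extra pointer on a 64-bit
host. A minimal sketch of the layouts, using hypothetical struct names
(OldBaseDIE, UnitPair, NewBaseDIE) rather than the real LLDB declarations:

```cpp
#include <cassert>
#include <cstddef>

// Forward declarations only; the sketch needs pointers, not definitions.
struct DWARFUnit;
struct DWARFCompileUnit;
struct DWARFDebugInfoEntry;

// Layout before the change: two pointers.
struct OldBaseDIE {
  DWARFUnit *m_cu;
  DWARFDebugInfoEntry *m_die;
};

// Pair of the containing unit and the compile unit that included it.
struct UnitPair {
  DWARFUnit *m_cu;
  DWARFCompileUnit *m_main_cu;
};

// Layout after the change: the pair plus the DIE pointer, three pointers.
struct NewBaseDIE {
  UnitPair m_cu;
  DWARFDebugInfoEntry *m_die;
};

static_assert(sizeof(OldBaseDIE) == 2 * sizeof(void *),
              "old layout holds two pointers");
static_assert(sizeof(NewBaseDIE) == 3 * sizeof(void *),
              "new layout holds three pointers");
```

The assertions are written in terms of sizeof(void *) so they also hold on
32-bit hosts, where the sizes would be 8 and 12 instead of 16 and 24.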

Some functions needing parent DWARFCompileUnit *m_main_cu were moved from
DWARFUnit to DWARFCompileUnit: LookupAddress, AppendDIEsWithTag,
GetTypeSystem, GetLanguageType, GetUnitDIEOnly, DIE, GetDIE,

  git remote add jankratochvil git://git.jankratochvil.net/lldb
  git fetch jankratochvil
  git checkout jankratochvil/dwz
  git clone -b dwz git://git.jankratochvil.net/lldb
Full patch:
Patch part showing the usage of DWARFUnitPair:

