[PATCH] D26469: Improve DWARF parsing and attribute retrieval speed by improving DWARF abbreviation declarations.

Greg Clayton via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 9 13:39:22 PST 2016


clayborg created this revision.
clayborg added reviewers: aprantl, dblaikie, echristo, llvm-commits.
clayborg set the repository for this revision to rL LLVM.
Herald added subscribers: modocache, mgorny, mehdi_amini.

If you take a look at large DWARF files you will notice there are not that many abbreviation declarations. For a large binary that has debug information for LLVM, Clang and LLDB, there are only ~500 abbreviations. By adding some new data members to the DWARFAbbreviationDeclaration class, we can parse DWARF faster and also retrieve DWARF attribute values faster.

To speed up DWARF parsing we do a little extra work when parsing each DWARFAbbreviationDeclaration. We determine whether a DWARFAbbreviationDeclaration has a fixed byte size and remember how to calculate that size using a new DWARFAbbreviationDeclaration::FixedSizeInfo structure, which is stored as an optional value inside the DWARFAbbreviationDeclaration. When parsing the DIEs in a compile unit, we must extract DWARFDebugInfoEntryMinimal objects. In the extract function we check the DWARFAbbreviationDeclaration::FixedAttributeSize optional value to see if the byte size is fixed and, if so, we skip all attributes in one operation instead of iterating through all of the attribute/form pairs and skipping each one individually.
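
As a rough illustration of the idea (not the code in the patch), the fixed size can be recorded as a count of constant bytes plus counts of address-sized and offset-sized values, so the total is computed once the unit is known. The Form enum, UnitParams and computeFixedSize below are hypothetical, simplified stand-ins that cover only a handful of forms:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, reduced set of forms; real DWARF defines many more.
enum class Form { Data1, Data2, Data4, Data8, Addr, Strp, Udata };

// Stand-in for the parameters a DWARFUnit would supply (assumption).
struct UnitParams {
  uint8_t AddrSize;   // bytes in a target address
  uint8_t OffsetSize; // 4 for 32-bit DWARF, 8 for 64-bit DWARF
};

// Records how to compute the total size without knowing the unit yet.
struct FixedSizeInfo {
  uint16_t NumBytes = 0;  // bytes that never depend on the unit
  uint8_t NumAddrs = 0;   // number of address-sized values
  uint8_t NumOffsets = 0; // number of offset-sized values
  size_t byteSize(const UnitParams &U) const {
    return NumBytes + NumAddrs * U.AddrSize + NumOffsets * U.OffsetSize;
  }
};

// Returns std::nullopt as soon as a variable-length form is seen, in which
// case the caller has to skip the attributes one at a time.
std::optional<FixedSizeInfo> computeFixedSize(const std::vector<Form> &Forms) {
  FixedSizeInfo Info;
  for (Form F : Forms) {
    switch (F) {
    case Form::Data1: Info.NumBytes += 1; break;
    case Form::Data2: Info.NumBytes += 2; break;
    case Form::Data4: Info.NumBytes += 4; break;
    case Form::Data8: Info.NumBytes += 8; break;
    case Form::Addr:  ++Info.NumAddrs; break;
    case Form::Strp:  ++Info.NumOffsets; break;
    case Form::Udata: return std::nullopt; // ULEB128, length varies
    }
  }
  return Info;
}
```

With something like this, a DIE whose abbreviation has a FixedSizeInfo can be skipped by adding byteSize(U) to the current offset once, rather than visiting every attribute/form pair.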

To speed up attribute value extraction, we get rid of the DWARFFormValue::getFixedFormSizes(...) static function and store an optional byte size with each attribute/form pair. The old DWARFAbbreviationDeclaration::AttributeSpec was:

  class DWARFAbbreviationDeclaration {
    struct AttributeSpec {
      dwarf::Attribute Attr;
      dwarf::Form Form;
    };
  };

The new one adds an optional ByteSize:

  class DWARFAbbreviationDeclaration {
  public:
    struct AttributeSpec {
      dwarf::Attribute Attr;
      dwarf::Form Form;
      Optional<uint8_t> ByteSize;
    };
  };

This allows us to avoid calculating fixed form sizes each time we parse a DIE. Member functions were added to DWARFAbbreviationDeclaration, DWARFAbbreviationDeclaration::AttributeSpec and DWARFFormValue to centralize the information for each AttributeSpec and to be able to calculate the byte size, given a DWARFUnit, for a DWARFAbbreviationDeclaration as a whole and, if that fails, for each AttributeSpec individually. We also added a map to convert dwarf::Attribute enum values into attribute indexes.
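
To make the lookup concrete, here is a minimal standalone sketch (not the patch itself) of how a cached per-attribute ByteSize and an attribute-to-index map could be combined to find where an attribute's value starts. AbbrevDecl, findIndex and fixedOffsetOf are hypothetical names, std::optional stands in for llvm::Optional, and the enums are reduced to a few values:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Illustrative subsets of the attribute and form encodings.
enum Attribute : uint16_t {
  AT_name = 0x03, AT_low_pc = 0x11, AT_high_pc = 0x12, AT_language = 0x13
};
enum Form : uint16_t { FORM_data2 = 0x05, FORM_strp = 0x0e, FORM_udata = 0x0f };

struct AttributeSpec {
  Attribute Attr;
  Form F;
  std::optional<uint8_t> ByteSize; // cached when the abbreviation is parsed
};

// Hypothetical, simplified abbreviation declaration.
class AbbrevDecl {
  std::vector<AttributeSpec> Specs;
  std::map<Attribute, size_t> AttrToIndex;

public:
  void add(AttributeSpec S) {
    AttrToIndex.emplace(S.Attr, Specs.size());
    Specs.push_back(S);
  }

  // Index of attribute A within this abbreviation, if present.
  std::optional<size_t> findIndex(Attribute A) const {
    auto It = AttrToIndex.find(A);
    if (It == AttrToIndex.end())
      return std::nullopt;
    return It->second;
  }

  // Byte offset of A's value from the start of the DIE's attribute data.
  // Yields std::nullopt if A is absent or an earlier attribute has no cached
  // size; a real parser would then decode that earlier value to skip it.
  std::optional<size_t> fixedOffsetOf(Attribute A) const {
    std::optional<size_t> Idx = findIndex(A);
    if (!Idx)
      return std::nullopt;
    size_t Offset = 0;
    for (size_t I = 0; I < *Idx; ++I) {
      if (!Specs[I].ByteSize)
        return std::nullopt;
      Offset += *Specs[I].ByteSize;
    }
    return Offset;
  }
};
```

The point of the sketch is that the sizes are computed once per abbreviation rather than once per DIE, so a lookup for an attribute such as DW_AT_name becomes a map lookup plus a few additions.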

These changes improve DWARF parsing speed by around 7 percent. The test was done by parsing an LLDB build that contains full debug info for LLDB, Clang and LLVM, where we grab all compile units, extract all DIEs, traverse each DIE in the hierarchy and ask each one for its name by extracting the DW_AT_name attribute (if any) and extracting the DW_AT_low_pc attribute.

Previously there were no DWARF unittests that actually tested DWARF parsing. I have added a dwarf_gen::DWARFGenerator class that allows C++ code to easily create DWARF debug info and encode it into section data. Example code for generating DWARF:

  using namespace dwarf_gen;
  // Create a DWARF generator object
  DWARFGenerator Dwarf;
  // Create a compile unit with the specified DWARF version and address size
  CompileUnit &CU = Dwarf.appendCompileUnit(Version, AddrSize);
  
  // Append a few attributes to the compile unit's DIE:
  CU.Die.appendAttribute({DW_AT_name, DW_FORM_strp, "/tmp/main.c"});
  CU.Die.appendAttribute({DW_AT_language, DW_FORM_data2, DW_LANG_C});
  
  // Create a DW_TAG_subprogram DIE as a child of the compile unit DIE and
  // add some attributes to it
  DIE &SubprogramDie = CU.Die.appendChild(DW_TAG_subprogram);
  SubprogramDie.appendAttribute({DW_AT_name, DW_FORM_strp, "main"});
  SubprogramDie.appendAttribute({DW_AT_low_pc, DW_FORM_addr, 0x1000U});
  SubprogramDie.appendAttribute({DW_AT_high_pc, DW_FORM_addr, 0x2000U});
  
  // Create a DW_TAG_base_type type DIE as a child of the compile unit DIE and
  // add some attributes to it
  DIE &IntDie = CU.Die.appendChild(DW_TAG_base_type);
  IntDie.appendAttribute({DW_AT_name, DW_FORM_strp, "int"});
  IntDie.appendAttribute({DW_AT_encoding, DW_FORM_data1, DW_ATE_signed});
  IntDie.appendAttribute({DW_AT_byte_size, DW_FORM_data1, 4});
  
  // Create a DW_TAG_formal_parameter DIE as a child of the subprogram DIE and
  // add some attributes to it.
  DIE &ArgcDie = SubprogramDie.appendChild(DW_TAG_formal_parameter);
  ArgcDie.appendAttribute({DW_AT_name, DW_FORM_strp, "argc"});
  ArgcDie.appendAttribute({DW_AT_type, DW_FORM_ref4, &IntDie});
  
  // Generate the DWARF
  DWARFSections DwarfSections;
  Dwarf.generate(DwarfSections);
  
  // Now make a DWARFContextInMemory using the given section data that was
  // generated and use LLVM's DWARF API to extract info from it.
  DWARFContextInMemory dwarfContext(
      LittleEndian, AddrSize, DwarfSections.getDebugAbbrevData(),
      DwarfSections.getDebugInfoData(), DwarfSections.getDebugStrData());
  uint32_t NumCUs = dwarfContext.getNumCompileUnits();
  DWARFCompileUnit *U = dwarfContext.getCompileUnitAtIndex(0);
  DWARFDebugInfoEntryMinimal* Die = U->getUnitDIE(false);

The DWARF generator is a separate code base from the parser and that ensures that we don't end up with symmetric encode/decode errors.

A full suite of unit tests was added that tests decoding all DW_FORM_XXX values that we currently support using DWARF versions 2, 3, and 4. Tests were also added for parsing a known chunk of DWARF and ensuring that we can extract it and get the children and sibling DIEs as expected.
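
One reason testing each version matters is that a form's size can change between versions. For example, DW_FORM_ref_addr is the size of an address in DWARF 2 but the size of a section offset in DWARF 3 and later. A minimal sketch of that rule, where refAddrByteSize and DwarfFormat are hypothetical names and not part of this patch:

```cpp
#include <cstdint>

enum class DwarfFormat { DWARF32, DWARF64 };

// Size in bytes of a DW_FORM_ref_addr value for the given unit parameters.
inline uint8_t refAddrByteSize(uint16_t Version, uint8_t AddrSize,
                               DwarfFormat Format) {
  if (Version == 2)
    return AddrSize; // DWARF 2: same size as a target address
  // DWARF 3 and later: same size as a section offset
  return Format == DwarfFormat::DWARF64 ? 8 : 4;
}
```

A test that only generated DWARF 4 with 4-byte addresses would not notice a decoder that got the DWARF 2 case wrong, which is why the form tests are run for several version and address size combinations.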


Repository:
  rL LLVM

https://reviews.llvm.org/D26469

Files:
  include/llvm/DebugInfo/DWARF/DWARFAbbreviationDeclaration.h
  include/llvm/DebugInfo/DWARF/DWARFContext.h
  include/llvm/DebugInfo/DWARF/DWARFDebugInfoEntry.h
  include/llvm/DebugInfo/DWARF/DWARFFormValue.h
  lib/DebugInfo/DWARF/DWARFAbbreviationDeclaration.cpp
  lib/DebugInfo/DWARF/DWARFContext.cpp
  lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
  lib/DebugInfo/DWARF/DWARFFormValue.cpp
  lib/DebugInfo/DWARF/DWARFUnit.cpp
  unittests/DebugInfo/DWARF/CMakeLists.txt
  unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
  unittests/DebugInfo/DWARF/DWARFFormValueTest.cpp
  unittests/DebugInfo/DWARF/DWARFGenerator.cpp
  unittests/DebugInfo/DWARF/DWARFGenerator.h

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26469.77387.patch
Type: text/x-patch
Size: 88945 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161109/12c6d158/attachment-0001.bin>
