[llvm-dev] Reducing DWARF emitter memory consumption

Fri Feb 5 15:17:31 PST 2016

Hi all,

We have profiled [1] the memory usage in LLVM when LTO'ing Chromium, and
we've found that one of the top consumers of memory is the DWARF emitter in
lib/CodeGen/AsmPrinter/Dwarf*. I've been reading the DWARF emitter code and
I have a few ideas in mind for how to reduce its memory consumption. One
idea I've had is to restructure the emitter so that (for the most part) it
directly produces the bytes and relocations that need to go into the DWARF
sections without going through other data structures such as DIE and DIEValue.

I understand that the DWARF emitter needs to accommodate incomplete entities
that may be completed elsewhere during tree construction (e.g. abstract origins
for inlined functions, special members for types), so here's a quick high-level
sketch of the data structures that I believe could support this design:

struct DIEBlock;

// This would be used to represent things like DW_AT_type references to types
struct InternalReloc {
  size_t Offset; // offset within DIEBlock::Data
  DIEBlock *Target; // the offset within Target is at Data[Offset...Offset+Size]
};

// This would be used to represent things like pointers to .debug_loc/.debug_str
// or to functions/globals
struct ExternalReloc {
  size_t Offset; // offset within DIEBlock::Data
  MCSymbol *Target; // the offset within Target is at Data[Offset...Offset+Size]
};

struct DIEBlock {
  SmallVector<char, 1> Data;
  std::vector<InternalReloc> IntRelocs;
  std::vector<ExternalReloc> ExtRelocs;
  DIEBlock *Next = nullptr; // null marks the end of the list
};

struct DwarfBuilder {
  DIEBlock *First;
  DIEBlock *Cur;
  DenseMap<DISubprogram *, DIEBlock *> Subprograms;
  DenseMap<DIType *, DIEBlock *> Types;
  DwarfBuilder() : First(new DIEBlock), Cur(First) {}
  // builder implementation goes here...
};

Normally, the DwarfBuilder will just append bytes to Cur->Data (recording any
internal or external relocations in IntRelocs/ExtRelocs). If it ever needs to
leave a "gap" for an incomplete entity (e.g. at the end of a subprogram or a
struct type), it will create a new DIEBlock New, store it in Cur->Next, record
Cur in the DenseMap entry for that subprogram/type/etc., and then set Cur to
New. To fill the gap later, the DwarfBuilder can look up the saved DIEBlock in
the DenseMap and append to it. Once the IR has been fully visited, the debug
info writer will walk the linked list starting at First, calculate a byte
offset for each DIEBlock, apply any internal relocations, and write Data using
the AsmPrinter (e.g. using EmitBytes, or maybe some new interface that also
supports relocations and avoids copying).
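
To make the gap mechanism concrete, here is a minimal standalone sketch of the
split/fill/layout steps described above. It is an illustration under several
assumptions, not the proposed implementation: the names Block, IntReloc,
Builder, emitRef, openGap, gapFor and finish are hypothetical; std::vector and
std::unordered_map stand in for SmallVector and DenseMap; an opaque pointer key
stands in for DISubprogram*/DIType*; references are assumed to be 4 bytes in
host byte order; and external relocations and the AsmPrinter are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <unordered_map>
#include <vector>

struct Block;

// Assumed layout: the 4 bytes at Data[Offset] hold an offset within Target.
struct IntReloc {
  size_t Offset;
  Block *Target;
};

struct Block {
  std::vector<char> Data;
  std::vector<IntReloc> IntRelocs;
  Block *Next = nullptr;
  uint32_t Base = 0; // byte offset of this block, assigned in finish()
};

class Builder {
  std::vector<std::unique_ptr<Block>> Storage;    // owns every block
  std::unordered_map<const void *, Block *> Gaps; // stand-in for the DenseMaps
  Block *First;
  Block *Cur;

  Block *newBlock() {
    Storage.push_back(std::make_unique<Block>());
    return Storage.back().get();
  }

public:
  Builder() : First(newBlock()), Cur(First) {}

  Block *cur() { return Cur; }

  // Appends N raw bytes to the end of B.
  void emit(Block *B, const void *Bytes, size_t N) {
    const char *P = static_cast<const char *>(Bytes);
    B->Data.insert(B->Data.end(), P, P + N);
  }

  // Appends a 4-byte reference to (Target, OffsetInTarget) and records a
  // relocation so that finish() can add Target's final base offset.
  void emitRef(Block *B, Block *Target, uint32_t OffsetInTarget) {
    B->IntRelocs.push_back({B->Data.size(), Target});
    emit(B, &OffsetInTarget, sizeof(OffsetInTarget));
  }

  // Leaves a gap at the end of Cur: later output goes to a fresh block, and
  // the old block is remembered under Key so it can still grow.
  void openGap(const void *Key) {
    Block *New = newBlock();
    Cur->Next = New;
    Gaps[Key] = Cur;
    Cur = New;
  }

  // Returns the block whose tail is the gap registered for Key.
  Block *gapFor(const void *Key) { return Gaps.at(Key); }

  // Assigns byte offsets, applies internal relocations, and concatenates all
  // blocks. Intended to be called once, after all blocks are complete.
  std::vector<char> finish() {
    uint32_t Off = 0;
    for (Block *B = First; B; B = B->Next) {
      B->Base = Off;
      Off += static_cast<uint32_t>(B->Data.size());
    }
    std::vector<char> Out;
    for (Block *B = First; B; B = B->Next) {
      for (const IntReloc &R : B->IntRelocs) {
        uint32_t V;
        std::memcpy(&V, &B->Data[R.Offset], sizeof(V));
        V += R.Target->Base;
        std::memcpy(&B->Data[R.Offset], &V, sizeof(V));
      }
      Out.insert(Out.end(), B->Data.begin(), B->Data.end());
    }
    return Out;
  }
};
```

In this sketch, a caller would emit a subprogram's known bytes, call openGap
with the subprogram as the key, keep emitting into cur(), and later append the
missing children to gapFor(key). Because offsets are only assigned in finish(),
bytes added to a gap shift every later block without invalidating references
that were recorded earlier.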

Does that sound reasonable? Is there anything I haven't accounted for?

Thanks,
-- 
Peter

[1] https://code.google.com/p/chromium/issues/detail?id=583551#c15