[llvm-dev] [RFC] Lazy-loading of debug info metadata

Duncan P. N. Exon Smith via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 22 19:28:55 PDT 2016

I have some ideas to allow the BitcodeReader to lazy-load debug info
metadata, and wanted to air this on llvm-dev before getting too deep
into the code.


Based on some analysis Mehdi ran (ping him for details), there are three
(related) compile-time bottlenecks we're seeing with `-flto=thin -g`:

 a) Reading the large number of Metadata bitcode records in the global
    metadata block.  I'm talking about raw `BitStreamer` calls here.

 b) Creating unnecessary `DI*` instances (that aren't relevant to code).

 c) Emitting unnecessary `DI*` instances (that aren't relevant to code).

Here is my recollection of some peak memory stats on a small testcase
during thin-LTO, which should be a decent indicator of (b):

  - ~150MB: DILocation
  - ~100MB: DISubprogram
  - ~70MB: DILocalVariable
  - ~50MB: (cumulative) DIType descendants

It looks, surprisingly, like types are not the primary bottleneck.

There are caveats:

  - `DISubprogram` declarations -- member function descriptors -- are
    part of the type hierarchy.
  - Most of the type hierarchy gets uniqued at parse time.
  - As a result, these data are a poor indicator for (a).

Even so, non-types are substantial.
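
As a rough sanity check on the figures above, here is a small sketch that
sums the quoted numbers and computes each kind's share.  The helper names
(`approxPeakMB`, `totalMB`, `sharePercent`) are made up for illustration,
and the inputs are the recollected approximations from this mail, not new
measurements.

```cpp
#include <cassert>
#include <map>
#include <string>

// Approximate peak-memory figures quoted above, in MB (recollected values).
inline std::map<std::string, int> approxPeakMB() {
  return {{"DILocation", 150},
          {"DISubprogram", 100},
          {"DILocalVariable", 70},
          {"DIType descendants", 50}};
}

// Sum of all entries, in MB.
inline int totalMB(const std::map<std::string, int> &Stats) {
  int Sum = 0;
  for (const auto &KV : Stats)
    Sum += KV.second;
  return Sum;
}

// Share of the total for one kind, as an integer percentage (rounded down).
inline int sharePercent(const std::map<std::string, int> &Stats,
                        const std::string &Kind) {
  int Total = totalMB(Stats);
  assert(Total > 0 && "need at least one non-zero entry");
  return Stats.at(Kind) * 100 / Total;
}
```

With these inputs the four kinds sum to roughly 370MB, of which the type
hierarchy is about 13% and `DILocation` alone is about 40%.  The caveats
above still apply, since `DISubprogram` declarations count as part of the
type hierarchy.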

Related work

Teresa has some post-processing in-place/in-review to avoid importing
metadata unnecessarily, but IIUC: it won't address (a) and (b), only
(c) (maybe I'm wrong?); and it only helps -flto=thin, not other

I heard a rumour that Eric has a grand plan to factor away the type
hierarchy -- awesome if true -- but I think most of this is worthwhile


Short version

 1. Serialize metadata in Function blocks where possible.
 2. Reverse the `DISubprogram`/`DICompileUnit` link.
 3. Create a `METADATA_SUBPROGRAM_BLOCK`.

Type-related work Eric will make unnecessary if he's fast:

 4. Remove `DICompositeType`s from `retainedTypes:`, similar to (2).
 5. Create a `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3).

Long version

 1. If a piece of metadata is referenced from only a single `Function`,
    serialize that metadata in the function's metadata block instead of
    the global metadata block.

    This addresses problems (a) and (b), primarily targeting
    `DILocation`s.  It should pick up lots of other stuff, depending on
    how much inlining has happened.

    (I have a draft of the writer side, still working on the reader.)

 2. Reverse the `DISubprogram`/`DICompileUnit` link (David and I have
    talked about this in the past in barely-related threads).  The
    direct effect is that subprograms that are not pointed at by any
    code (!dbg attachments or @llvm.dbg.value intrinsics) get dropped.

    This addresses problem (c).  If a consumer is only linking/loading a
    subset of a module's functions, this naturally filters subprograms
    to the relevant ones.  Also, with limited inlining (and assuming
    (1)), it addresses problems (a) and (b), too.

    Adrian volunteered to implement this and is apparently almost ready
    to post a patch (I believe he is still working on the testcase
    update script logic and probably other details; don't let me
    oversell it).

 3. Create a special `METADATA_SUBPROGRAM_BLOCK` for each `DISubprogram`
    in the global metadata block.  Store the relevant `DISubprogram` and
    all of the subprogram's `DILexicalBlock`s and `DILocalVariable`s.
    The block can be lazy-loaded on an all-or-nothing basis.

    In combination with (2), this addresses (a) and (b) in cases that
    (1) doesn't catch.  A lazy-loading module will only load the
    subprogram blocks that get referenced.

    (I have a basic design for this that accounts for references into
    the middle of a block; I'll see what happens when I flesh it out.)
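
Step (1) above can be sketched as a partitioning pass over the metadata
graph.  This is a toy model, not the BitcodeWriter: nodes are plain
integers, and `Graph`, `GlobalBlock`, `reachable`, `assignBlocks` and
`isFunctionLocal` are hypothetical names.  It also assumes that
"referenced from only a single `Function`" can be approximated by
reachability from that function's attachments, which may differ from what
the real writer ends up doing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy model: metadata nodes are integers; Operands[N] lists what N points at.
using NodeId = int;
using Graph = std::map<NodeId, std::vector<NodeId>>;

constexpr int GlobalBlock = -1; // hypothetical marker: "global metadata block"

// Collect every node reachable from Roots (including the roots themselves).
inline std::set<NodeId> reachable(const Graph &Operands,
                                  const std::vector<NodeId> &Roots) {
  std::set<NodeId> Seen;
  std::vector<NodeId> Worklist(Roots.begin(), Roots.end());
  while (!Worklist.empty()) {
    NodeId N = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(N).second)
      continue; // already visited
    auto It = Operands.find(N);
    if (It != Operands.end())
      Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
  }
  return Seen;
}

// FunctionRoots[F]: nodes attached directly to function F's instructions.
// ModuleRoots: nodes referenced from module-level (named) metadata.
// Returns node -> index of the single function that owns it, or GlobalBlock.
inline std::map<NodeId, int>
assignBlocks(const Graph &Operands,
             const std::vector<std::vector<NodeId>> &FunctionRoots,
             const std::vector<NodeId> &ModuleRoots) {
  std::map<NodeId, int> Owner;
  for (NodeId N : reachable(Operands, ModuleRoots))
    Owner[N] = GlobalBlock;
  for (int F = 0; F < static_cast<int>(FunctionRoots.size()); ++F)
    for (NodeId N : reachable(Operands, FunctionRoots[F])) {
      auto Ins = Owner.insert({N, F});
      if (!Ins.second && Ins.first->second != F)
        Ins.first->second = GlobalBlock; // a second user: demote to global
    }
  return Owner;
}

// True if N was assigned to some function's block rather than the global one.
inline bool isFunctionLocal(const std::map<NodeId, int> &Owner, NodeId N) {
  assert(Owner.count(N) && "unknown node");
  return Owner.at(N) != GlobalBlock;
}
```

In this model a node shared by two functions (for example through
inlining) stays in the global block, which matches the remark that the
benefit depends on how much inlining has happened.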

I think this will solve the non-type bottlenecks.
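
To illustrate how (2) and (3) combine, here is a minimal sketch of an
all-or-nothing lazy loader.  Everything in it is a toy stand-in:
`SubprogramBlock` and `LazyDebugInfoLoader` are hypothetical names,
records are plain strings, and the map from functions to the subprograms
their code references is assumed to be known up front.  It is not the
BitcodeReader API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one METADATA_SUBPROGRAM_BLOCK: the subprogram plus its
// lexical blocks and local variables, stored as opaque record names.
struct SubprogramBlock {
  std::vector<std::string> Records;
};

class LazyDebugInfoLoader {
  // Unparsed blocks, keyed by subprogram name.
  std::map<std::string, SubprogramBlock> OnDisk;
  // Function -> subprograms referenced by its code (its own and inlined ones).
  std::map<std::string, std::vector<std::string>> FunctionRefs;
  std::set<std::string> Loaded;
  std::size_t RecordsParsed = 0;

public:
  LazyDebugInfoLoader(std::map<std::string, SubprogramBlock> Blocks,
                      std::map<std::string, std::vector<std::string>> Refs)
      : OnDisk(std::move(Blocks)), FunctionRefs(std::move(Refs)) {}

  // Materialize one function: load each referenced block once, as a whole.
  void materialize(const std::string &Function) {
    auto It = FunctionRefs.find(Function);
    if (It == FunctionRefs.end())
      return; // function has no debug info references
    for (const std::string &SP : It->second) {
      auto BlockIt = OnDisk.find(SP);
      if (BlockIt == OnDisk.end() || !Loaded.insert(SP).second)
        continue; // unknown block, or already loaded
      RecordsParsed += BlockIt->second.Records.size();
    }
    assert(Loaded.size() <= OnDisk.size() && "only on-disk blocks get loaded");
  }

  bool isLoaded(const std::string &SP) const { return Loaded.count(SP) != 0; }
  std::size_t recordsParsed() const { return RecordsParsed; }
};
```

Because nothing points at a subprogram except code, a block that no
materialized function references is never parsed, which is how this
sketch models the savings for (a) and (b).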

If Eric hasn't solved types by then, we can do similar things to the IR
for the debug info type hierarchy.

 4. Implement my proposal to remove the `DICompositeType` name map from


    Similar to (2) above, this will naturally filter the types that get
    linked in to the ones actually used by the code being linked.

    It should also allow the reader to skip records for types that have
    already been loaded in the main module.

 5. Create a special `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3) but
    for composite types and their members.  This avoids the raw bitcode
    reading overhead.  (This is totally undesigned at this point.)
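
The "skip records for types that have already been loaded" idea from (4)
can be sketched as a table keyed by a unique type identifier.  This is an
assumption-laden toy: `CompositeTypeRecord` and `TypeTable` are
hypothetical names, members are plain strings, and it assumes every
composite type carries a non-empty unique identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a composite type and the member records hanging off it.
struct CompositeTypeRecord {
  std::string Identifier;           // unique type identifier (assumed present)
  std::vector<std::string> Members; // member records that would need parsing
};

class TypeTable {
  std::map<std::string, std::vector<std::string>> Loaded;
  std::size_t MemberRecordsParsed = 0;

public:
  // Returns true if the record was parsed, false if it was skipped because
  // a type with the same identifier is already loaded.
  bool load(const CompositeTypeRecord &R) {
    assert(!R.Identifier.empty() && "toy model requires an identifier");
    if (Loaded.count(R.Identifier))
      return false; // already in the main module: skip the member records
    MemberRecordsParsed += R.Members.size();
    Loaded.emplace(R.Identifier, R.Members);
    return true;
  }

  std::size_t memberRecordsParsed() const { return MemberRecordsParsed; }
  std::size_t size() const { return Loaded.size(); }
};
```

Loading the same identifier from a second module returns false without
touching its members, so the cost of a duplicate type is one lookup.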
