[llvm-dev] [RFC] Lazy-loading of debug info metadata

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 23 07:17:36 PDT 2016


On Tue, Mar 22, 2016 at 7:28 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

> I have some ideas to allow the BitcodeReader to lazy-load debug info
> metadata, and wanted to air this on llvm-dev before getting too deep
> into the code.
>
> Motivation
> ==========
>
> Based on some analysis Mehdi ran (ping him for details), there are three
> (related) compile-time bottlenecks we're seeing with `-flto=thin -g`:
>
>  a) Reading the large number of Metadata bitcode records in the global
>     metadata block.  I'm talking about raw `BitStreamer` calls here.
>
>  b) Creating unnecessary `DI*` instances (that aren't relevant to code).
>

Creating in the source module, or in the dest module during linking?


>
>  c) Emitting unnecessary `DI*` instances (that aren't relevant to code).
>
> Here is my recollection of some peak memory stats on a small testcase
> during thin-LTO, which should be a decent indicator of (b):
>
>   - ~150MB: DILocation
>   - ~100MB: DISubprogram
>   - ~70MB: DILocalVariable
>   - ~50MB: (cumulative) DIType descendants
>
> It looks, surprisingly, like types are not the primary bottleneck.
>
> There are caveats:
>
>   - `DISubprogram` declarations -- member function descriptors -- are
>     part of the type hierarchy.
>   - Most of the type hierarchy gets uniqued at parse time.
>   - As a result, these data are a poor indicator for (a).
>
> Even so, non-types are substantial.
>
> Related work
> ============
>
> Teresa has some post-processing in-place/in-review to avoid importing
> metadata unnecessarily, but IIUC: it won't address (a) and (b), only
> (c) (maybe I'm wrong?); and it only helps -flto=thin, not other
> lazy-loaders.
>

That is D16440. It reduces the metadata imported into the dest module (not
sure whether that falls into (b) or just (c)).

It could actually help full LTO as well since I also added support for not
linking in unneeded DISubprogram for full LTO at the same time as ThinLTO
in r256003. But right now the changes in the patch are guarded so they only
happen under ThinLTO since some of the other things we prune from the
imported DICompileUnit only apply to ThinLTO. I could restructure that a
bit to get the reduced retained types importing to occur for full LTO as
well.


> I heard a rumour that Eric has a grand plan to factor away the type
> hierarchy -- awesome if true -- but I think most of this is worthwhile
> regardless.
>
> Proposal
> ========
>
> Short version
> -------------
>
>  1. Serialize metadata in Function blocks where possible.
>  2. Reverse the `DISubprogram`/`DICompileUnit` link.
>  3. Create a `METADATA_SUBPROGRAM_BLOCK`.
>
> Type-related work Eric will make unnecessary if he's fast:
>
>  4. Remove `DICompositeType`s from `retainedTypes:`, similar to (2).
>  5. Create a `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3).
>
> Long version
> ------------
>
>  1. If a piece of metadata is referenced from only a single `Function`,
>     serialize that metadata in the function's metadata block instead of
>     the global metadata block.
>
>     This addresses problems (a) and (b), primarily targeting
>     `DILocation`s.  It should pick up lots of other stuff, depending on
>     how much inlining has happened.
>
>     (I have a draft of the writer side, still working on the reader.)
>
>  2. Reverse the `DISubprogram`/`DICompileUnit` link (David and I have
>     talked about this in the past in barely-related threads).  The
>     direct effect is that subprograms that are not pointed at by any
>     code (!dbg attachments or @llvm.dbg.value intrinsics) get dropped.
>
>     This addresses problem (c).  If a consumer is only linking/loading a
>     subset of a module's functions, this naturally filters subprograms
>     to the relevant ones.  Also, with limited inlining (and assuming
>     (1)), it addresses problems (a) and (b), too.
>
>     Adrian volunteered to implement this and is apparently almost ready
>     to post a patch (still working on testcase update script logic I
>     believe (probably other details, don't let me oversell it)).
>

As noted in the review thread for my D16440, I'll need to adjust that
handling once this link reversal goes in.
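
To make the effect of the link reversal in (2) concrete, here is a minimal sketch on a toy graph. It assumes nothing about LLVM's real classes: `ToyGraph`, `reachableFrom`, `Example`, and `buildExample` are hypothetical names, and the graph only models "who points at whom". The roots are assumed to be the compile unit plus whatever the code references.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy metadata graph: node i references the nodes listed in edges[i].
// This is an illustration only, not LLVM's Metadata hierarchy.
struct ToyGraph {
  std::vector<std::vector<std::size_t>> edges;

  std::size_t addNode() {
    edges.emplace_back();
    return edges.size() - 1;
  }

  void addRef(std::size_t from, std::size_t to) {
    assert(from < edges.size() && to < edges.size());
    edges[from].push_back(to);
  }
};

// Collect every node reachable from the given roots (iterative DFS).
std::set<std::size_t> reachableFrom(const ToyGraph &g,
                                    const std::vector<std::size_t> &roots) {
  std::set<std::size_t> seen;
  std::vector<std::size_t> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::size_t n = worklist.back();
    worklist.pop_back();
    if (!seen.insert(n).second)
      continue; // already visited
    for (std::size_t next : g.edges[n])
      worklist.push_back(next);
  }
  return seen;
}

// One compile unit and two subprograms; only `usedSP` is referenced by code.
struct Example {
  ToyGraph g;
  std::size_t unit, usedSP, unusedSP;
};

// unitOwnsSubprograms == true models the unit listing its subprograms;
// false models the reversed link, where each subprogram points at its unit.
Example buildExample(bool unitOwnsSubprograms) {
  Example e;
  e.unit = e.g.addNode();
  e.usedSP = e.g.addNode();
  e.unusedSP = e.g.addNode();
  for (std::size_t sp : {e.usedSP, e.unusedSP}) {
    if (unitOwnsSubprograms)
      e.g.addRef(e.unit, sp);
    else
      e.g.addRef(sp, e.unit);
  }
  return e;
}
```

Calling `reachableFrom(e.g, {e.unit, e.usedSP})` on `buildExample(true)` includes `unusedSP`, because the unit drags in every subprogram. On `buildExample(false)` it does not, which is the "naturally filters subprograms" behaviour described above.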


>  3. Create a special `METADATA_SUBPROGRAM_BLOCK` for each `DISubprogram`
>     in the global metadata block.  Store the relevant `DISubprogram` and
>     all of the subprogram's `DILexicalBlock`s and `DILocalVariable`s.
>     The block can be lazy-loaded on an all-or-nothing basis.
>
>     In combination with (2), this addresses (a) and (b) in cases that
>     (1) doesn't catch.  A lazy-loading module will only load the
>     subprogram blocks that get referenced.
>

I'm not sure I understand this part - if the debug info for each subprogram
can be divided into separate blocks, why can't it be moved into the
function's metadata block? I.e. what happens for debug metadata that is
referenced by multiple functions, which I thought was all that was going to
remain in the global metadata block? Oh - the DISubprogram may be
referenced in other places within the global metadata so cannot move into
the function metadata block. So debug metadata only reached from that
DISubprogram is included in its block, but any debug metadata referenced by
multiple DISubprograms would not be located within one of these blocks?
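
If that reading is right, the placement rule could be sketched roughly as follows. This is only my interpretation written as a toy, not the proposed design: `assignBlocks`, `kGlobalBlock`, and `kUnreached` are hypothetical names, and the edges are simplified to point from a subprogram down to its contents (the real references between subprograms, scopes, and variables are more involved).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

constexpr long kUnreached = -2;   // not reachable from any subprogram
constexpr long kGlobalBlock = -1; // reachable from more than one subprogram

// edges[i] lists the nodes that node i references; `subprograms` holds the
// ids of the subprogram nodes. Returns, for every node, the id of the single
// subprogram whose block would hold it, kGlobalBlock, or kUnreached.
std::vector<long> assignBlocks(
    const std::vector<std::vector<std::size_t>> &edges,
    const std::vector<std::size_t> &subprograms) {
  // owners[n] = set of subprograms from which node n is reachable.
  std::vector<std::set<std::size_t>> owners(edges.size());
  for (std::size_t sp : subprograms) {
    assert(sp < edges.size());
    std::vector<std::size_t> worklist{sp};
    while (!worklist.empty()) {
      std::size_t n = worklist.back();
      worklist.pop_back();
      if (!owners[n].insert(sp).second)
        continue; // already visited from this subprogram
      for (std::size_t next : edges[n]) {
        assert(next < edges.size());
        worklist.push_back(next);
      }
    }
  }

  std::vector<long> block(edges.size(), kUnreached);
  for (std::size_t n = 0; n < edges.size(); ++n) {
    if (owners[n].size() == 1)
      block[n] = static_cast<long>(*owners[n].begin());
    else if (owners[n].size() > 1)
      block[n] = kGlobalBlock;
  }
  return block;
}
```

For example, with two subprograms (nodes 0 and 1), one local variable each (nodes 2 and 3), and a type they both use (node 4), the variables land in their own subprogram's block while the shared type stays in the global block.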


>     (I have a basic design for this that accounts for references into
>     the middle of a block; I'll see what happens when I flesh it out.)
>
> I think this will solve the non-type bottlenecks.
>
> If Eric hasn't solved types by then, we can do similar things to the IR
> for the debug info type hierarchy.
>
>  4. Implement my proposal to remove the `DICompositeType` name map from
>     `retainedTypes:`.
>
>     http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160125/327936.html
>
>     Similar to (2) above, this will naturally filter the types that get
>     linked in to the ones actually used by the code being linked.
>
>     It should also allow the reader to skip records for types that have
>     already been loaded in the main module.
>

The ValueMapper or something will need to figure out which types referenced
by UUID to map/link in to the dest module. Currently the ValueMapper does
not follow UUID references, but these are brought in when the DICompileUnit
is mapped since they are all in the retained types list.
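
As a rough illustration of what "following UUID references" would involve, here is a toy worklist that resolves string identifiers through an index while collecting the types to link. It is a sketch under my own assumptions, not how the ValueMapper works: `ToyType`, `typesToLink`, and the field names are hypothetical, and the index stands in for whatever lookup would replace the retained types list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy type node. `directRefs` are ordinary node references; `idRefs` are
// references by string identifier (a stand-in for UUID-style type refs).
struct ToyType {
  std::string identifier; // empty if the type has no identifier
  std::vector<std::size_t> directRefs;
  std::vector<std::string> idRefs;
};

// Returns the indices of all nodes needed by `roots`, following both direct
// references and identifier references resolved through an index.
std::set<std::size_t> typesToLink(const std::vector<ToyType> &types,
                                  const std::vector<std::size_t> &roots) {
  // Build the identifier -> node index once up front.
  std::map<std::string, std::size_t> byId;
  for (std::size_t i = 0; i < types.size(); ++i)
    if (!types[i].identifier.empty())
      byId[types[i].identifier] = i;

  std::set<std::size_t> needed;
  std::vector<std::size_t> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::size_t n = worklist.back();
    worklist.pop_back();
    assert(n < types.size());
    if (!needed.insert(n).second)
      continue; // already collected
    for (std::size_t r : types[n].directRefs)
      worklist.push_back(r);
    for (const std::string &id : types[n].idRefs) {
      auto it = byId.find(id);
      if (it != byId.end()) // unresolved identifiers are simply skipped
        worklist.push_back(it->second);
    }
  }
  return needed;
}
```

Starting from only the nodes the linked code uses, identified types that nothing refers to are never visited, so they would not be brought into the dest module.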


>
>  5. Create a special `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3) but
>     for composite types and their members.  This avoids the raw bitcode
>     reading overhead.  (This is totally undesigned at this point.)
>

Ditto here - any metadata referenced by multiple composite types does not
go into a block, right?

Thanks,
Teresa


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413