[llvm-dev] [RFC] Lazy-loading of debug info metadata
Eric Christopher via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 22 20:11:25 PDT 2016
On Tue, Mar 22, 2016 at 8:04 PM David Blaikie <dblaikie at gmail.com> wrote:
> +pcc, who had some other ideas/patch out for improving memory usage of
> debug info
> +Reid, who's responsible for the windows/CodeView/PDB debug info which is
> motivating some of the ideas about changes to type emission
>
>
So I discussed this with Adrian and Mehdi at the social last Thursday and
I'm getting set to finish the write-up. I think it'll have some bearing on
this proposal, as it'll change how we want to look at the format of
DISubprogram metadata. That said, most of it is orthogonal to the changes
Duncan is talking about here. It just puts the pressure on to get the other
proposal written up.
-eric
> So how does this relate, or not, to Peter's (pcc) work trying to reduce
> the DIE overhead during code gen? Are you folks chasing different memory
> bottlenecks? Are they both relevant (perhaps in different scenarios)?
>
> Baking into the IR more about types as units has pretty direct overlap
> with Reid/CodeView/etc - so, yeah, that'll take some discussion (but, as
> you say, it's not in your immediate plan anyway, so we can come back to
> that - but would be good for whoever gets there first to discuss it with
> the others)
>
> On Tue, Mar 22, 2016 at 7:28 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
>
>> I have some ideas to allow the BitcodeReader to lazy-load debug info
>> metadata, and wanted to air this on llvm-dev before getting too deep
>> into the code.
>>
>> Motivation
>> ==========
>>
>> Based on some analysis Mehdi ran (ping him for details), there are three
>> (related) compile-time bottlenecks we're seeing with `-flto=thin -g`:
>>
>> a) Reading the large number of Metadata bitcode records in the global
>> metadata block. I'm talking about raw `BitStreamer` calls here.
>>
>> b) Creating unnecessary `DI*` instances (that aren't relevant to code).
>>
>> c) Emitting unnecessary `DI*` instances (that aren't relevant to code).
>>
>> Here is my recollection of some peak memory stats on a small testcase
>> during thin-LTO, which should be a decent indicator of (b):
>>
>> - ~150MB: DILocation
>> - ~100MB: DISubprogram
>> - ~70MB: DILocalVariable
>> - ~50MB: (cumulative) DIType descendants
>>
>> It looks, surprisingly, like types are not the primary bottleneck.
>>
>> There are caveats:
>>
>> - `DISubprogram` declarations -- member function descriptors -- are
>> part of the type hierarchy.
>> - Most of the type hierarchy gets uniqued at parse time.
>> - As a result, these data are a poor indicator for (a).
>>
>> Even so, non-types are substantial.
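As a quick sanity check on the figures quoted above, the sketch below simply adds them up. The numbers are the approximate ones from the message. The split into "type" and "non-type" is a simplification introduced here, and per the caveats it is blurry, since `DISubprogram` declarations live in the type hierarchy.

```python
# Approximate peak-memory figures quoted in the message, in MB.
peak_mb = {
    "DILocation": 150,
    "DISubprogram": 100,
    "DILocalVariable": 70,
    "DIType descendants": 50,
}

total = sum(peak_mb.values())
# Assumption: treat everything except the DIType bucket as "non-type".
non_type = total - peak_mb["DIType descendants"]
non_type_share = non_type / total

print(f"non-type: ~{non_type} MB of ~{total} MB ({non_type_share:.0%})")
```

Under that rough split, the non-type buckets account for most of the measured debug info memory, which is consistent with the claim that types are not the primary bottleneck in this testcase.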
>>
>> Related work
>> ============
>>
>> Teresa has some post-processing in-place/in-review to avoid importing
>> metadata unnecessarily, but IIUC: it won't address (a) and (b), only
>> (c) (maybe I'm wrong?); and it only helps -flto=thin, not other
>> lazy-loaders.
>>
>> I heard a rumour that Eric has a grand plan to factor away the type
>> hierarchy -- awesome if true -- but I think most of this is worthwhile
>> regardless.
>>
>> Proposal
>> ========
>>
>> Short version
>> -------------
>>
>> 1. Serialize metadata in Function blocks where possible.
>> 2. Reverse the `DISubprogram`/`DICompileUnit` link.
>> 3. Create a `METADATA_SUBPROGRAM_BLOCK`.
>>
>> Type-related work Eric will make unnecessary if he's fast:
>>
>> 4. Remove `DICompositeType`s from `retainedTypes:`, similar to (2).
>> 5. Create a `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3).
>>
>> Long version
>> ------------
>>
>> 1. If a piece of metadata is referenced from only a single `Function`,
>> serialize that metadata in the function's metadata block instead of
>> the global metadata block.
>>
>> This addresses problems (a) and (b), primarily targeting
>> `DILocation`s. It should pick up lots of other stuff, depending on
>> how much inlining has happened.
>>
>> (I have a draft of the writer side, still working on the reader.)
>>
>> 2. Reverse the `DISubprogram`/`DICompileUnit` link (David and I have
>> talked about this in the past in barely-related threads). The
>> direct effect is that subprograms that are not pointed at by any
>> code (!dbg attachments or @llvm.dbg.value intrinsics) get dropped.
>>
>> This addresses problem (c). If a consumer is only linking/loading a
>> subset of a module's functions, this naturally filters subprograms
>> to the relevant ones. Also, with limited inlining (and assuming
>> (1)), it addresses problems (a) and (b), too.
>>
>> Adrian volunteered to implement this and is apparently almost ready
>> to post a patch (still working on testcase update script logic I
>> believe (probably other details, don't let me oversell it)).
>>
>> 3. Create a special `METADATA_SUBPROGRAM_BLOCK` for each `DISubprogram`
>> in the global metadata block. Store the relevant `DISubprogram` and
>> all of the subprogram's `DILexicalBlock`s and `DILocalVariable`s.
>> The block can be lazy-loaded on an all-or-nothing basis.
>>
>> In combination with (2), this addresses (a) and (b) in cases that
>> (1) doesn't catch. A lazy-loading module will only load the
>> subprogram blocks that get referenced.
>>
>> (I have a basic design for this that accounts for references into
>> the middle of a block; I'll see what happens when I flesh it out.)
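To make the rule in step 1 concrete, here is a toy model of the partitioning, not LLVM code. It assumes "referenced from only a single `Function`" means transitively reachable from exactly one function and not from any module-level root. That reading, and the names `partition_metadata`, `operands`, `function_refs` and `global_roots`, are illustrative assumptions rather than anything in the bitcode writer.

```python
from collections import defaultdict


def partition_metadata(operands, function_refs, global_roots=()):
    """Split nodes into per-function blocks and a global block.

    operands: dict mapping node -> list of nodes it points at.
    function_refs: dict mapping function name -> nodes it references directly.
    global_roots: nodes referenced from module level (hypothetical stand-in
        for things like named metadata).
    """
    def reachable(roots):
        seen, stack = set(), list(roots)
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(operands.get(node, ()))
        return seen

    # Record which functions can reach each node.
    users = defaultdict(set)
    for fn, roots in function_refs.items():
        for node in reachable(roots):
            users[node].add(fn)

    # Anything reachable from a module-level root must stay global.
    pinned = reachable(global_roots)
    per_function = defaultdict(set)
    global_block = set(pinned)
    for node, fns in users.items():
        if node in pinned or len(fns) != 1:
            global_block.add(node)
        else:
            (owner,) = fns
            per_function[owner].add(node)
    return dict(per_function), global_block


# Two functions with their own locations and subprograms, sharing one type.
operands = {
    "loc1": ["sp_f"], "loc2": ["sp_f"], "loc3": ["sp_g"],
    "sp_f": ["int"], "sp_g": ["int"], "cu": [],
}
function_refs = {"f": ["loc1", "loc2"], "g": ["loc3"]}
per_function, global_block = partition_metadata(operands, function_refs, ["cu"])
```

In this model the global block is closed under operands: anything a shared node points at is itself shared, so a global record never needs to refer into a function-local block.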
>>
>> I think this will solve the non-type bottlenecks.
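The combination of steps 2 and 3 can be pictured with another toy model, again not LLVM code. Once subprograms are found only through code (step 2), a lazy loader needs to parse just the per-subprogram blocks that the functions it materializes point at (step 3), each on an all-or-nothing basis. The class and function names below, and the shape of `dbg_attachments`, are hypothetical.

```python
class LazySubprogramBlocks:
    """Toy loader: one block per subprogram, parsed only on first use."""

    def __init__(self, blocks):
        # blocks: dict mapping subprogram name -> callable that "parses" the
        # block and returns its records (subprogram, scopes, local variables).
        self._blocks = blocks
        self._loaded = {}

    def get(self, subprogram):
        # All-or-nothing: the whole block is parsed the first time any part
        # of it is requested, then cached.
        if subprogram not in self._loaded:
            self._loaded[subprogram] = self._blocks[subprogram]()
        return self._loaded[subprogram]

    def loaded(self):
        return sorted(self._loaded)


def load_for_functions(loader, dbg_attachments, functions):
    """Load only the subprogram blocks referenced by the chosen functions."""
    for fn in functions:
        for subprogram in dbg_attachments.get(fn, ()):
            loader.get(subprogram)
    return loader.loaded()


blocks = {
    name: (lambda n=name: [n, n + ".scope", n + ".var"])
    for name in ("sp_f", "sp_g", "sp_h")
}
# f references sp_g as well, as it would after g was inlined into f.
dbg_attachments = {"f": ["sp_f", "sp_g"], "g": ["sp_g"], "h": ["sp_h"]}

loader = LazySubprogramBlocks(blocks)
loaded = load_for_functions(loader, dbg_attachments, ["f"])
```

Importing only `f` parses the blocks for `sp_f` and `sp_g` and never touches `sp_h`. The sketch ignores references into the middle of a block, which the proposal calls out as needing separate design work.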
>>
>> If Eric hasn't solved types by then, we can do similar things to the IR
>> for the debug info type hierarchy.
>>
>> 4. Implement my proposal to remove the `DICompositeType` name map from
>> `retainedTypes:`.
>>
>>
>> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160125/327936.html
>>
>> Similar to (2) above, this will naturally filter the types that get
>> linked in to the ones actually used by the code being linked.
>>
>> It should also allow the reader to skip records for types that have
>> already been loaded in the main module.
>>
>> 5. Create a special `METADATA_COMPOSITE_TYPE_BLOCK`, similar to (3) but
>> for composite types and their members. This avoids the raw bitcode
>> reading overhead. (This is totally undesigned at this point.)
>>
>>
>>
>