[llvm-dev] [RFC] Lazy-loading of debug info metadata

Tue Mar 29 19:11:31 PDT 2016

I have no objections to any of this FWIW :)

-eric

On Tue, Mar 29, 2016 at 6:46 PM Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2016-Mar-22, at 20:11, Eric Christopher <echristo at gmail.com> wrote:
> >
> >> On Tue, Mar 22, 2016 at 8:04 PM David Blaikie <dblaikie at gmail.com>
> wrote:
> >> +pcc, who had some other ideas/patch out for improving memory usage of
> debug info
> >> +Reid, who's responsible for the windows/CodeView/PDB debug info which
> is motivating some of the ideas about changes to type emission
> >
> > So I discussed this with Adrian and Mehdi at the social last Thursday
> and I'm getting set to finish the write up. I think it'll have some bearing
> on this proposal as I think it'll change how we want to take a look at the
> format of DISubprogram metadata a bit more.
>
> (The interesting bit here is to make a clearer split between
> DISubprogram declarations (part of the type hierarchy) and
> DISubprogram definitions (part of the code/line table/variable
> locations).  I think that'll end up being mostly orthogonal to what
> I'm trying to do.)
>
> > That said, most of it is orthogonal to the changes Duncan is talking
> about here. Just puts the pressure on to get the other proposal written up.
>
> Which is now here:
> http://lists.llvm.org/pipermail/llvm-dev/2016-March/097773.html
>
> >> Baking into the IR more about types as units has pretty direct overlap
> with Reid/CodeView/etc - so, yeah, that'll take some discussion (but, as
> you say, it's not in your immediate plan anyway, so we can come back to
> that - but would be good for whoever gets there first to discuss it with
> the others)
>
> After thinking for a few days, I don't think this will bake anything
> new into the IR.  If anything it removes a few special cases.
>
> More at the bottom.
>
> >>> Motivation
> >>> ==========
> >>>
> >>> Based on some analysis Mehdi ran (ping him for details), there are
> three
> >>> (related) compile-time bottlenecks we're seeing with `-flto=thin -g`:
> >>>
> >>>  a) Reading the large number of Metadata bitcode records in the global
> >>>     metadata block.  I'm talking about raw `BitStreamer` calls here.
> >>>
> >>>  b) Creating unnecessary `DI*` instances (that aren't relevant to
> code).
> >>>
> >>>  c) Emitting unnecessary `DI*` instances (that aren't relevant to
> code).
> >>>
> >>> Here is my recollection of some peak memory stats on a small testcase
> >>> during thin-LTO, which should be a decent indicator of (b):
> >>>
> >>>   - ~150MB: DILocation
> >>>   - ~100MB: DISubprogram
> >>>   - ~70MB: DILocalVariable
> >>>   - ~50MB: (cumulative) DIType descendants
> >>>
> >>> It looks, surprisingly, like types are not the primary bottleneck.
>
> (Probably wrong for (a), BTW.  Caveats matter.)
>
> >>> There are caveats:
> >>>
> >>>   - `DISubprogram` declarations -- member function descriptors -- are
> >>>     part of the type hierarchy.
> >>>   - Most of the type hierarchy gets uniqued at parse time.
> >>>   - As a result, these data are a poor indicator for (a).
>
> ((a) is the main bottleneck for compile-time of -flto=thin (since it's
> quadratic in the number of files).  (b) only affects memory.  Also
> important, but at least it scales linearly.)
>
> >>> Even so, non-types are substantial.
> >>>
> >>> Proposal
> >>> ========
> >>>
> >>> Short version
> >>> -------------
> >>>
> >>>  4. Remove `DICompositeType`s from `retainedTypes:`, similar to (2).
>
> This is the part that's relevant to the new RFC Eric just posted.
>
> >>> Long version
> >>> -------------
> >>>
> >>>  4. Implement my proposal to remove the `DICompositeType` name map from
> >>>     `retainedTypes:`.
> >>>
> >>>
> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160125/327936.html
> >>>
> >>>     Similar to (2) above, this will naturally filter the types that get
> >>>     linked in to the ones actually used by the code being linked.
> >>>
> >>>     It should also allow the reader to skip records for types that have
> >>>     already been loaded in the main module.
>
> The essential things I want to accomplish with this part:
>
>   - Make `type:` operands less special: instead of referencing types
>     indirectly through MDString, point directly at the type node.
>
>   - Stop using `retainedTypes:` by default (only for -gfull, etc.).
>
>   - Avoid blowing up memory in -flto=full (which converting MDString
>     references back to pointers would do naively, through
>     re-introducing cycles).  Note that this needs to be handled
>     somehow at BitcodeReader time.
>
> After chatting with Eric, I don't think this conflicts at all with the
> other RFC.  Unifying the `type:` operands might actually help both.
>
> One good point David mentioned last week was that we don't want to
> teach the IR any more about types.  Rather than inventing some new
> context (as I suggested originally), I figure LTO plugins can just
> pass a (StringRef -> DIType*) map to the BitcodeReader, and the Module
> itself doesn't need to know anything about it.
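
A rough sketch of the (StringRef -> DIType*) map idea from the last
paragraph, for illustration only. Everything here is a hypothetical
stand-in rather than LLVM API: `DIType` is a toy struct, `SketchContext`
plays the role of the context that owns metadata nodes, `SketchReader`
plays the role of the BitcodeReader, and `std::string_view` stands in for
StringRef. The point is that the caller owns the map, each reader
consults it before materializing a type record, and the module never sees
it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for llvm::DIType; not the real class.
struct DIType {
  std::string Identifier; // unique type identifier (e.g. a mangled name)
  std::string Name;
};

// Hypothetical stand-in for the context that owns all metadata nodes.
struct SketchContext {
  std::vector<std::unique_ptr<DIType>> Nodes;
};

// Stand-in for the (StringRef -> DIType*) map owned by the LTO plugin.
using TypeMap = std::map<std::string, DIType *>;

// Hypothetical reader: borrows the caller's context and type map.
class SketchReader {
  SketchContext &Ctx;
  TypeMap &KnownTypes;
  unsigned Created = 0;

public:
  SketchReader(SketchContext &C, TypeMap &Known)
      : Ctx(C), KnownTypes(Known) {}

  // Returns the node for Id. If another reader already loaded it, the
  // existing node is reused and this record is effectively skipped.
  DIType *getOrCreateType(std::string_view Id, std::string_view Name) {
    assert(!Id.empty() && "only identified types can be shared");
    auto It = KnownTypes.find(std::string(Id));
    if (It != KnownTypes.end())
      return It->second;

    Ctx.Nodes.push_back(std::make_unique<DIType>(
        DIType{std::string(Id), std::string(Name)}));
    DIType *T = Ctx.Nodes.back().get();
    KnownTypes.emplace(std::string(Id), T);
    ++Created;
    return T;
  }

  unsigned numCreated() const { return Created; }
};
```

In this sketch, two readers constructed over the same `SketchContext` and
`TypeMap` return the same pointer for the same identifier, and only the
first one allocates a node. How the real reader would avoid
re-introducing cycles in -flto=full is not modelled here.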