[llvm-dev] [ThinLTO] Reducing imported debug metadata

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 7 11:09:05 PST 2016


On Wed, Dec 7, 2016 at 11:07 AM Teresa Johnson <tejohnson at google.com> wrote:

> David and I crossed wires! Some more comments below. Thanks, Teresa
>
> On Wed, Dec 7, 2016 at 11:02 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Wed, Dec 7, 2016 at 10:36 AM Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
>
>
> > On 2016-Dec-07, at 10:14, Teresa Johnson <tejohnson at google.com> wrote:
> >
> > A couple weeks ago I sat down with David Blaikie to figure out what we
> could trim when importing debug metadata during function importing. I have
> a prototype that reduces the resulting .o sizes significantly with ThinLTO
> -g. However, I ran into some issues when cleaning up part of this in
> preparation for a patch. Since Mehdi indicated that he and Adrian were
> going to be looking at this as well shortly, I wanted to send out a summary
> of what David Blaikie and I discussed, what I have right now, and the
> issues I am hitting along with possible solutions.
> >
> > There was a related older patch where I did some of the same things
> (D16440: [ThinLTO] Link in only necessary DICompileUnit operands), but that
> was made obsolete by a number of changes, such as the reversal of
> GlobalVariable/DIGlobalVariable edges, and a number of debug metadata
> changes done by Duncan (and Adrian I think?). So I started from scratch
> this time.
> >
> > Here's what I came up with after discussing what to import with David.
> Specifically, how to handle fields when importing a DICompileUnit:
>
> Do you have a profile of which (if any) of these is causing the debug info
> bloat?
>
> I worry/suspect that much of the bloat could be in the DILocation chains
> (line-table), which are generally non-mergeable.
>
> > a) List of enum types
> >  - Not needed in the importing module: change to nullptr
>
> Seems reasonable for -flto=thin (incorrect for -flto=full, though).
>
> > b) List of macros
> >  - Not needed in the importing module: change to nullptr
>
> We don't have macros by default, right?  You need some -gmacros flag I
> hope?  I remember objecting to the massive debug info bloat when they were
> introduced, but was convinced by an argument along the lines of
> "it's-never-going-to-be-on-by-default".
>
>
> Yes, it's off by default, but still good to think about whether we could
> do something similar for it than other things here.
>
>
>
> > c) List of global variables
> > - Only needed if we have imported the corresponding global variable
> definition. Since we currently don't import any global variable definitions
> (we should assert that there aren't any in the import list so this doesn't
> get stale), we can change this to nullptr.
>
> Why does this still exist?  Can't we delete this list in all modes, since
> GlobalValues reference their own DIGlobalVariables directly?
>
>
> Yeah, I forget why - I think there's a reason, though.
>
>
>
> > d) List of imported entities
> > - For now need to import those that have a DILocalScope (and drop others
> from the imported entities list on the imported CU). David had some ideas
> on restructuring the way these are referenced, but for now we
> conservatively need to keep those that may be from functions, any that are
> from functions not actually imported will not be emitted into the object
> file anyway.
>
> I'd rather just restructure these if we can.
>
>
> In the simplest form, it probably means adding a new list to DISubprograms
> (of all the imported entities in that function) - that way at least the
> function local imported entities would be dropped naturally when the
> subprogram was dropped (or not imported).
>
>
>
> > e) List of retained types
> > - Leave as-is but import DICompositeTypes as type declarations. This one
> David thought would need to be under an option because of lldb's assumption
> that the DWARF match the Clang AST (I think I am stating that right,
> correct me if not!).
>
> How many of these exist?  I expect this list to be almost always empty.
> If it's not, why not?  (Did I fail to push the commit to Clang that stopped
> adding all C++ types here??)  Can we somehow prune the list?
>
>
> Ah, you're right - I was probably operating under the old model that
> everything was in retained types. It's good to know/see that's not the
> case. I wonder what, if anything, we put in retained types these days.
> (there certainly used to be ways - like cases where a type is used without
> any variable: "((foo*)void_ptr)->x = 3;" - we could probably help in those
> cases by pushing a retained types list down onto subprograms too, so the
> uses were attached to subprograms - but that means potentially a lot of
> duplication (if a type is 'retained' in many functions))
>
>
> Actually, now that I think about it, it isn't just the composite types
> hanging off the retained types list that are an issue. When I first just
> tried mapping the retained types list to nullptr on the imported CU, it had
> a much smaller impact on the sizes. It is the composite types referenced
> elsewhere that get pulled in anyway that are still causing bloat (I think
> from the imported functions' DISubprograms via the type lists reached from
> there).
>
>
>
> I'm concerned that this is not a particularly fast thing to do.
>
> I'm also worried that this would cause major debug info quality
> degradation on Darwin (defer to Adrian on that...).  I'd rather leave it
> unoptimized unless we're sure we need to do something here.
>
>
> Turning definitions into declarations like that is incompatible with lldb
> - it'd need to be turned off (Teresa mentioned putting it under a flag for
> this reason) for any lldb platform.
>
>
> Also, why do you need this for -flto=thin?  Why not just nullptr?
>
>
> Legit point - same as enums, etc, they'll be emitted somewhere, no need to
> retain them everywhere, even as declarations.
>
>
> Ok, simple enough to also get this mapped in as nullptr on the imported CU.
>
>
>
> The proper definition->declaration demotion would have to happen by
> walking all the debug info looking for types... :/
>
>
> Right, that's what I did to get the big benefit - using the
> DebugInfoFinder.
>

Ah, OK - yep, that makes more sense. Thanks for clarifying/mentioning!

- Dave


>
> Teresa
>
>
>
>
>
> > Implementation:
>
> Generally, I'd rather see the direction of explicitly building new
> DICompileUnits and linking in the correct/necessary arguments, like the
> IRLinker explicitly builds new llvm::Functions.  Ideally, we'd only have to
> explicitly deal with DICompileUnit (one reason that I'm concerned about (e)
> above), which we can find through !llvm.dbg.cu.
>
> > a)-d) (enums/macros/global variables/imported entities):
> >
> > For a-d I have a simple solution in the IRLinker. At the very start of
> IRLinking in a module, if it is being linked in for function importing, I
> invoke a function that does the handling (i.e. from the IRLinker
> constructor).
> >
> > This routine handles a-c by pre-populating the ValueMap's MDMap entry
> for those Metadata* (e.g. the RawEnumTypes Metadata*) with nullptr, so
> metadata mapping automatically makes them nullptr on the imported
> DICompileUnit.
> >
> > For d, this routine walks the imported entities on the source CU and
> builds a SmallVector of those with local scopes. It then invokes
> replaceImportedEntities on the source CU to replace it with the new list
> (essentially dropping those with non-local scopes), and again the
> subsequent metadata mapping just works.
> >
> > e) (retained types):
> >
> > For e, changing the imported DICompisiteType to type declarations, I am
> having some issues. I prototyped this by doing it in the BitcodeReader. I
> passed in a new flag indicating that we are parsing a module for function
> importing. Then when parsing a METADATA_COMPOSITE_TYPE, if we are importing
> a module I change the calls to buildODRType and GET_OR_DISTINCT to instead
> create a type declaration:
> >  - pass in flags Flags|DINode::FlagFwdDecl
> >  - pass in nullptr for BaseType, Elements, VtableHolder, TemplateParams
> >  - pass in 0 for OffsetInBits
> >
> > Note that buildODRType will not mutate any existing DICompisiteType for
> that identifier found in the DITypeMap to a type declaration (FlagFwdDecl)
> though. This is good since we are sharing the DITypeMap with the original
> (destination) module, and we don't want to change any type definitions on
> the existing destination (importing) module to be type declarations when
> the same type is also in the source module we are about to import.
> >
> > When cleaning this up, I initially tried mutating the types on the
> source module at the start of IRLinking of imported modules. I did this by
> adding a new DICompositeType function to force convert an existing type
> definition to a type declaration (since the bitcode reader already would
> have created a type definition when parsing the imported source module, and
> as I noted above it wasn't allowed to be mutated to a declaration after
> that point). However, this is wrong since it ended up changing type
> definitions that were also used by the original destination/importing
> module to type declarations if they were also used in the source module
> (and therefore shared a DITypeMap entry). To get the forced mutation to a
> type decl to work here, I would somehow need to detect when a composite
> type on the source module was *not* also used by the original destination
> module. The only way I came up with off the top of my head was to also do a
> DebugInfoFinder::processModule on the dest module, and subtract the
> resulting types() from those I found when finding types on the source
> module I'm about to import, and only force-convert the remaining ones to
> type decls.
> >
> >
> > Interested in feedback on any of the above, and in particular on the
> cleanest way to create type declarations for imported types.
> >
> > Thanks!
> > Teresa
> >
> > --
> > Teresa Johnson |       Software Engineer |     tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161207/7b36709f/attachment.html>


More information about the llvm-dev mailing list