[llvm-dev] [ThinLTO] Reducing imported debug metadata
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Wed Dec 7 11:02:05 PST 2016
On Wed, Dec 7, 2016 at 10:36 AM Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
>
> > On 2016-Dec-07, at 10:14, Teresa Johnson <tejohnson at google.com> wrote:
> >
> > A couple weeks ago I sat down with David Blaikie to figure out what we
> could trim when importing debug metadata during function importing. I have
> a prototype that reduces the resulting .o sizes significantly with ThinLTO
> -g. However, I ran into some issues when cleaning up part of this in
> preparation for a patch. Since Mehdi indicated that he and Adrian were
> going to be looking at this as well shortly, I wanted to send out a summary
> of what David Blaikie and I discussed, what I have right now, and the
> issues I am hitting along with possible solutions.
> >
> > There was a related older patch where I did some of the same things
> (D16440: [ThinLTO] Link in only necessary DICompileUnit operands), but that
> was made obsolete by a number of changes, such as the reversal of
> GlobalVariable/DIGlobalVariable edges, and a number of debug metadata
> changes done by Duncan (and Adrian I think?). So I started from scratch
> this time.
> >
> > Here's what I came up with after discussing what to import with David.
> Specifically, how to handle fields when importing a DICompileUnit:
>
> Do you have a profile of which (if any) of these is causing the debug info
> bloat?
>
> I worry/suspect that much of the bloat could be in the DILocation chains
> (line-table), which are generally non-mergeable.
>
> > a) List of enum types
> > - Not needed in the importing module: change to nullptr
>
> Seems reasonable for -flto=thin (incorrect for -flto=full, though).
>
> > b) List of macros
> > - Not needed in the importing module: change to nullptr
>
> We don't have macros by default, right? You need some -gmacros flag I
> hope? I remember objecting to the massive debug info bloat when they were
> introduced, but was convinced by an argument along the lines of
> "it's-never-going-to-be-on-by-default".
>
Yes, it's off by default, but still good to think about whether we could do
something similar for it than other things here.
>
> > c) List of global variables
> > - Only needed if we have imported the corresponding global variable
> definition. Since we currently don't import any global variable definitions
> (we should assert that there aren't any in the import list so this doesn't
> get stale), we can change this to nullptr.
>
> Why does this still exist? Can't we delete this list in all modes, since
> GlobalValues reference their own DIGlobalVariables directly?
>
Yeah, I forget why - I think there's a reason, though.
>
> > d) List of imported entities
> > - For now need to import those that have a DILocalScope (and drop others
> from the imported entities list on the imported CU). David had some ideas
> on restructuring the way these are referenced, but for now we
> conservatively need to keep those that may be from functions, any that are
> from functions not actually imported will not be emitted into the object
> file anyway.
>
> I'd rather just restructure these if we can.
>
In the simplest form, it probably means adding a new list to DISubprograms
(of all the imported entities in that function) - that way at least the
function local imported entities would be dropped naturally when the
subprogram was dropped (or not imported).
>
> > e) List of retained types
> > - Leave as-is but import DICompositeTypes as type declarations. This one
> David thought would need to be under an option because of lldb's assumption
> that the DWARF match the Clang AST (I think I am stating that right,
> correct me if not!).
>
> How many of these exist? I expect this list to be almost always empty.
> If it's not, why not? (Did I fail to push the commit to Clang that stopped
> adding all C++ types here??) Can we somehow prune the list?
>
Ah, you're right - I was probably operating under the old model that
everything was in retained types. It's good to know/see that's not the
case. I wonder what, if anything, we put in retained types these days.
(there certainly used to be ways - like cases where a type is used without
any variable: "((foo*)void_ptr)->x = 3;" - we could probably help in those
cases by pushing a retained types list down onto subprograms too, so the
uses were attached to subprograms - but that means potentially a lot of
duplication (if a type is 'retained' in many functions))
> I'm concerned that this is not a particularly fast thing to do.
>
> I'm also worried that this would cause major debug info quality
> degradation on Darwin (defer to Adrian on that...). I'd rather leave it
> unoptimized unless we're sure we need to do something here.
>
Turning definitions into declarations like that is incompatible with lldb -
it'd need to be turned off (Teresa mentioned putting it under a flag for
this reason) for any lldb platform.
> Also, why do you need this for -flto=thin? Why not just nullptr?
>
Legit point - same as enums, etc, they'll be emitted somewhere, no need to
retain them everywhere, even as declarations.
The proper definition->declaration demotion would have to happen by walking
all the debug info looking for types... :/
>
> > Implementation:
>
> Generally, I'd rather see the direction of explicitly building new
> DICompileUnits and linking in the correct/necessary arguments, like the
> IRLinker explicitly builds new llvm::Functions. Ideally, we'd only have to
> explicitly deal with DICompileUnit (one reason that I'm concerned about (e)
> above), which we can find through !llvm.dbg.cu.
>
> > a)-d) (enums/macros/global variables/imported entities):
> >
> > For a-d I have a simple solution in the IRLinker. At the very start of
> IRLinking in a module, if it is being linked in for function importing, I
> invoke a function that does the handling (i.e. from the IRLinker
> constructor).
> >
> > This routine handles a-c by pre-populating the ValueMap's MDMap entry
> for those Metadata* (e.g. the RawEnumTypes Metadata*) with nullptr, so
> metadata mapping automatically makes them nullptr on the imported
> DICompileUnit.
> >
> > For d, this routine walks the imported entities on the source CU and
> builds a SmallVector of those with local scopes. It then invokes
> replaceImportedEntities on the source CU to replace it with the new list
> (essentially dropping those with non-local scopes), and again the
> subsequent metadata mapping just works.
> >
> > e) (retained types):
> >
> > For e, changing the imported DICompisiteType to type declarations, I am
> having some issues. I prototyped this by doing it in the BitcodeReader. I
> passed in a new flag indicating that we are parsing a module for function
> importing. Then when parsing a METADATA_COMPOSITE_TYPE, if we are importing
> a module I change the calls to buildODRType and GET_OR_DISTINCT to instead
> create a type declaration:
> > - pass in flags Flags|DINode::FlagFwdDecl
> > - pass in nullptr for BaseType, Elements, VtableHolder, TemplateParams
> > - pass in 0 for OffsetInBits
> >
> > Note that buildODRType will not mutate any existing DICompisiteType for
> that identifier found in the DITypeMap to a type declaration (FlagFwdDecl)
> though. This is good since we are sharing the DITypeMap with the original
> (destination) module, and we don't want to change any type definitions on
> the existing destination (importing) module to be type declarations when
> the same type is also in the source module we are about to import.
> >
> > When cleaning this up, I initially tried mutating the types on the
> source module at the start of IRLinking of imported modules. I did this by
> adding a new DICompositeType function to force convert an existing type
> definition to a type declaration (since the bitcode reader already would
> have created a type definition when parsing the imported source module, and
> as I noted above it wasn't allowed to be mutated to a declaration after
> that point). However, this is wrong since it ended up changing type
> definitions that were also used by the original destination/importing
> module to type declarations if they were also used in the source module
> (and therefore shared a DITypeMap entry). To get the forced mutation to a
> type decl to work here, I would somehow need to detect when a composite
> type on the source module was *not* also used by the original destination
> module. The only way I came up with off the top of my head was to also do a
> DebugInfoFinder::processModule on the dest module, and subtract the
> resulting types() from those I found when finding types on the source
> module I'm about to import, and only force-convert the remaining ones to
> type decls.
> >
> >
> > Interested in feedback on any of the above, and in particular on the
> cleanest way to create type declarations for imported types.
> >
> > Thanks!
> > Teresa
> >
> > --
> > Teresa Johnson | Software Engineer | tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161207/15aea7fd/attachment.html>
More information about the llvm-dev
mailing list