[llvm-dev] [ThinLTO] Reducing imported debug metadata

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 7 11:08:18 PST 2016


On Wed, Dec 7, 2016 at 11:02 AM Teresa Johnson <tejohnson at google.com> wrote:

> +pcc to answer question about global variable list on CU
>
> Thanks for the comments, responses below.
>
> Teresa
>
> On Wed, Dec 7, 2016 at 10:36 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
>
>
> > On 2016-Dec-07, at 10:14, Teresa Johnson <tejohnson at google.com> wrote:
> >
> > A couple weeks ago I sat down with David Blaikie to figure out what we
> could trim when importing debug metadata during function importing. I have
> a prototype that reduces the resulting .o sizes significantly with ThinLTO
> -g. However, I ran into some issues when cleaning up part of this in
> preparation for a patch. Since Mehdi indicated that he and Adrian were
> going to be looking at this as well shortly, I wanted to send out a summary
> of what David Blaikie and I discussed, what I have right now, and the
> issues I am hitting along with possible solutions.
> >
> > There was a related older patch where I did some of the same things
> (D16440: [ThinLTO] Link in only necessary DICompileUnit operands), but that
> was made obsolete by a number of changes, such as the reversal of
> GlobalVariable/DIGlobalVariable edges, and a number of debug metadata
> changes done by Duncan (and Adrian I think?). So I started from scratch
> this time.
> >
> > Here's what I came up with after discussing what to import with David.
> Specifically, how to handle fields when importing a DICompileUnit:
>
> Do you have a profile of which (if any) of these is causing the debug info
> bloat?
>
> I worry/suspect that much of the bloat could be in the DILocation chains
> (line-table), which are generally non-mergeable.
>
>
> Here is a breakdown of the debug section sizes aggregated across all .o
> files for a build of Chromium with -g2. These are both with ThinLTO, but
> the first column is with all function importing disabled:
>
>   No importing Normal importing (aka "Head" in the below graph)
> .debug_abbrev 32013702 60570915
> .debug_info 1714741154 8724603593 <(872)%20460-3593>
> .debug_line 108033535 440090737
> .debug_loc 178426363 264907421
> .debug_macinfo 17820     17820
> .debug_pubnames 251081673 661630495
> .debug_pubtypes 284086653 1867377375
> .debug_ranges 65470016 106560864
> .debug_str 3304122338 <(330)%20412-2338> 19013084756 <(901)%20308-4756>
>
> [image: image.png]
>
> I'm not sure if this answers your question above about DILocation chains
> though - would that be directly translated into .debug_loc size?
>
>
> > a) List of enum types
> >  - Not needed in the importing module: change to nullptr
>
> Seems reasonable for -flto=thin (incorrect for -flto=full, though).
>
>
> Right, all this is for ThinLTO - I haven't thought about Full LTO as I am
> concerned about ThinLTO importing-specific bloat.
>
>
>
> > b) List of macros
> >  - Not needed in the importing module: change to nullptr
>
> We don't have macros by default, right?  You need some -gmacros flag I
> hope?  I remember objecting to the massive debug info bloat when they were
> introduced, but was convinced by an argument along the lines of
> "it's-never-going-to-be-on-by-default".
>
>
> No I haven't seen a case where this is non-null in my ThinLTO testing, but
> it seemed safe to do for completeness.
>
>
> > c) List of global variables
> > - Only needed if we have imported the corresponding global variable
> definition. Since we currently don't import any global variable definitions
> (we should assert that there aren't any in the import list so this doesn't
> get stale), we can change this to nullptr.
>
> Why does this still exist?  Can't we delete this list in all modes, since
> GlobalValues reference their own DIGlobalVariables directly?
>
>
> Peter, can you answer this question? I looked back at the old thread and
> ISTR that there were some cases where the reference from the CU was needed
> but don't remember the details.
>
>
>
> > d) List of imported entities
> > - For now need to import those that have a DILocalScope (and drop others
> from the imported entities list on the imported CU). David had some ideas
> on restructuring the way these are referenced, but for now we
> conservatively need to keep those that may be from functions, any that are
> from functions not actually imported will not be emitted into the object
> file anyway.
>
> I'd rather just restructure these if we can.
>
>
> That sounds like a good long-term solution. For for the short term what I
> did was incredibly simple. If it has a significant impact (haven't measured
> this one independently), I think the short-term fix would be good in the
> meantime. David can probably provide more details on what needs to be done
> to fix this longer term.
>
>
> > e) List of retained types
> > - Leave as-is but import DICompositeTypes as type declarations. This one
> David thought would need to be under an option because of lldb's assumption
> that the DWARF match the Clang AST (I think I am stating that right,
> correct me if not!).
>
> How many of these exist?  I expect this list to be almost always empty.
> If it's not, why not?  (Did I fail to push the commit to Clang that stopped
> adding all C++ types here??)  Can we somehow prune the list?
>
>
> It seems that there are quite a lot. Just doing this caused a huge
> reduction on sizes. E.g. on Chromium, importing DICompositeType as type
> decls alone dropped the aggregate .o size increase with -g2 over plain -O2
> from 5.2x (current ThinLTO) to 1.7x.
>
>
> I'm concerned that this is not a particularly fast thing to do.
>
>
> In what way?
>
>
>
> I'm also worried that this would cause major debug info quality
> degradation on Darwin (defer to Adrian on that...).  I'd rather leave it
> unoptimized unless we're sure we need to do something here.
>
>
> Maybe this is the issue David was referring to about potentially needing
> to put this under an option to disable for lldb. For us apparently it is
> not an issue.
>
>
> Also, why do you need this for -flto=thin?  Why not just nullptr?
>
>
> Meaning never import retained types? I think we need the type decls at
> least - David?
>

I believe Duncan's right here, and I was mistaken - we may be able to just
null out the retained types list.

I'd be curious what's in there that your change makes such a big difference
- now that Duncan's corrected my misunderstanding/outdated info about
what's in the retained types list.


>
>
>
> > Implementation:
>
> Generally, I'd rather see the direction of explicitly building new
> DICompileUnits and linking in the correct/necessary arguments, like the
> IRLinker explicitly builds new llvm::Functions.  Ideally, we'd only have to
> explicitly deal with DICompileUnit (one reason that I'm concerned about (e)
> above), which we can find through !llvm.dbg.cu.
>
> > a)-d) (enums/macros/global variables/imported entities):
> >
> > For a-d I have a simple solution in the IRLinker. At the very start of
> IRLinking in a module, if it is being linked in for function importing, I
> invoke a function that does the handling (i.e. from the IRLinker
> constructor).
> >
> > This routine handles a-c by pre-populating the ValueMap's MDMap entry
> for those Metadata* (e.g. the RawEnumTypes Metadata*) with nullptr, so
> metadata mapping automatically makes them nullptr on the imported
> DICompileUnit.
> >
> > For d, this routine walks the imported entities on the source CU and
> builds a SmallVector of those with local scopes. It then invokes
> replaceImportedEntities on the source CU to replace it with the new list
> (essentially dropping those with non-local scopes), and again the
> subsequent metadata mapping just works.
> >
> > e) (retained types):
> >
> > For e, changing the imported DICompisiteType to type declarations, I am
> having some issues. I prototyped this by doing it in the BitcodeReader. I
> passed in a new flag indicating that we are parsing a module for function
> importing. Then when parsing a METADATA_COMPOSITE_TYPE, if we are importing
> a module I change the calls to buildODRType and GET_OR_DISTINCT to instead
> create a type declaration:
> >  - pass in flags Flags|DINode::FlagFwdDecl
> >  - pass in nullptr for BaseType, Elements, VtableHolder, TemplateParams
> >  - pass in 0 for OffsetInBits
> >
> > Note that buildODRType will not mutate any existing DICompisiteType for
> that identifier found in the DITypeMap to a type declaration (FlagFwdDecl)
> though. This is good since we are sharing the DITypeMap with the original
> (destination) module, and we don't want to change any type definitions on
> the existing destination (importing) module to be type declarations when
> the same type is also in the source module we are about to import.
> >
> > When cleaning this up, I initially tried mutating the types on the
> source module at the start of IRLinking of imported modules. I did this by
> adding a new DICompositeType function to force convert an existing type
> definition to a type declaration (since the bitcode reader already would
> have created a type definition when parsing the imported source module, and
> as I noted above it wasn't allowed to be mutated to a declaration after
> that point). However, this is wrong since it ended up changing type
> definitions that were also used by the original destination/importing
> module to type declarations if they were also used in the source module
> (and therefore shared a DITypeMap entry). To get the forced mutation to a
> type decl to work here, I would somehow need to detect when a composite
> type on the source module was *not* also used by the original destination
> module. The only way I came up with off the top of my head was to also do a
> DebugInfoFinder::processModule on the dest module, and subtract the
> resulting types() from those I found when finding types on the source
> module I'm about to import, and only force-convert the remaining ones to
> type decls.
> >
> >
> > Interested in feedback on any of the above, and in particular on the
> cleanest way to create type declarations for imported types.
> >
> > Thanks!
> > Teresa
> >
> > --
> > Teresa Johnson |       Software Engineer |     tejohnson at google.com |
> 408-460-2413
>
>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161207/fcfb996e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 20356 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161207/fcfb996e/attachment.png>


More information about the llvm-dev mailing list