[llvm-dev] [ThinLTO] Reducing imported debug metadata

Wed Dec 7 10:14:49 PST 2016

A couple weeks ago I sat down with David Blaikie to figure out what we
could trim when importing debug metadata during function importing. I have
a prototype that reduces the resulting .o sizes significantly with ThinLTO
-g. However, I ran into some issues when cleaning up part of this in
preparation for a patch. Since Mehdi indicated that he and Adrian were
going to be looking at this as well shortly, I wanted to send out a summary
of what David Blaikie and I discussed, what I have right now, and the
issues I am hitting along with possible solutions.

There was a related older patch where I did some of the same things
(D16440: [ThinLTO] Link in only necessary DICompileUnit operands), but that
was made obsolete by a number of changes, such as the reversal of
GlobalVariable/DIGlobalVariable edges, and a number of debug metadata
changes done by Duncan (and Adrian I think?). So I started from scratch
this time.

Here's what I came up with after discussing with David what to import,
specifically how to handle each field when importing a DICompileUnit:

a) List of enum types
- Not needed in the importing module: change to nullptr

b) List of macros
- Not needed in the importing module: change to nullptr

c) List of global variables
- Only needed if we have imported the corresponding global variable
definition. Since we currently don't import any global variable definitions
(we should assert that there aren't any in the import list so this doesn't
get stale), we can change this to nullptr.

d) List of imported entities
- For now we need to import those that have a DILocalScope (and drop the
others from the imported entities list on the imported CU). David had some
ideas on restructuring the way these are referenced, but for now we
conservatively need to keep those that may be from functions. Any that are
from functions not actually imported will not be emitted into the object
file anyway.

e) List of retained types
- Leave as-is but import DICompositeTypes as type declarations. This one
David thought would need to be under an option because of lldb's assumption
that the DWARF match the Clang AST (I think I am stating that right,
correct me if not!).

Implementation:

a)-d) (enums/macros/global variables/imported entities):

For a-d I have a simple solution in the IRLinker. If a module is being
linked in for function importing, the IRLinker constructor invokes a
function that does the handling, so it runs at the very start of IRLinking
that module.

This routine handles a-c by pre-populating the ValueMap's MDMap entry for
those Metadata* (e.g. the RawEnumTypes Metadata*) with nullptr, so metadata
mapping automatically makes them nullptr on the imported DICompileUnit.
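
To make the a-c approach concrete, here is a minimal standalone sketch of
the idea. It does not use the LLVM API: Node, Mapper, dropOnImport and map
are all hypothetical stand-ins (Mapper plays the role of the MDMap plus the
metadata mapper). It only shows that seeding the "already mapped" table
with nullptr makes the later mapping drop those operands without any
special casing.

```cpp
#include <cassert>
#include <deque>
#include <initializer_list>
#include <unordered_map>

// Hypothetical stand-in for a metadata node; not llvm::Metadata.
struct Node {
  Node *EnumTypes = nullptr;     // models the CU's enum list operand
  Node *Macros = nullptr;        // models the CU's macro list operand
  Node *GlobalVars = nullptr;    // models the CU's global variable list
  Node *RetainedTypes = nullptr; // not seeded, so it is mapped normally
};

// Hypothetical mapper modeling the "already mapped" table (the MDMap role).
class Mapper {
  std::unordered_map<const Node *, Node *> Mapped;
  std::deque<Node> Storage; // owns the copies; deque keeps addresses stable

public:
  // Seed the table so that these operands later resolve to nullptr.
  void dropOnImport(const Node &CU) {
    for (const Node *Op : {CU.EnumTypes, CU.Macros, CU.GlobalVars})
      if (Op)
        Mapped[Op] = nullptr;
  }

  // Copy a node and its operands, honoring any pre-seeded entries.
  Node *map(const Node *N) {
    if (!N)
      return nullptr;
    auto It = Mapped.find(N);
    if (It != Mapped.end())
      return It->second; // seeded with nullptr, or copied earlier
    Storage.push_back(Node());
    Node *Copy = &Storage.back();
    Mapped[N] = Copy;
    Copy->EnumTypes = map(N->EnumTypes);
    Copy->Macros = map(N->Macros);
    Copy->GlobalVars = map(N->GlobalVars);
    Copy->RetainedTypes = map(N->RetainedTypes);
    return Copy;
  }
};
```

Calling dropOnImport on the source CU before map gives a copy whose enum,
macro and global variable operands are null, while the unseeded operand is
copied as usual.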

For d, this routine walks the imported entities on the source CU and builds
a SmallVector of those with local scopes. It then invokes
replaceImportedEntities on the source CU to replace its imported entities
list with the new one (essentially dropping those with non-local scopes),
and again the subsequent metadata mapping just works.
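
The filtering step for d can be sketched the same way, again without the
LLVM API. ScopeKind, ImportedEntity, hasLocalScope and keepLocalEntities
are hypothetical names for illustration. The sketch treats subprogram and
lexical block scopes as "local", which is my reading of what having a
DILocalScope means here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical scope kinds; only the last two count as local scopes here.
enum class ScopeKind { File, Namespace, Subprogram, LexicalBlock };

// Hypothetical stand-in for an imported entity and the kind of its scope.
struct ImportedEntity {
  std::string Name;
  ScopeKind Scope;
};

// True if the entity is scoped inside a function body.
bool hasLocalScope(const ImportedEntity &E) {
  return E.Scope == ScopeKind::Subprogram ||
         E.Scope == ScopeKind::LexicalBlock;
}

// Build the replacement list: keep only entities with a local scope.
std::vector<ImportedEntity>
keepLocalEntities(const std::vector<ImportedEntity> &All) {
  std::vector<ImportedEntity> Kept;
  for (const ImportedEntity &E : All)
    if (hasLocalScope(E))
      Kept.push_back(E);
  return Kept;
}
```

The returned vector is what would be handed to the CU as its new imported
entities list; order is preserved and nothing else is touched.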

e) (retained types):

For e, changing the imported DICompositeTypes to type declarations, I am
having some issues. I prototyped this in the BitcodeReader by passing in a
new flag indicating that we are parsing a module for function importing.
Then when parsing a METADATA_COMPOSITE_TYPE, if we are importing a module I
change the calls to buildODRType and GET_OR_DISTINCT to instead create a
type declaration:
 - pass in flags Flags|DINode::FlagFwdDecl
 - pass in nullptr for BaseType, Elements, VTableHolder, TemplateParams
 - pass in 0 for OffsetInBits
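
As a rough illustration of the three changes above, here is a standalone
sketch that applies them to a parsed record. CompositeTypeRecord and
toDeclaration are hypothetical, the FlagFwdDecl bit value is illustrative
rather than taken from LLVM, and leaving the remaining fields (name,
identifier, size) untouched is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative bit for the forward-declaration flag (stand-in value).
constexpr std::uint32_t FlagFwdDecl = 1u << 2;

// Hypothetical record modeling the fields read for a composite type.
struct CompositeTypeRecord {
  std::string Name;
  std::string Identifier;
  std::uint32_t Flags = 0;
  std::uint64_t SizeInBits = 0;
  std::uint64_t OffsetInBits = 0;
  const void *BaseType = nullptr;
  const void *Elements = nullptr;
  const void *VTableHolder = nullptr;
  const void *TemplateParams = nullptr;
};

// Return a declaration-only copy of a parsed definition record.
CompositeTypeRecord toDeclaration(CompositeTypeRecord R) {
  R.Flags |= FlagFwdDecl;     // Flags|FlagFwdDecl
  R.BaseType = nullptr;       // drop everything that describes the body
  R.Elements = nullptr;
  R.VTableHolder = nullptr;
  R.TemplateParams = nullptr;
  R.OffsetInBits = 0;
  return R;                   // other fields are left as parsed
}
```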

Note, though, that buildODRType will not mutate an existing DICompositeType
found in the DITypeMap for that identifier into a type declaration
(FlagFwdDecl). This is good since we are sharing the DITypeMap with the
original (destination) module, and we don't want to change any type
definitions on the existing destination (importing) module to be type
declarations when the same type is also in the source module we are about
to import.
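
The one-way behavior I am relying on can be modeled as follows. This is a
standalone sketch with hypothetical names (TypeEntry, TypeMap, getOrBuild),
not the real buildODRType. The part taken from the paragraph above is that
an existing definition is never turned into a declaration; upgrading an
existing declaration when a definition arrives is my understanding of the
other direction and should be read as an assumption.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical entry in a per-context map keyed by ODR identifier.
struct TypeEntry {
  std::string Identifier;
  bool IsFwdDecl;
};

// Hypothetical model of a shared type map with one-way mutation.
class TypeMap {
  std::unordered_map<std::string, TypeEntry> Map;

public:
  // Return the unique entry for Id, creating or upgrading it as needed.
  const TypeEntry &getOrBuild(const std::string &Id, bool IncomingIsFwdDecl) {
    auto It = Map.find(Id);
    if (It == Map.end())
      return Map.emplace(Id, TypeEntry{Id, IncomingIsFwdDecl}).first->second;
    // Only a declaration may be upgraded, and only by a definition.
    // An existing definition is never demoted to a declaration.
    if (It->second.IsFwdDecl && !IncomingIsFwdDecl)
      It->second.IsFwdDecl = false;
    return It->second;
  }
};
```

With this rule, parsing the source module with everything forced to a
declaration cannot disturb a definition the destination module already
registered under the same identifier.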

When cleaning this up, I initially tried mutating the types on the source
module at the start of IRLinking of imported modules. I did this by adding
a new DICompositeType function to force convert an existing type definition
to a type declaration (since the bitcode reader already would have created
a type definition when parsing the imported source module, and as I noted
above it wasn't allowed to be mutated to a declaration after that point).
However, this is wrong: when a type was used by both modules (and therefore
shared a DITypeMap entry), it also turned the definition used by the
original destination/importing module into a declaration. To get the forced
mutation to a type decl to work here, I would somehow need to detect when a
composite type on the source module is *not* also used by the original
destination module. The only way I have come up with so far is to also run
DebugInfoFinder::processModule on the dest module, subtract the resulting
types() from those found on the source module I'm about to import, and
force-convert only the remaining ones to type decls.
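
The subtraction itself is just a set difference. Here is a standalone
sketch in which types are represented by their identifier strings; the
function name typesSafeToDeclare is hypothetical, and in a real
implementation the two inputs would come from whatever each module's type
walk returns.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Types found in the source module but not in the destination module.
// Only these could be force-converted without affecting the destination.
std::set<std::string>
typesSafeToDeclare(const std::set<std::string> &SourceTypes,
                   const std::set<std::string> &DestTypes) {
  std::set<std::string> Result;
  std::set_difference(SourceTypes.begin(), SourceTypes.end(),
                      DestTypes.begin(), DestTypes.end(),
                      std::inserter(Result, Result.end()));
  return Result;
}
```

This costs an extra walk over the destination module's debug info for each
import, which is part of why I am not sure it is the cleanest option.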

Interested in feedback on any of the above, and in particular on the
cleanest way to create type declarations for imported types.

Thanks!
Teresa

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413