[PATCH] D16440: [ThinLTO] Link in only necessary DICompileUnit operands

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 22 14:35:23 PDT 2016


On Mon, Mar 21, 2016 at 5:46 PM, Teresa Johnson <tejohnson at google.com>
wrote:

>
>
> On Tue, Mar 15, 2016 at 2:12 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Tue, Mar 15, 2016 at 1:48 PM, Teresa Johnson via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Thu, Mar 10, 2016 at 3:16 PM, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> dblaikie added inline comments.
>>>>
>>>
>>> Thanks for the comments! I had hoped to address these all before I left
>>> town, but ran out of time. A few responses below, will address the rest
>>> when I get back!
>>>
>>
>> Sure thing. No rush - my review approach is fairly "fire and forget until
>> it comes around again".
>>
>>
>>>
>>>
>>>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1386
>>>> @@ +1385,3 @@
>>>> +  // that is needed, and we should only invoke this for needed
>>>> composite types.
>>>> +  assert(RMI != RetainMap.end());
>>>> +  // If we already created a new composite type declaration, use it.
>>>> ----------------
>>>> I'm not sure this is true. Here's a type you might have to import that
>>>> is not in the retained types list:
>>>>
>>>>   namespace {
>>>>   struct foo { };
>>>>   void f(foo) { }
>>>>   }
>>>>   struct bar { };
>>>>   void f(bar) { }
>>>>   void f() {
>>>>     f(foo());
>>>>     f(bar());
>>>>   }
>>>>
>>>> The 'foo' type is not in the retained types list, but it may need to be
>>>> imported if 'f(foo)' is imported, yes?
>>>>
>>>
>>> While foo isn't in the retained types list, it is imported as it is
>>> reached via the DISubprogram for f(foo) and we do pull it in (confirmed
>>> with the above example and this patch). Perhaps the comment could be
>>> clarified to refer to needed retained types, not composite types, since
>>> this is only dealing with types reached via the retained types list.
>>>
>>
>> Hmm, following the algorithm is a bit difficult for me. The code appears
>> more complex than I would've expected.
>>
>> Perhaps I can explain the algorithm I would expect & you can point out
>> where I've oversimplified/got confused.
>>
>>
>> Basically what I'd expect is a loop over imported functions (OK, I see
>> something like that - a loop over subprograms, skipping any "unneeded"
>> subprograms around line 1474*).
>>
>
>
>> * What's the longer case there that says something about "for non
>> post-pass metadata linking" (1445-1466)?
>>
>
> Moving this question up here since the answer will hopefully help explain
> what is going on. For ThinLTO function importing there are two ways of
> handling metadata that are supported, described below. There are some
> tradeoffs based on how the importing behaves, and currently the former is
> used by llvm-link and the latter used by the function importing pass, but I
> need to do more experiments with large source files to see what is the best
> approach for some of our large apps, for example.
>
> The post-pass case was implemented first since I anticipated function
> importing to be iterative (import a single function at a time from a
> module, perhaps interleaved with functions imported from other modules). In
> that case the metadata is not materialized when we materialize and link in
> each imported source function. After we are done with function importing,
> we go back and materialize a single time the (module-level) metadata for
> each source module we imported from, and at that point link in all needed
> metadata (which requires some bookkeeping and suturing of the metadata
> referenced by the already-imported functions).
>
> When we are doing a single batch importing of functions from each module,
> we instead materialize the module-level metadata before we link in the
> imported functions. This is how full LTO behaves as well. In this case, as
> with full LTO, when the function bodies are linked in any metadata reached
> from those functions (e.g. the DISubprogram and descendants) is naturally
> linked in (see the MapMetadata call in IRLinker::linkFunctionBody).
> However, we still need to link in any types reached from the retained types
> list and referenced via name by any of the imported subprograms. That
> happens when we link in the CU named metadata.
>
>
>> For each imported subprogram, I'd imagine we would clone the subprogram,
>> and any types (as declarations) that we need.
>>
>
> As noted above, for the non-post-pass case (as with full LTO), this
> naturally done for direct referenced types when the function bodies are
> linked in. For post-pass, we need to do this here via
> the findReachedCompositeTypes call for each needed DISubprogram.
>

As I was going back to comment this more clearly I realized that for
post-pass metadata linking we have also already mapped in any needed
DISubprogram and any other types reached via direct reference. This is
because we invoke MapMetadata on the CU's subprogram list just above.
Confirmed that in the post-pass case we do have all such reached metadata
already in the ValueMap just as we do for the non-postpass case. I was able
to simplify the code to always use what was previously guarded by if
(!IsMetadataLinkingPostpass), and removed findReachedCompositeTypes().

I have addressed the other comments via improved names and/or comments, and
removed the offset and base type from the type declarations being created
(added a note about size and alignment).

Will upload the new patch momentarily.

Thanks,
Teresa


>   If an imported type requires another type, I imagine the act of
>> importing the first type would import the second
>>
>
> Yes when direct referenced.
>
>
>>   If any imported type is referenced by a type ref/string name, I expect
>> to resolve that and import it
>>
>
> That is what we do via the RetainWorkList.
>
>
>> But it seems like that's not quite the way the algorithm works (notably -
>> importing string and direct reference types aren't handled uniformly by the
>> type importing step - they're deferred into a list then handled separately,
>> but not quite understanding why?)
>>
>
> So that this is handled consistently regardless of whether we are doing
> post-pass or non-post-pass metadata linking.
>
> Hopefully that clarifies how this works, but let me know if not!
>
>
>>
>>
>>
>>>
>>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1398
>>>> @@ +1397,3 @@
>>>> +      Composite->getFile(), Composite->getLine(), NewScope,
>>>> NewBaseType,
>>>> +      Composite->getSizeInBits(), Composite->getAlignInBits(),
>>>> +      Composite->getOffsetInBits(), Composite->getFlags() |
>>>> DINode::FlagFwdDecl,
>>>> ----------------
>>>> Pretty sure you don't need the base type, size, alignment, and offset
>>>> in a declaration - check equivalent declarations that are generated by
>>>> clang? (check that the declarations this code creates look like natural
>>>> declarations in the original source language, etc)
>>>>
>>>
>>> They do seem to have size and alignment, but not baseType or offset.
>>> Will fix. Here's an example:
>>>
>>>  !1215 = !DICompositeType(tag: DW_TAG_class_type, name: "DOMText",
>>> scope: !524, file: !1216, line: 90, size: 64, align: 64, flags:
>>> DIFlagFwdDecl, identifier: "_ZTSN11xercesc_2_57DOMTextE")
>>>
>>
>> Oh, bother. Hrm. I wonder how those got there. I don't think they should
>> be. If we only have a declaration for a type there's no way we know the
>> correct size or alignment in general.
>>
>
> I can suppress these when I create the decl, but that seems inconsistent
> with what is currently emitted. WDYT?
>
>>
>>
>>>
>>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1411
>>>> @@ +1410,3 @@
>>>> +Metadata *
>>>> +IRLinker::importType(Metadata *Ty,
>>>> +                     DenseMap<DICompositeType *, DICompositeType *>
>>>> &RetainMap,
>>>> ----------------
>>>> This name seems a bit misleading. If I'm reading the code correctly,
>>>> this is used for any type's scope context - which might not be a type at
>>>> all. (it could be a namespace, for example) - and I assume the non-type
>>>> cases come out in the "map the type metadata normally" part?
>>>>
>>>
>>> Correct, it will also handle DINamespace which will be mapped normally
>>> via MapMetadata. I will rename this to something better.
>>>
>>
>>>
>>>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1427
>>>> @@ +1426,3 @@
>>>> +
>>>> +void IRLinker::linkImportedCompileUnit() {
>>>> +  NamedMDNode *SrcCompileUnits = SrcM.getNamedMetadata("llvm.dbg.cu");
>>>> ----------------
>>>> Name seems a bit off - this links in potentially multiple compile
>>>> units, not just a single one.
>>>>
>>>
>>> Right, will make the name plural.
>>>
>>> This is as far as I got. Will address the rest when I am back in town
>>> Monday!
>>>
>>>
>>>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1444
>>>> @@ +1443,3 @@
>>>> +    std::vector<DICompositeType *> RetainWorklist;
>>>> +    if (!IsMetadataLinkingPostpass) {
>>>> +      for (DIType *Ty : CU->getRetainedTypes()) {
>>>> ----------------
>>>> What's postpass metadata linking? (what's the alternative?)
>>>>
>>>
> Hopefully answered this above.
>
>
>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1445
>>>> @@ +1444,3 @@
>>>> +    if (!IsMetadataLinkingPostpass) {
>>>> +      for (DIType *Ty : CU->getRetainedTypes()) {
>>>> +        // For non-postpass metadata linking, any metadata referenced
>>>> ----------------
>>>> I'm not really following why we need to iterate the retained types list
>>>> - wouldn't we import everything starting from the subprograms we needed to
>>>> import, not the retained types list?
>>>>
>>>
> Hopefully answered above too.
>
>
>>
>>>> ================
>>>> Comment at: lib/Linker/IRMover.cpp:1534
>>>> @@ +1533,3 @@
>>>> +    stripNullSubprograms(NewCU);
>>>> +    DestCompileUnits->addOperand(NewCU);
>>>> +  }
>>>> ----------------
>>>> What if nothing was imported from this CU?
>>>>
>>>
> We only invoke the IRMover during importing if we are importing something
> from that module.
>
>
>>>> It seems to me we'd want to start at the imported functions, follow
>>>> their DISubprograms, and import that way, rather than walking all the CUs
>>>> and subprograms - maybe?
>>>>
>>>
> Hopefully answered above too.
>
>
>>
>>>>
>>>> http://reviews.llvm.org/D16440
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>> 408-460-2413
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413
>



-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160322/0d4ce726/attachment.html>


More information about the llvm-commits mailing list