[llvm] r255909 - [ThinLTO] Metadata linking for imported functions

Duncan P. N. Exon Smith via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 10 13:23:47 PST 2016


> On 2016-Jan-08, at 11:21, Teresa Johnson <tejohnson at google.com> wrote:
> 
> On Tue, Jan 5, 2016 at 1:49 PM, Teresa Johnson <tejohnson at google.com> wrote:
>> On Tue, Jan 5, 2016 at 12:13 PM, Duncan P. N. Exon Smith
>> <dexonsmith at apple.com> wrote:
>>> 
>>>> On 2016-Jan-05, at 11:38, Teresa Johnson <tejohnson at google.com> wrote:
>>>> 
>>>> On Tue, Jan 5, 2016 at 9:07 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>>>> On Mon, Jan 4, 2016 at 1:19 PM, Duncan P. N. Exon Smith
>>>>> <dexonsmith at apple.com> wrote:
>>>>>> 
>>>>>>> On 2015-Dec-29, at 18:00, Teresa Johnson <tejohnson at google.com> wrote:
>>>>>>> 
>>>>>>> On Wed, Dec 23, 2015 at 8:34 AM, Duncan P. N. Exon Smith
>>>>>>> <dexonsmith at apple.com> wrote:
>>>>>>>> 
>>>>>>>>> On 2015-Dec-17, at 12:14, Teresa Johnson via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>>>>>>> 
>>>>>>>>> Author: tejohnson
>>>>>>>>> Date: Thu Dec 17 11:14:09 2015
>>>>>>>>> New Revision: 255909
>>>>>>>>> 
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=255909&view=rev
>>>>>>>>> Log:
>>>>>>>>> [ThinLTO] Metadata linking for imported functions
>>>>>>>>> 
>>>>>>>>> Summary:
>>>>>>>>> Second patch split out from http://reviews.llvm.org/D14752.
>>>>>>>>> 
>>>>>>>>> Maps metadata as a post-pass from each module when importing complete,
>>>>>>>>> suturing up final metadata to the temporary metadata left on the
>>>>>>>>> imported instructions.
>>>>>>>>> 
>>>>>>>>> This entails saving the mapping from bitcode value id to temporary
>>>>>>>>> metadata in the importing pass, and from bitcode value id to final
>>>>>>>>> metadata during the metadata linking postpass.
>>>>>>>>> 
>>>>>>>>> Depends on D14825.
>>>>>>>>> 
>>>>>>>>> Reviewers: dexonsmith, joker.eph
>>>>>>>>> 
>>>>>>>>> Subscribers: davidxl, llvm-commits, joker.eph
>>>>>>>>> 
>>>>>>>>> Differential Revision: http://reviews.llvm.org/D14838
>>>>>>>>> 
>>>>>>>>> Added:
>>>>>>>>> llvm/trunk/test/Linker/Inputs/thinlto_funcimport_debug.ll
>>>>>>>>> llvm/trunk/test/Linker/thinlto_funcimport_debug.ll
>>>>>>>>> llvm/trunk/test/Transforms/FunctionImport/Inputs/funcimport_debug.ll
>>>>>>>>> llvm/trunk/test/Transforms/FunctionImport/funcimport_debug.ll
>>>>>>>>> Modified:
>>>>>>>>> llvm/trunk/include/llvm/IR/GVMaterializer.h
>>>>>>>>> llvm/trunk/include/llvm/IR/Metadata.h
>>>>>>>>> llvm/trunk/include/llvm/IRReader/IRReader.h
>>>>>>>>> llvm/trunk/include/llvm/Linker/IRMover.h
>>>>>>>>> llvm/trunk/include/llvm/Linker/Linker.h
>>>>>>>>> llvm/trunk/include/llvm/Transforms/Utils/ValueMapper.h
>>>>>>>>> llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
>>>>>>>>> llvm/trunk/lib/IR/Metadata.cpp
>>>>>>>>> llvm/trunk/lib/IRReader/IRReader.cpp
>>>>>>>>> llvm/trunk/lib/Linker/IRMover.cpp
>>>>>>>>> llvm/trunk/lib/Linker/LinkModules.cpp
>>>>>>>>> llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp
>>>>>>>>> llvm/trunk/lib/Transforms/Utils/ValueMapper.cpp
>>>>>>>>> llvm/trunk/tools/llvm-link/llvm-link.cpp
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Modified: llvm/trunk/include/llvm/IR/Metadata.h
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Metadata.h?rev=255909&r1=255908&r2=255909&view=diff
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/include/llvm/IR/Metadata.h (original)
>>>>>>>>> +++ llvm/trunk/include/llvm/IR/Metadata.h Thu Dec 17 11:14:09 2015
>>>>>>>>> @@ -832,10 +832,11 @@ public:
>>>>>>>>> /// \brief Resolve cycles.
>>>>>>>>> ///
>>>>>>>>> /// Once all forward declarations have been resolved, force cycles to be
>>>>>>>>> -  /// resolved.
>>>>>>>>> +  /// resolved. If \p MDMaterialized is true, then any temporary metadata
>>>>>>>>> +  /// is ignored, otherwise it asserts when encountering temporary metadata.
>>>>>>>> 
>>>>>>>> Please expand a little on the semantic difference between `true` and
>>>>>>>> `false` here.  In the `false` case, you're blocking identical `MDNode`s
>>>>>>>> from getting uniqued together.
>>>>>>> 
>>>>>>> Looks like I got the comment backwards. We ignore temp metadata when
>>>>>>> MDMaterialized==false, otherwise we assert when seeing temp metadata
>>>>>>> here (the default behavior). I don't think there is any other semantic
>>>>>>> difference - the flag only affects temporary MD, which we shouldn't
>>>>>>> normally see (which is why the default is the old behavior of
>>>>>>> asserting on temp metadata).
>>>>>> 
>>>>>> It's subtle, but I think it's important.  Consider the following.
>>>>> 
>>>>> Thanks for walking me through this, I see the issue. Comments below.
>>>>> 
>>>>>> --
>>>>>> The LLVMContext has:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>> 
>>>>>> a.bc contains:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>> --
>>>>>> The full-loader would read this in, and merge !1 and !2 together.
>>>>>> --
>>>>>> Let's say the lazy-loader chooses *only* to read !1 from a.bc,
>>>>>> leaving !2 as a temporary.  Now LLVMContext has:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>>   !3 = unresolved !{!4} // !1 from a.bc (with RAUW support)
>>>>>>   !4 = temporary !{}    // !2 from a.bc
>>>>>> 
>>>>>> Lazy-loader calls !3->resolveCycles(), giving:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>>   !3 = !{!4}         // !1 from a.bc (no RAUW support)
>>>>>>   !4 = temporary !{} // !2 from a.bc
>>>>> 
>>>>> Note that the lazy metadata reading support delays reading of all
>>>>> module-level metadata, which it later reads in a single pass. So the
>>>>> above situation would only happen if !1 from a.bc is in a
>>>>> function-level metadata block, and !2 is in its module-level metadata.
>>> 
>>> Interesting.  I hadn't thought through this specific use case.  Can you
>>> give a reduced example?
>> 
>> I looked at my test case (test/Linker/thinlto_funcimport_debug.ll) and
>> I see a couple of different cases where we end up with temp metadata
>> after the function importing. One case involves llvm.dbg.value
>> parameters (the DILocalVariable and a DIExpression), which are defined
>> in module-level metadata. So in this case it is the top-level metadata
>> node that is a temporary. The second case are the scope value on
>> DILocation metadata attachments. The scope is defined in module-level
>> metadata, but the DILocation itself is apparently a function-level
>> metadata attachment. Here is an example that shows both cases:
>> 
>> After importing a function, before metadata linking:
>> 
>>         tail call void @llvm.dbg.value(metadata i32 %n, i64 0,
>> metadata !15, metadata !16), !dbg !17
>> ...
>> !15 = <temporary!> !{}
>> !16 = <temporary!> !{}
>> !17 = !DILocation(line: 1, column: 15, scope: !14)
>> 
>> After (module-level) metadata linking:
>> 
>>  tail call void @llvm.dbg.value(metadata i32 %n, i64 0, metadata !15,
>> metadata !23), !dbg !24
>> ...
>> !11 = distinct !DISubprogram(name: "func1", scope: !9, file: !9, line:
>> 1, type: !12, isLocal: false, isDefinition: true, scopeLine: 1, flags:
>> DIFlagPrototyped, isOptimized: true, variables: !14)
>> ...
>> !15 = !DILocalVariable(name: "n", arg: 1, scope: !11, file: !9, line:
>> 1, type: !7)
>> ...
>> !23 = !DIExpression()
>> !24 = !DILocation(line: 1, column: 15, scope: !11)
>> 
>>> 
>>> Could/should the function-level metadata be delayed too?
>> 
>> I don't think so, unless there is a big issue with mapping in that
>> part when the function is imported. It is more efficient to do along
>> with the function. And unlike module-level where we don't know what
>> all is needed until all functions are imported, we know the function's
>> metadata will be needed.
>> 
>>> 
>>> Could/should we move more function-level metadata to the module-level?
>>> (Probably not...)
>> 
>> Not sure I know the answer to this one.
>> 
>>> 
>>>>>> Eventually, we read in !4.  !3 hits a uniquing collision, but
>>>>>> doesn't have RAUW support (since `resolveCycles()` dropped it).
>>>>>> End result:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>>   !3 = distinct !{!2} // !1 from a.bc
>>>>>> --
>>>>>> If `resolveCycles()` hadn't been called, we'd end up with:
>>>>>>   !1 = !{!2}
>>>>>>   !2 = !{}
>>>>>> since !3 could resolve the uniquing collision using RAUW.
>>>>> 
>>>>> I wonder if it would be possible to delay calling resolveCycles until
>>>>> the module-level metadata is loaded? That might result in other
>>>>> issues/differences though?
>>>> 
>>>> I tried doing this - in MDNode::resolveCycles I invoke a new helper
>>>> that checks if the MDNode or any operand recursively nested under it
>>>> contains a temporary, and if so, resolveCycles returns without doing
>>>> anything:
>>>> 
>>>> -------------------
>>>> static bool containsTemporaryImpl(MDNode *N,
>>>>                                 SmallPtrSetImpl<MDNode *> &Visited) {
>>>> if (!Visited.insert(N).second)
>>>>   return false;
>>>> if (N->isTemporary())
>>>>   return true;
>>>> for (auto &Op : N->operands()) {
>>>>   auto *OpN = dyn_cast_or_null<MDNode>(Op);
>>>>   if (OpN && containsTemporaryImpl(OpN, Visited))
>>> 
>>> I think you wouldn't want to descend if this node `.isResolved()`.
>> 
>> Ok
>> 
>>> 
>>>>     return true;
>>>> }
>>>> return false;
>>>> }
>>>> 
>>>> static bool containsTemporary(MDNode *N) {
>>>> SmallPtrSet<MDNode *, 8> Visited;
>>>> return containsTemporaryImpl(N, Visited);
>>>> }
>>>> 
>>>> void MDNode::resolveCycles() {
>>>> if (isResolved())
>>>>   return;
>>>> 
>>>> if (containsTemporary(this))
>>>>   return;
>>>> ...
>>>> ----------------------
>>>> 
>>>> So the resolveCycles would later be done when the module-level
>>>> metadata is read and mapped.
>>> 
>>> I'd prefer to keep the assertion in `resolveCycles()` to catch problems
>>> where the caller *thinks* there are no temporaries left.  However, for
>>> this use case you could (in theory) add helper function that returns
>>> `bool` and guard the call to `resolveCycles()` with it.
>> 
>> That makes sense. Do in the caller to guard the call or do it in a
>> separate interface like resolveNonTemporaries.
>> 
>>> 
>>>> This works in my test case (which does invoke resolveCycles with a
>>>> temp MD node), but admittedly I don't have a lot of ThinLTO debug test
>>>> cases. Does this seem like a reasonable approach to test further?
>>>> Otherwise, I can just change to having the two interfaces as you
>>>> suggested.
>>> 
>>> I think the two interfaces might be cleaner, since in other use cases
>>> the assertion is important.
>> 
>> Ok, agreed.
>> 
>>> Also, even in this use case, it's important
>>> to drop the RAUW support to keep memory usage in check.
>> 
>> I was hoping with the above helper we could avoid dropping the RAUW
>> support, so that in cases like your example we would end up with less
>> metadata at the end. If we still want to drop RAUW here, then no point
>> in skipping the resolving (just do your original suggestion of using a
>> separate well-documented interface). Do you think it will be important
>> to keep the memory down this way between function importing and the
>> metadata linking (which immediately follows after function importing
>> is complete) though?
> 
> Ping on this question.

Best way to evaluate would be to run an -flto -g build of something
somewhat large (clang (or even llc) should be big enough) and look at
the memory graph.

> I tried comparing the current implementation to the above change to
> skip resolveCycles on anything that reaches a temporary metadata
> (which as noted above affects the metadata mapping during function
> importing but is ultimately resolved during the bulk metadata linking
> that occurs after all functions are imported when the temp metadata is
> replaced).
> 
> I dumped out and compared the dest module just before and after the
> metadata postpass linking. I then compared that and the compile
> times/memory usage between the current implementation and the above
> change for a few cpu2006 benchmarks (mcf, bzip2, perlbench and astar).
> They all built fine with both versions of the code, and there was no
> noticeable consistent change in compilation time or memory. There was
> also no difference in the metadata between the two versions. I had
> expected that delaying the resolving of nodes might have some effects
> on the metadata (including avoiding the situation you described
> earlier that we can end up in with the current approach, by delaying
> dropping RAUW support).
> 
> So I am not sure what to conclude. Does the current code's semantic
> difference not really matter at all in practice? I.e. the potential
> situation you described with preventing RAUW of some metadata is
> really rare (although possible so agree that we want to change the
> interface to document the possibility). In that case delaying the
> resolving of cycles doesn't really matter/help? Theoretically it seems
> like it would help in that situation, but I haven't (yet) been able to
> provoke it.
> 
> Perhaps I should just go with your original suggestion of adding the
> new interface for the current behavior and documenting it, and not
> worrying about it unless we find issues. WDYT?

This seems fine to me.

> 
> Thanks,
> Teresa
>> 
>>> 
>>>> 
>>>> Thanks,
>>>> Tereas
>>>> 
>>>>> 
>>>>>> --
>>>>>> 
>>>>>> The new operation completely violates the spirit of
>>>>>> `resolveCycles()`, which (before your patch) was saying:
>>>>>>   "I promise there are no more temporaries, so remaining
>>>>>>   unresolved nodes must be part of a cycle."
>>>>>> The `assert()` confirms that everything is in a cycle.
>>>>>> 
>>>>>> With your new flag set, this has changed to:
>>>>>>   "I don't care about uniquing the remaining nodes, so drop
>>>>>>    RAUW support for them."
>>>>>> 
>>>>>> Maybe something like this would make sense:
>>>>>> --
>>>>>> void resolveCycles()
>>>>>> { resolveRecursivelyImpl(/* AllowTemps */ false); }
>>>>>> void resolveNonTemporaries()
>>>>>> { resolveRecursivelyImpl(/* AllowTemps */ true); }
>>>>>> 
>>>>>> private:
>>>>>> // Old resolveCycles() implementation.
>>>>>> void resolveRecursivelyImpl(bool AllowTemps);
>>>>>> --
>>>>>> (Do you have a better name for `resolveNonTemporaries()`?)
>>>>> 
>>>>> This sounds like a good idea. Don't have a better name suggestion,
>>>>> will use that.
>>>>> 
>>>>>> 
>>>>>> With the API split it might be easier to document the semantic
>>>>>> differences (just by describing the use case for each).
>>>>>> 
>>>>>> For `resolveCycles()`, please make clear (if it isn't already?)
>>>>>> that we're promising there aren't any temporaries, and thus
>>>>>> unresolved nodes are part of cycles and no longer need RAUW
>>>>>> support.
>>>>>> 
>>>>>> For `resolveNonTemporaries()`, please make clear that this is a
>>>>>> destructive operation that blocks uniquing of metadata nodes.
>>>>>> 
>>>>> 
>>>>> Will do.
>>>>> 
>>>>> Thanks,
>>>>> Teresa
>>>>> 
>>>>> --
>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>>> 
>> 
>> 
>> 
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
> 
> 
> 
> -- 
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413



More information about the llvm-commits mailing list