[llvm] r255909 - [ThinLTO] Metadata linking for imported functions

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 5 11:38:38 PST 2016


On Tue, Jan 5, 2016 at 9:07 AM, Teresa Johnson <tejohnson at google.com> wrote:
> On Mon, Jan 4, 2016 at 1:19 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>>
>>> On 2015-Dec-29, at 18:00, Teresa Johnson <tejohnson at google.com> wrote:
>>>
>>> On Wed, Dec 23, 2015 at 8:34 AM, Duncan P. N. Exon Smith
>>> <dexonsmith at apple.com> wrote:
>>>>
>>>>> On 2015-Dec-17, at 12:14, Teresa Johnson via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>>>
>>>>> Author: tejohnson
>>>>> Date: Thu Dec 17 11:14:09 2015
>>>>> New Revision: 255909
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=255909&view=rev
>>>>> Log:
>>>>> [ThinLTO] Metadata linking for imported functions
>>>>>
>>>>> Summary:
>>>>> Second patch split out from http://reviews.llvm.org/D14752.
>>>>>
>>>>> Maps metadata as a post-pass from each module when importing complete,
>>>>> suturing up final metadata to the temporary metadata left on the
>>>>> imported instructions.
>>>>>
>>>>> This entails saving the mapping from bitcode value id to temporary
>>>>> metadata in the importing pass, and from bitcode value id to final
>>>>> metadata during the metadata linking postpass.
>>>>>
>>>>> Depends on D14825.
>>>>>
>>>>> Reviewers: dexonsmith, joker.eph
>>>>>
>>>>> Subscribers: davidxl, llvm-commits, joker.eph
>>>>>
>>>>> Differential Revision: http://reviews.llvm.org/D14838
>>>>>
>>>>> Added:
>>>>>   llvm/trunk/test/Linker/Inputs/thinlto_funcimport_debug.ll
>>>>>   llvm/trunk/test/Linker/thinlto_funcimport_debug.ll
>>>>>   llvm/trunk/test/Transforms/FunctionImport/Inputs/funcimport_debug.ll
>>>>>   llvm/trunk/test/Transforms/FunctionImport/funcimport_debug.ll
>>>>> Modified:
>>>>>   llvm/trunk/include/llvm/IR/GVMaterializer.h
>>>>>   llvm/trunk/include/llvm/IR/Metadata.h
>>>>>   llvm/trunk/include/llvm/IRReader/IRReader.h
>>>>>   llvm/trunk/include/llvm/Linker/IRMover.h
>>>>>   llvm/trunk/include/llvm/Linker/Linker.h
>>>>>   llvm/trunk/include/llvm/Transforms/Utils/ValueMapper.h
>>>>>   llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
>>>>>   llvm/trunk/lib/IR/Metadata.cpp
>>>>>   llvm/trunk/lib/IRReader/IRReader.cpp
>>>>>   llvm/trunk/lib/Linker/IRMover.cpp
>>>>>   llvm/trunk/lib/Linker/LinkModules.cpp
>>>>>   llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp
>>>>>   llvm/trunk/lib/Transforms/Utils/ValueMapper.cpp
>>>>>   llvm/trunk/tools/llvm-link/llvm-link.cpp
>>>>>
>>>>>
>>>>> Modified: llvm/trunk/include/llvm/IR/Metadata.h
>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Metadata.h?rev=255909&r1=255908&r2=255909&view=diff
>>>>> ==============================================================================
>>>>> --- llvm/trunk/include/llvm/IR/Metadata.h (original)
>>>>> +++ llvm/trunk/include/llvm/IR/Metadata.h Thu Dec 17 11:14:09 2015
>>>>> @@ -832,10 +832,11 @@ public:
>>>>>  /// \brief Resolve cycles.
>>>>>  ///
>>>>>  /// Once all forward declarations have been resolved, force cycles to be
>>>>> -  /// resolved.
>>>>> +  /// resolved. If \p MDMaterialized is true, then any temporary metadata
>>>>> +  /// is ignored, otherwise it asserts when encountering temporary metadata.
>>>>
>>>> Please expand a little on the semantic difference between `true` and
>>>> `false` here.  In the `false` case, you're blocking identical `MDNode`s
>>>> from getting uniqued together.
>>>
>>> Looks like I got the comment backwards. We ignore temp metadata when
>>> MDMaterialized==false, otherwise we assert when seeing temp metadata
>>> here (the default behavior). I don't think there is any other semantic
>>> difference - the flag only affects temporary MD, which we shouldn't
>>> normally see (which is why the default is the old behavior of
>>> asserting on temp metadata).
>>
>> It's subtle, but I think it's important.  Consider the following.
>
> Thanks for walking me through this, I see the issue. Comments below.
>
>> --
>> The LLVMContext has:
>>     !1 = !{!2}
>>     !2 = !{}
>>
>> a.bc contains:
>>     !1 = !{!2}
>>     !2 = !{}
>> --
>> The full-loader would read this in, and merge !1 and !2 together.
>> --
>> Let's say the lazy-loader chooses *only* to read !1 from a.bc,
>> leaving !2 as a temporary.  Now LLVMContext has:
>>     !1 = !{!2}
>>     !2 = !{}
>>     !3 = unresolved !{!4} // !1 from a.bc (with RAUW support)
>>     !4 = temporary !{}    // !2 from a.bc
>>
>> Lazy-loader calls !3->resolveCycles(), giving:
>>     !1 = !{!2}
>>     !2 = !{}
>>     !3 = !{!4}         // !1 from a.bc (no RAUW support)
>>     !4 = temporary !{} // !2 from a.bc
>
> Note that the lazy metadata reading support delays reading of all
> module-level metadata, which it later reads in a single pass. So the
> above situation would only happen if !1 from a.bc is in a
> function-level metadata block, and !2 is in its module-level metadata.
>
>>
>> Eventually, we read in !4.  !3 hits a uniquing collision, but
>> doesn't have RAUW support (since `resolveCycles()` dropped it).
>> End result:
>>     !1 = !{!2}
>>     !2 = !{}
>>     !3 = distinct !{!2} // !1 from a.bc
>> --
>> If `resolveCycles()` hadn't been called, we'd end up with:
>>     !1 = !{!2}
>>     !2 = !{}
>> since !3 could resolve the uniquing collision using RAUW.
>
> I wonder if it would be possible to delay calling resolveCycles until
> the module-level metadata is loaded? That might result in other
> issues/differences though?

I tried doing this - in MDNode::resolveCycles I invoke a new helper
that checks if the MDNode or any operand recursively nested under it
contains a temporary, and if so, resolveCycles returns without doing
anything:

-------------------
static bool containsTemporaryImpl(MDNode *N,
                                  SmallPtrSetImpl<MDNode *> &Visited) {
  if (!Visited.insert(N).second)
    return false;
  if (N->isTemporary())
    return true;
  for (auto &Op : N->operands()) {
    auto *OpN = dyn_cast_or_null<MDNode>(Op);
    if (OpN && containsTemporaryImpl(OpN, Visited))
      return true;
  }
  return false;
}

static bool containsTemporary(MDNode *N) {
  SmallPtrSet<MDNode *, 8> Visited;
  return containsTemporaryImpl(N, Visited);
}

void MDNode::resolveCycles() {
  if (isResolved())
    return;

  if (containsTemporary(this))
    return;
...
----------------------

So the resolveCycles would later be done when the module-level
metadata is read and mapped.

This works in my test case (which does invoke resolveCycles with a
temp MD node), but admittedly I don't have a lot of ThinLTO debug test
cases. Does this seem like a reasonable approach to test further?
Otherwise, I can just change to having the two interfaces as you
suggested.

Thanks,
Tereas

>
>> --
>>
>> The new operation completely violates the spirit of
>> `resolveCycles()`, which (before your patch) was saying:
>>     "I promise there are no more temporaries, so remaining
>>     unresolved nodes must be part of a cycle."
>> The `assert()` confirms that everything is in a cycle.
>>
>> With your new flag set, this has changed to:
>>     "I don't care about uniquing the remaining nodes, so drop
>>      RAUW support for them."
>>
>> Maybe something like this would make sense:
>> --
>> void resolveCycles()
>> { resolveRecursivelyImpl(/* AllowTemps */ false); }
>> void resolveNonTemporaries()
>> { resolveRecursivelyImpl(/* AllowTemps */ true); }
>>
>> private:
>> // Old resolveCycles() implementation.
>> void resolveRecursivelyImpl(bool AllowTemps);
>> --
>> (Do you have a better name for `resolveNonTemporaries()`?)
>
> This sounds like a good idea. Don't have a better name suggestion,
> will use that.
>
>>
>> With the API split it might be easier to document the semantic
>> differences (just by describing the use case for each).
>>
>> For `resolveCycles()`, please make clear (if it isn't already?)
>> that we're promising there aren't any temporaries, and thus
>> unresolved nodes are part of cycles and no longer need RAUW
>> support.
>>
>> For `resolveNonTemporaries()`, please make clear that this is a
>> destructive operation that blocks uniquing of metadata nodes.
>>
>
> Will do.
>
> Thanks,
> Teresa
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413


More information about the llvm-commits mailing list