[LLVMdev] Crash on invalid during LLVMContext destruction MDNode::dropAllReferences

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Jan 14 11:34:55 PST 2015


> On 2015-Jan-14, at 09:24, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Wed, Jan 14, 2015 at 9:05 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2015 Jan 14, at 07:58, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >
> >>
> >> On 2015 Jan 13, at 23:59, David Blaikie <dblaikie at gmail.com> wrote:
> >>
> >>
> >>
> >> On Tue, Jan 13, 2015 at 11:48 PM, Duncan Exon Smith <dexonsmith at apple.com> wrote:
> >> Not at a computer right now, but it looks like teardown isn't working correctly.  Do you have an asserts build?  Does an assertion fire there?
> >>
> >> That was with an asserts build.
> >>
> >> Looking at the stack trace, dropAllReferences() is being called on a node, so it sets its operands to nullptr, and some operand has RAUW support (so the tracking needs to be dropped) but looks like it might have been deleted or is otherwise corrupt.  Hard to tell though.
> >>
> >> Does this reproduce from preprocessed source?   Can you send it to me?
> >>
> >> Or maybe that's a test case in your email.  I'll try it in the morning.
> >>
> >> Yeah, just the test code in the original email is what I reproduced the linked list error with - some variations of it produced the assertion... maybe valgrinding or asanified clang would make the failure more reliable, etc.
> >>
> >
> > The version here doesn't repro for me (don't have an asan build handy --
> > I'll build one -- but I tried the weaker gmalloc).  I tried messing with
> > it but nothing happened.
> >
> > Can you send a version that gets the stack trace?
> >
> > (What revision is this, by the way?  ToT as of last night?)
> 
> Asan didn't find it either, and then I realized I was using the RUN line
> instead of the command-line you were using (with -g). 
> 
> Ah, right - sorry about that red herring.
>  
> So the asan dump follows.
> 
> Looks related to the stack I was seeing.
>  
> I'll look into this when I get to work.  Definitely from my stuff somehow.
> 
> OK - wasn't any particular rush for me, I just ran into this while writing test cases for my debug line quality stuff in clang (accidentally introduced a call to a function that didn't exist, which produced the crash).
> 
> Based on how I saw it, I'm guessing it's something to do with "LLVMContext has trouble being destroyed if <some task that we only do when successfully finishing codegen and is skipped if we abort codegen due to an error in the source> is not done first".
>  

Thanks for the report.  I'm really glad you told me now, since
moving MDLocation into place would have removed this way of
exposing it and it would have been way harder to track down
(MDLocation doesn't use ValueAsMetadata for its integer
operands, and that Value-indirection (with its RAUW behaviour)
is necessary to expose this).

This is the fix:

diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp
index ab83252..302ffb4 100644
--- a/lib/IR/Metadata.cpp
+++ b/lib/IR/Metadata.cpp
@@ -167,6 +171,8 @@ void ReplaceableMetadataImpl::replaceAllUsesWith(Metadata *MD) {
     return L.second.second < R.second.second;
   });
   for (const auto &Pair : Uses) {
+    if (!UseMap.count(Pair.first))
+      continue;
     OwnerTy Owner = Pair.second.first;
     if (!Owner) {
       // Update unowned tracking references directly.

I just need to figure out a formula for reproducing reliably (in
a unit test) and I'll commit.

Another thing this exposed (which would have also hidden this
problem) is that we're not proactively dropping references on
`ValueAsMetadata` during teardown.  We should.  This RAUW traffic
is unnecessary.  (It doesn't happen in the "common" case, where
the graph is complete (and RAUW has been dropped from MDTuples),
which is why this bug only fires when there's an error.)



More information about the llvm-dev mailing list