[LLVMdev] Metadata/Value split has landed
Tom Stellard
tom at stellard.net
Thu Dec 11 09:05:52 PST 2014
On Thu, Dec 11, 2014 at 11:52:34AM -0500, Tom Stellard wrote:
> On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote:
> > +zalman at google.com
> >
>
> Hi Duncan,
>
> This patch plus another small change fixes the assertion failure for
> me. With the patch alone, the void* overload of addGarbageObject()
> was being used by MDNode::getTemporary(), so I had to cast the object as
> an MDNode*:
>
> diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp
> index cd5edd2..916d216 100644
> --- a/lib/IR/Metadata.cpp
> +++ b/lib/IR/Metadata.cpp
> @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context,
> ArrayRef<Metadata *> MDs,
> MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context,
> ArrayRef<Metadata *> MDs) {
> MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs);
> - LeakDetector::addGarbageObject(N);
> + LeakDetector::addGarbageObject((MDNode*)N);
> return N;
> }
Sorry, after more extensive testing, this doesn't work. It looks like
you need to add const MDNode * overloads to addGarbageObject() adding
them for addGarbageObjectImpl() doesn't seem to work:
diff --git a/include/llvm/IR/LeakDetector.h
b/include/llvm/IR/LeakDetector.h
index e0b131e..b272eaf 100644
--- a/include/llvm/IR/LeakDetector.h
+++ b/include/llvm/IR/LeakDetector.h
@@ -79,6 +79,17 @@ struct LeakDetector {
#endif
}
+ static void addGarbageObject(const MDNode *Object) {
+#ifndef NDEBUG
+ addGarbageObjectImpl(Object);
+#endif
+ }
+ static void removeGarbageObject(const MDNode *Object) {
+#ifndef NDEBUG
+ removeGarbageObjectImpl(Object);
+#endif
+ }
+
private:
// If we are debugging, the actual implementations will be called...
static void addGarbageObjectImpl(const Value *Object);
>
>
> I'm in favor of committing this.
>
> -Tom
>
>
> > > On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > >
> > >>
> > >> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote:
> > >>
> > >> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote:
> > >>>
> > >>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote:
> > >>>>
> > >>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote:
> > >>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the
> > >>>>> C++ side of it. This was a rocky day, but I suppose that's what I get
> > >>>>> for failing to stage the change in smaller pieces.
> > >>>>>
> > >>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) breakage,
> > >>>>> so if I've missed some problem in the sea of buildbot errors, please
> > >>>>> flag me down.
> > >>>>>
> > >>>>> I'll follow up soon with bitcode and assembly changes!
> > >>>>
> > >>>> Hi Duncan,
> > >>>>
> > >>>> I started getting random assertion failures in some tests yesterday, and I think
> > >>>> it may be related to this change. Here is the stack trace:
> > >>>>
> > >>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6
> > >>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6
> > >>>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6
> > >>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6
> > >>>> #4 0x00007ffff3a30e92 in llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #5 0x00007ffff3a30fd3 in llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #6 0x00007ffff3a40eed in llvm::MDNode::getTemporary(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #13 0x00007ffff3755786 in (anonymous namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) ()
> > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>> #16 0x00007ffff6c9d8cf in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>> #20 0x00007ffff6b5d179 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm (llvm_ctx=...,
> > >>>> source="\n__kernel void test_fn(__local float *sSharedStorage, __global float *srcValues, __global uint *offsets, __global float *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 );\n sSha"..., headers=..., name="input.cl", triple="r600--", processor="verde", opts="",
> > >>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, r_log=...) at llvm/invocation.cpp:255
> > >>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., r_log=...)
> > >>>> at llvm/invocation.cpp:710
> > >>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...)
> > >>>> at core/program.cpp:63
> > >>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, user_data=0x0)
> > >>>> at api/program.cpp:182
> > >>>>
> > >>>> Does this look related? If so, let me know what other information you need to
> > >>>> try to debug this issue.
> > >>>
> > >>> This could be related; I'm not sure.
> > >>>
> > >>
> > >> I'm pretty sure that this commit is the cause of the regression.
> > >>
> > >> r223801 works and r223810 does not, and I don't think any of the other
> > >> commits in that range could cause this.
> > >>
> > >>> It looks like a leak detection assertion, and I didn't need to change
> > >>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and
> > >>> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't
> > >>> touch the implementation of those).
> > >>>
> > >>> Can you reproduce this with `llvm-link`? If so, that sounds like the
> > >>> best place to start.
> > >>
> > >> I can't reproduce this using llvm-link unfortunately. Any other ideas?
> > >
> > > (Continuing via email, since Tom stepped away IRC.)
> > >
> > > Tom, from the trace [1], it the problematic pointer (0x27e4c80) only shows
> > > up once.
> > >
> > > [1]: http://people.freedesktop.org/~tstellar/md-crash.out
> > >
> > > That means that something *else* -- other than `MDNode::getTemporary()`
> > > -- must be adding that address to garbage and failing to remove it.
> > >
> > > I just dug into `LeakDetector::addGarbageObject()` and it stores *all*
> > > calls to `addGarbage()` in the same place. There are a fair number of
> > > these in the IR:
> > >
> > > $ git grep -e addGarbageObject -w -- lib/IR/
> > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this);
> > > lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this);
> > > lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N);
> > >
> > > I think the next step is to identify who called `addGarbageObject()` with
> > > the problematic address, and what the stack trace was.
> > >
> > > The weird thing is, I also noticed a semantic change I made here
> > > accidentally. `addGarbageObject()` has two overloads: `void*` and
> > > `const Value*`. `MDNode::getTemporary()` used to match the latter, but
> > > now it matches the former.
> > >
> > > The weird part: all the other calls to `addGarbageObject()` look like they
> > > send in a `Value *`.
> > >
> > > Do you have any other calls to `addGarbageObject()`? Do they match
> > > `void *`?
> > >
> > > Also, what happens with the attached patch? (If this fixes your problem,
> > > I think it's just papering over something...)
> > >
> > > <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch>
> >
> > Zalman also had a reproduction, and he's been able to track it down to
> > an `addGarbageObject()` call from `MachineBasicBlock`. It looks like
> > the MBB gets deallocated but `removeGarbageObject()` isn't called yet.
> >
> > CC'ing him here so he can join the thread once he's gotten a little
> > further.
> >
> > BTW, if this is blocking anyone, I can commit the patch I attached to
> > my previous email (or you can apply it locally). I think it's probably
> > the right thing eventually -- it improves the output when there *is* an
> > issue -- but I haven't committed it yet since it'll cover up the
> > problem.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list