[LLVMdev] Metadata/Value split has landed

Tom Stellard tom at stellard.net
Thu Dec 11 17:41:51 PST 2014


On Thu, Dec 11, 2014 at 02:41:20PM -0800, Zalman Stern wrote:
> The assertion no longer fires in Halide with top-of-tree llvm. Thank you
> for the fix.

+1.  Thanks for the quick fix.

-Tom

> 
> -Z-
> 
> 
> On Thu, Dec 11, 2014 at 1:56 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> 
> > I committed:
> >
> > r224058 = 966942da9e68b59c31ce770e7f94c55a63482c6b
> > r224060 = da75f7277e3a129aed8ef8aa4e0d84de40b76fd4
> > r224061 = f88e4c8e9171045454b2c8e05054c2af8da3fe4f
> >
> > Let me know if somehow you're still hitting the problem.
> >
> > r224061 removes leak detection entirely from `MachineInstr`.  There aren't
> > any leaks to be had there, since they're allocated in a custom allocator.
> > They're just dropped away once `MachineFunction` is deleted.
> >
> > @Zalman, thanks again for your help digging into this.
> >
> > > On 2014 Dec 11, at 09:05, Tom Stellard <tom at stellard.net> wrote:
> > >
> > > On Thu, Dec 11, 2014 at 11:52:34AM -0500, Tom Stellard wrote:
> > >> On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote:
> > >>> +zalman at google.com
> > >>>
> > >>
> > >> Hi Duncan,
> > >>
> > >> This patch plus  another small change fixes the assertion failure for
> > >> me.  With the patch alone, the void* overload of addGarbageObject()
> > >> was being used by MDNode::getTemporary(), so I had to cast the object as
> > >> an MDNode*:
> > >>
> > >> diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp
> > >> index cd5edd2..916d216 100644
> > >> --- a/lib/IR/Metadata.cpp
> > >> +++ b/lib/IR/Metadata.cpp
> > >> @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context,
> > >> ArrayRef<Metadata *> MDs,
> > >> MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context,
> > >>                                     ArrayRef<Metadata *> MDs) {
> > >>   MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs);
> > >> -  LeakDetector::addGarbageObject(N);
> > >> +  LeakDetector::addGarbageObject((MDNode*)N);
> > >>   return N;
> > >> }
> > >
> > > Sorry, after more extensive testing, this doesn't work.  It looks like
> > > you need to add const MDNode * overloads to addGarbageObject() adding
> > > them for addGarbageObjectImpl() doesn't seem to work:
> > >
> > > diff --git a/include/llvm/IR/LeakDetector.h
> > > b/include/llvm/IR/LeakDetector.h
> > > index e0b131e..b272eaf 100644
> > > --- a/include/llvm/IR/LeakDetector.h
> > > +++ b/include/llvm/IR/LeakDetector.h
> > > @@ -79,6 +79,17 @@ struct LeakDetector {
> > > #endif
> > >   }
> > >
> > > +  static void addGarbageObject(const MDNode *Object) {
> > > +#ifndef NDEBUG
> > > +    addGarbageObjectImpl(Object);
> > > +#endif
> > > +  }
> > > +  static void removeGarbageObject(const MDNode *Object) {
> > > +#ifndef NDEBUG
> > > +    removeGarbageObjectImpl(Object);
> > > +#endif
> > > +  }
> > > +
> > > private:
> > >   // If we are debugging, the actual implementations will be called...
> > >   static void addGarbageObjectImpl(const Value *Object);
> > >>
> > >>
> > >> I'm in favor of committing this.
> > >>
> > >> -Tom
> > >>
> > >>
> > >>>> On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <
> > dexonsmith at apple.com> wrote:
> > >>>>
> > >>>>>
> > >>>>> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote:
> > >>>>>
> > >>>>> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith
> > wrote:
> > >>>>>>
> > >>>>>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote:
> > >>>>>>>
> > >>>>>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith
> > wrote:
> > >>>>>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at
> > least, the
> > >>>>>>>> C++ side of it.  This was a rocky day, but I suppose that's what
> > I get
> > >>>>>>>> for failing to stage the change in smaller pieces.
> > >>>>>>>>
> > >>>>>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree)
> > breakage,
> > >>>>>>>> so if I've missed some problem in the sea of buildbot errors,
> > please
> > >>>>>>>> flag me down.
> > >>>>>>>>
> > >>>>>>>> I'll follow up soon with bitcode and assembly changes!
> > >>>>>>>
> > >>>>>>> Hi Duncan,
> > >>>>>>>
> > >>>>>>> I started getting random assertion failures in some tests
> > yesterday, and I think
> > >>>>>>> it may be related to this change.  Here is the stack trace:
> > >>>>>>>
> > >>>>>>> #0  0x00007ffff59f4c39 in raise () from /lib64/libc.so.6
> > >>>>>>> #1  0x00007ffff59f6348 in abort () from /lib64/libc.so.6
> > >>>>>>> #2  0x00007ffff59edb96 in __assert_fail_base () from
> > /lib64/libc.so.6
> > >>>>>>> #3  0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6
> > >>>>>>> #4  0x00007ffff3a30e92 in
> > llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] ()
> > from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #5  0x00007ffff3a30fd3 in
> > llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/
> > libLLVM-3.6svn.so
> > >>>>>>> #6  0x00007ffff3a40eed in
> > llvm::MDNode::getTemporary(llvm::LLVMContext&,
> > llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/
> > libLLVM-3.6svn.so
> > >>>>>>> #7  0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*,
> > llvm::ValueMap<llvm::Value const*, llvm::WeakVH,
> > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&,
> > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #8  0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*,
> > llvm::ValueMap<llvm::Value const*, llvm::WeakVH,
> > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&,
> > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #9  0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*,
> > llvm::ValueMap<llvm::Value const*, llvm::WeakVH,
> > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&,
> > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*,
> > llvm::ValueMap<llvm::Value const*, llvm::WeakVH,
> > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&,
> > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*,
> > llvm::ValueMap<llvm::Value const*, llvm::WeakVH,
> > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&,
> > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #12 0x00007ffff3427174 in
> > llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value
> > const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*,
> > llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags,
> > llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #13 0x00007ffff3755786 in (anonymous
> > namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from
> > /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #14 0x00007ffff375767f in
> > llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/
> > libLLVM-3.6svn.so
> > >>>>>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*,
> > llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) ()
> > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so
> > >>>>>>> #16 0x00007ffff6c9d8cf in
> > clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from
> > /opt/buildbot/lib/libOpenCL.so.1
> > >>>>>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool,
> > bool) () from /opt/buildbot/lib/libOpenCL.so.1
> > >>>>>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() ()
> > from /opt/buildbot/lib/libOpenCL.so.1
> > >>>>>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from
> > /opt/buildbot/lib/libOpenCL.so.1
> > >>>>>>> #20 0x00007ffff6b5d179 in
> > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from
> > /opt/buildbot/lib/libOpenCL.so.1
> > >>>>>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm
> > (llvm_ctx=...,
> > >>>>>>> source="\n__kernel void test_fn(__local float *sSharedStorage,
> > __global float *srcValues, __global uint *offsets, __global float
> > *destBuffer, uint alignmentOffset )\n{\n    int tid = get_global_id( 0
> > );\n    sSha"..., headers=..., name="input.cl", triple="r600--",
> > processor="verde", opts="",
> > >>>>>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2,
> > r_log=...) at llvm/invocation.cpp:255
> > >>>>>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm
> > (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=...,
> > opts=..., r_log=...)
> > >>>>>>> at llvm/invocation.cpp:710
> > >>>>>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530,
> > devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...)
> > >>>>>>> at core/program.cpp:63
> > >>>>>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538,
> > num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0,
> > user_data=0x0)
> > >>>>>>> at api/program.cpp:182
> > >>>>>>>
> > >>>>>>> Does this look related?  If so, let me know what other information
> > you need to
> > >>>>>>> try to debug this issue.
> > >>>>>>
> > >>>>>> This could be related; I'm not sure.
> > >>>>>>
> > >>>>>
> > >>>>> I'm pretty sure that this commit is the cause of the regression.
> > >>>>>
> > >>>>> r223801 works and r223810 does not, and I don't think any of the
> > other
> > >>>>> commits in that range could cause this.
> > >>>>>
> > >>>>>> It looks like a leak detection assertion, and I didn't need to
> > change
> > >>>>>> that logic at all.  `ValueMap` calls `MDNode::getTemporary()` and
> > >>>>>> `MDNode::deleteTemporary()` in the same ways it used to (and I
> > didn't
> > >>>>>> touch the implementation of those).
> > >>>>>>
> > >>>>>> Can you reproduce this with `llvm-link`?  If so, that sounds like
> > the
> > >>>>>> best place to start.
> > >>>>>
> > >>>>> I can't reproduce this using llvm-link unfortunately.  Any other
> > ideas?
> > >>>>
> > >>>> (Continuing via email, since Tom stepped away IRC.)
> > >>>>
> > >>>> Tom, from the trace [1], it the problematic pointer (0x27e4c80) only
> > shows
> > >>>> up once.
> > >>>>
> > >>>> [1]: http://people.freedesktop.org/~tstellar/md-crash.out
> > >>>>
> > >>>> That means that something *else* -- other than
> > `MDNode::getTemporary()`
> > >>>> -- must be adding that address to garbage and failing to remove it.
> > >>>>
> > >>>> I just dug into `LeakDetector::addGarbageObject()` and it stores *all*
> > >>>> calls to `addGarbage()` in the same place.  There are a fair number of
> > >>>> these in the IR:
> > >>>>
> > >>>> $ git grep -e addGarbageObject -w -- lib/IR/
> > >>>> lib/IR/BasicBlock.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/BasicBlock.cpp:    LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Function.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Function.cpp:    LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Function.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Function.cpp:    LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Globals.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Globals.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Globals.cpp:    LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Globals.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Globals.cpp:    LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Instruction.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Instruction.cpp:  LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Instruction.cpp:    if (!P)
> > LeakDetector::addGarbageObject(this);
> > >>>> lib/IR/Metadata.cpp:  LeakDetector::addGarbageObject(N);
> > >>>>
> > >>>> I think the next step is to identify who called `addGarbageObject()`
> > with
> > >>>> the problematic address, and what the stack trace was.
> > >>>>
> > >>>> The weird thing is, I also noticed a semantic change I made here
> > >>>> accidentally.  `addGarbageObject()` has two overloads: `void*` and
> > >>>> `const Value*`.  `MDNode::getTemporary()` used to match the latter,
> > but
> > >>>> now it matches the former.
> > >>>>
> > >>>> The weird part: all the other calls to `addGarbageObject()` look like
> > they
> > >>>> send in a `Value *`.
> > >>>>
> > >>>> Do you have any other calls to `addGarbageObject()`?  Do they match
> > >>>> `void *`?
> > >>>>
> > >>>> Also, what happens with the attached patch?  (If this fixes your
> > problem,
> > >>>> I think it's just papering over something...)
> > >>>>
> > >>>> <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch>
> > >>>
> > >>> Zalman also had a reproduction, and he's been able to track it down to
> > >>> an `addGarbageObject()` call from `MachineBasicBlock`.  It looks like
> > >>> the MBB gets deallocated but `removeGarbageObject()` isn't called yet.
> > >>>
> > >>> CC'ing him here so he can join the thread once he's gotten a little
> > >>> further.
> > >>>
> > >>> BTW, if this is blocking anyone, I can commit the patch I attached to
> > >>> my previous email (or you can apply it locally).  I think it's probably
> > >>> the right thing eventually -- it improves the output when there *is* an
> > >>> issue -- but I haven't committed it yet since it'll cover up the
> > >>> problem.
> > >> _______________________________________________
> > >> LLVM Developers mailing list
> > >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
> >



More information about the llvm-dev mailing list