[lldb-dev] Deadlock loading DWARF symbols

Frédéric Riss via lldb-dev lldb-dev at lists.llvm.org
Wed Oct 14 20:12:57 PDT 2020


[I thought I had already sent this out weeks ago…] 

> On Oct 2, 2020, at 2:13 PM, Greg Clayton via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> Yes, this is bad. GetDescription() is used as a convenience to print out the module path (which might be a .o file within a .a file) and, optionally, the architecture of the module. It probably shouldn't be taking the module lock, as the only member variables that GetDescription() accesses are:
> 
> Module::m_arch
> Module::m_file
> Module::m_object_name
> 
> I would almost vote to take out the mutex lock in GetDescription() as the arch, file and name don't change after the module has been created. I am going to CC a few extra folks for discussion.
> 
> Anyone else have any objections to removing the mutex in GetDescription? Seems like this deadlock is easy to trigger if you have DWARF with errors or warnings inside of it.

I remember having a discussion with Jim about a very similar issue and IIRC, he told me that the arch could change after a module is created. I don’t remember the reason off the top of my head.

I think we are hitting a very similar deadlock in the Swift REPL for slightly different reasons (same lock though).

Fred 

> Greg
> 
> 
>> On Oct 2, 2020, at 6:50 AM, Dmitry Antipov via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>> 
>> I'm observing the following deadlock:
>> 
>> One thread calls Module::PreloadSymbols(), which takes the m_mutex of this Module. Module::PreloadSymbols()
>> calls ManualDWARFIndex::Index(), which, in turn, creates a thread pool and waits for all of its threads to complete:
>> 
>> (gdb)
>> #0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7f67f176914c) at ../sysdeps/nptl/futex-internal.h:183
>> #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f67f17690c8, cond=0x7f67f1769120) at pthread_cond_wait.c:508
>> #2  __pthread_cond_wait (cond=0x7f67f1769120, mutex=0x7f67f17690c8) at pthread_cond_wait.c:638
>> #3  0x00007f67f3974890 in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
>> #4  0x00007f67f4440c4b in std::condition_variable::wait<llvm::ThreadPool::wait()::<lambda()> > (__p=..., __lock=..., this=0x7f67f1769120)
>>   at /usr/include/c++/10/condition_variable:108
>> #5  llvm::ThreadPool::wait (this=this@entry=0x7f67f1769060) at source/llvm/lib/Support/ThreadPool.cpp:72
>> #6  0x00007f67fc6fa3a6 in lldb_private::ManualDWARFIndex::Index (this=0x7f66fe87e950)
>>   at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:94
>> #7  0x00007f67fc6b3825 in SymbolFileDWARF::PreloadSymbols (this=0x7f67de7af6f0) at /usr/include/c++/10/bits/unique_ptr.h:421
>> #8  0x00007f67fc1ee488 in lldb_private::Module::PreloadSymbols (this=0x7f67de79b620) at source/lldb/source/Core/Module.cpp:1397
>> #9  0x00007f67fc397a37 in lldb_private::Target::GetOrCreateModule (this=this@entry=0x96c7a0, module_spec=..., notify=notify@entry=true, error_ptr=error_ptr@entry=0x0)
>>   at /usr/include/c++/10/bits/shared_ptr_base.h:1324
>> ...
>> 
>> OTOH, one of the pool threads attempts to lock the Module's mutex:
>> 
>> (gdb) bt
>> #0  __lll_lock_wait (futex=futex@entry=0x7f67de79b638, private=0) at lowlevellock.c:52
>> #1  0x00007f67fcd907f1 in __GI___pthread_mutex_lock (mutex=0x7f67de79b638) at ../nptl/pthread_mutex_lock.c:115
>> #2  0x00007f67fc1ed922 in __gthread_mutex_lock (__mutex=0x7f67de79b638) at /usr/include/c++/10/x86_64-redhat-linux/bits/gthr-default.h:749
>> #3  __gthread_recursive_mutex_lock (__mutex=0x7f67de79b638) at /usr/include/c++/10/x86_64-redhat-linux/bits/gthr-default.h:811
>> #4  std::recursive_mutex::lock (this=0x7f67de79b638) at /usr/include/c++/10/mutex:106
>> #5  std::lock_guard<std::recursive_mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/10/bits/std_mutex.h:159
>> #6  lldb_private::Module::GetDescription (this=this@entry=0x7f67de79b620, s=..., level=level@entry=lldb::eDescriptionLevelBrief)
>>   at source/lldb/source/Core/Module.cpp:1083
>> #7  0x00007f67fc1f2070 in lldb_private::Module::ReportError (this=0x7f67de79b620, format=0x7f67fca03660 "DW_FORM_ref* DIE reference 0x%lx is outside of its CU")
>>   at source/lldb/include/lldb/Utility/Stream.h:358
>> #8  0x00007f67fc6adfb4 in DWARFFormValue::Reference (this=this@entry=0x7f66f5ff29c0) at /usr/include/c++/10/bits/shared_ptr_base.h:1324
>> #9  0x00007f67fc6aaa77 in DWARFDebugInfoEntry::GetAttributes (this=this@entry=0x7f662e3580e0, cu=cu@entry=0x7f66ff6ebad0, attributes=...,
>>   recurse=recurse@entry=DWARFBaseDIE::Recurse::yes, curr_depth=curr_depth@entry=0)
>>   at source/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp:439
>> #10 0x00007f67fc6f8f8f in DWARFDebugInfoEntry::GetAttributes (recurse=DWARFBaseDIE::Recurse::yes, attrs=..., cu=0x7f66ff6ebad0, this=0x7f662e3580e0)
>>   at source/lldb/source/./Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h:54
>> #11 lldb_private::ManualDWARFIndex::IndexUnitImpl (unit=..., cu_language=cu_language@entry=lldb::eLanguageTypeRust, set=...)
>>   at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:180
>> #12 0x00007f67fc6f96b7 in lldb_private::ManualDWARFIndex::IndexUnit (this=<optimized out>, unit=..., dwp=0x0, set=...)
>>   at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:126
>> ...
>> 
>> So this is a deadlock: the thread that holds the module lock waits for the thread pool to finish, while one
>> (or more; I'm observing two) of the pool threads wants to grab the same lock to issue an error message.
>> 
>> Commenting out the whole body of Module::GetDescription() makes this deadlock disappear.
>> 
>> I'm not an expert in this area, but it looks like the Module object should have more
>> fine-grained locking rather than a single std::recursive_mutex for all synchronization purposes.
>> 
>> Dmitry
>> _______________________________________________
>> lldb-dev mailing list
>> lldb-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 


