[lldb-dev] Help fixing deadlock in DWARF symbol preloading
Raphael Isemann via lldb-dev
lldb-dev at lists.llvm.org
Thu Feb 4 17:37:27 PST 2021
We could also just give the Module a std::string with the description
and update it in the few places where we actually update it. The
m_arch already has a setter in place that just needs to be used in a
few more places, so the infrastructure is kind of already there (at
least for m_arch). The description would just have its own mutex.
Am Fr., 5. Feb. 2021 um 00:39 Uhr schrieb Jorge Gorbe Moya via
lldb-dev <lldb-dev at lists.llvm.org>:
>
> Wouldn't it be preferable to try_lock in GetDescription (which is the one currently acquiring the mutex) instead? ReportError doesn't touch any mutex itself and will happily report the rest of the error if GetDescription bails out. For the test case I sent it would look like this:
>
> error: {0x0000000b}: invalid abbreviation code 123, please file a bug and attach the file at the start of this error message
> error: {0x0000000b}: invalid abbreviation code 123, please file a bug and attach the file at the start of this error message
> error: {0x0000000b}: invalid abbreviation code 123, please file a bug and attach the file at the start of this error message
>
> which is way better than a deadlock IMO.
>
>
>
>
> On Thu, Feb 4, 2021 at 12:16 PM Pavel Labath <pavel at labath.sk> wrote:
>>
>> Please have a look at
>> <https://lists.llvm.org/pipermail/lldb-dev/2020-October/016479.html>,
>> which is the last time this came up.
>>
>>
>> One quick'n'dirty solution would be to have `Module::ReportError` _try_
>> to get the module lock, and if it fails, just bail out. That obviously
>> means you won't get to see the error message which triggerred the
>> deadlock (though we could also play around with that and try printing
>> the error message without the module description or something), but it
>> will at least get you past that point...
>>
>> pl
>>
>> On 04/02/2021 21:04, Jorge Gorbe Moya via lldb-dev wrote:
>> > Hi,
>> >
>> > I've found a deadlock in lldb (see attached test case, you can build it
>> > with just `clang -o test test.s`), but I'm a total newbie and I have no
>> > idea what's the right way to fix it.
>> >
>> > The problem happens when an error is found during DIE extraction when
>> > preloading symbols. As far as I can tell, it goes like this:
>> >
>> > 1. Module::PreloadSymbols locks Module::m_mutex
>> > 2. A few layers below it, we end up in ManualDWARFIndex::Index, which
>> > dispatches DIE extractions to a thread pool:
>> >
>> > |for (size_t i = 0; i < units_to_index.size(); ++i)
>> > pool.async(extract_fn, i); pool.wait(); |
>> >
>> > 3. extract_fn in the snippet above ends up executing
>> > DWARFDebugInfoEntry::Extract and when there's an error during
>> > extraction, Module::GetDescription is called while generating the error
>> > message.
>> > 4. Module::GetDescription tries to acquire Module::m_mutex from a
>> > different thread, while the main thread has the mutex already locked and
>> > it's waiting for DIE extraction to end, causing a deadlock.
>> >
>> > If we make Module::GetDescription not lock the problem disappears, so
>> > the diagnosis looks correct, but I don't know what would be the right
>> > way to fix it. Module::GetDescription looks more or less safe to call
>> > without locking: it just prints m_arch, m_file, and m_object_name to a
>> > string, and those look like fields that wouldn't change after the Module
>> > is initialized, so maybe it's okay? But I feel like there must be a
>> > better solution anyway. Any advice?
>> >
>> > Best,
>> > Jorge
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > lldb-dev mailing list
>> > lldb-dev at lists.llvm.org
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>> >
>>
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
More information about the lldb-dev
mailing list