[Lldb-commits] [PATCH] D158034: [lldb] Fix data race in ThreadList
Pavel Labath via Phabricator via lldb-commits
lldb-commits at lists.llvm.org
Tue Aug 22 02:53:01 PDT 2023
labath added a comment.
After this patch, python_api/run_locker/TestRunLocker.py becomes flaky. https://lab.llvm.org/buildbot/#/builders/68/builds/58456 is the first such failure, but there have been about a dozen failures since then. The backtraces on the buildbot page are fairly useless, but I was able to capture this backtrace locally:
include/c++/v1/vector:1537: assertion __n < size() failed: vector[] index out of bounds
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0 0x000055dd180146be llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 46
1 0x000055dd18014d6c
2 0x00007fea9b3ce1c0
3 0x00007fea9b236347 raise + 167
4 0x00007fea9b237797 abort + 247
5 0x000055dd18862597
6 0x000055dd124c4563 lldb_private::process_gdb_remote::ProcessGDBRemote::SetThreadPc(std::__u::shared_ptr<lldb_private::Thread> const&, unsigned long) + 275
7 0x000055dd124c42fc lldb_private::process_gdb_remote::ProcessGDBRemote::DoUpdateThreadList(lldb_private::ThreadList&, lldb_private::ThreadList&) + 748
8 0x000055dd129c8687 lldb_private::Process::UpdateThreadListIfNeeded() + 247
9 0x000055dd12a39797 lldb_private::ThreadList::FindThreadByID(unsigned long, bool) + 55
10 0x000055dd12a396db lldb_private::ThreadList::GetSelectedThread() + 43
11 0x000055dd129a8dea lldb_private::ExecutionContext::ExecutionContext(lldb_private::Target*, bool) + 234
12 0x00007fea99840bf9 lldb::SBTarget::EvaluateExpression(char const*, lldb::SBExpressionOptions const&) + 281
13 0x00007fea99840a7b lldb::SBTarget::EvaluateExpression(char const*) + 187
14 0x00007fea9995eb77
15 0x000055dd185907cd
16 0x000055dd18545814 _PyObject_Call + 292
17 0x000055dd1865fecb _PyEval_EvalFrameDefault + 11515
18 0x000055dd1865cfa8 _PyEval_Vector + 184
19 0x000055dd186666c3
20 0x000055dd18666906
21 0x000055dd1865e56f _PyEval_EvalFrameDefault + 5023
22 0x000055dd1865cfa8 _PyEval_Vector + 184
23 0x000055dd186666c3
24 0x000055dd18666906
25 0x000055dd1865e56f _PyEval_EvalFrameDefault + 5023
26 0x000055dd1865cfa8 _PyEval_Vector + 184
27 0x000055dd1865fecb _PyEval_EvalFrameDefault + 11515
28 0x000055dd1865cfa8 _PyEval_Vector + 184
<a lot of additional python frames>
Given that the failure is in the `DoUpdateThreadList` function, I'm guessing that this patch removed/reduced some existing synchronization which prevented this code from executing concurrently with something else (I don't know what), although it's possible that the right fix (since updating the thread list while the process is running doesn't make sense) is to bail out of `EvaluateExpression` sooner (@jingham ?).
If you want to reproduce this, note that the bug reproduces relatively infrequently (~1% for me, though it seems to be a bit higher on the buildbot), so you may need to run it many times before you can catch it in action. The failing part is not always the same (e.g. sometimes it just hangs), but I expect the root cause to be the same. I've also confirmed that the failure rate of the test goes down to zero after reverting this patch.
Can you investigate (and possibly revert this patch in the mean time)?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158034/new/
https://reviews.llvm.org/D158034
More information about the lldb-commits
mailing list