[lldb-dev] strange behaviour at lldb cmd line
Greg Clayton
gclayton at apple.com
Thu Jun 26 17:21:12 PDT 2014
> On Jun 26, 2014, at 4:51 PM, Shawn Best <sbest at blueshiftinc.com> wrote:
>
> In addition to the (lldb) prompt out of order, I am also investigating some other strange messages when I run a simple application with no breakpoints. It seems related to thread synchronization surrounding the startup/management of the inferior process.
>
> (lldb) run
>
> Process 4417 launching
> Process 4417 stopped
> * thread #1: tid = 4417, 0x00007f3b99b9c2d0, name = 'a.out', stop reason = trace
> frame #0: 0x00007f3b99b9c2d0
> -> 0x7f3b99b9c2d0: movq %rsp, %rdi
> 0x7f3b99b9c2d3: callq 0x7f3b99b9fa70
> 0x7f3b99b9c2d8: movq %rax, %r12
> 0x7f3b99b9c2db: movl 0x221b17(%rip), %eax
>
> Process 4417 launched: '/home/shawn/Projects/a.out' (x86_64)
> Hello world!
> The string is Test String : 5
> Process 4417 exited with status = 0 (0x00000000)
> (lldb)
>
> ------------- or ----------------
>
> (lldb) run
>
> Process 4454 launching
> Process 4454 launched: '/home/shawn/Projects/a.out' (x86_64)
> Process 4454 stopped
> * thread #1: tid = 4454, 0x00007ffdec16c2d0, name = 'a.out', stop reason = trace
> frame #0: 0x00007ffdec16c2d0
> error: No such process
>
> Hello world!
> The string is Test String : 5
> Process 4454 exited with status = 0 (0x00000000)
> (lldb)
>
>
> As it is launching the target application, it appears to stop in a random place (stop reason = trace), and then continue executing. When it momentarily stops, I see it pop/push an IOHandler.
Yes, the Process IO Handler is pushed and popped on every _public_ stop. There are public stops, which the user finds out about, and private stops, where the Process might be in the middle of single stepping over a source line and might start/stop the process many, many times.
This stopping at random locations seems like a racy bug in ProcessLinux that we should really look into fixing.
>
> I added some logging to ProcessPOSIX, and see it hitting RefreshAfterStop() and DoResume() many times. Is this normal/expected?
When you start a process, you will run/stop many times as the shared libraries get loaded. Normally a breakpoint is set in the dynamic loader that allows us to intercept when shared libraries are loaded/unloaded, so that may explain a few of the stops you are seeing.
Other run/stop flurries can result when single stepping over a source line, or when stepping past a software breakpoint (disable bp, single instruction step, re-enable breakpoint, resume).
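To make the "stepping past a software breakpoint" sequence concrete, here is a minimal sketch of the four private actions in order. FakeInferior and StepPastBreakpointAndResume are hypothetical names made up for illustration; they are not LLDB classes or functions, and the real code paths are considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a debugged process; NOT an LLDB class.
struct FakeInferior {
    std::uint64_t pc = 0x1000;        // pretend program counter
    bool breakpoint_enabled = true;   // software breakpoint at the current pc
    std::vector<std::string> trace;   // records each private action in order

    void DisableBreakpoint() { breakpoint_enabled = false; trace.push_back("disable bp"); }
    // Models executing exactly one instruction (assumed to be 1 byte long here).
    void SingleStep()        { pc += 1; trace.push_back("single step"); }
    void EnableBreakpoint()  { breakpoint_enabled = true; trace.push_back("enable bp"); }
    void Resume()            { trace.push_back("resume"); }
};

// Resume from a location that currently has a software breakpoint on it.
// Each call below would be a private run/stop that the user never sees.
inline void StepPastBreakpointAndResume(FakeInferior &inferior) {
    inferior.DisableBreakpoint();  // restore the original instruction
    inferior.SingleStep();         // execute just that instruction
    inferior.EnableBreakpoint();   // put the breakpoint back
    inferior.Resume();             // now let the process run freely
}
```

A single user-level "continue" therefore produces at least one extra private stop, which is one reason RefreshAfterStop() and DoResume() show up in the logs more often than expected.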
> I have added a bunch of logging to Push/Pop IOHandler, ThreadCreate, HandleProcessEvent and see big differences in the order of events changing from run to run.
We have a lot of threading in LLDB, so some of this will be normal, but other times it can indicate a bug, much like the one you are seeing when the process stops at a random location such as 0x00007ffdec16c2d0. This could also be an uninitialized variable in ProcessLinux (or one of many other classes like ThreadLinux, etc.) that gets a random value when a class instance is initialized. Please do try and track that down. To get a handle on process control, you can enable process and step logging:
(lldb) log enable -T -f /tmp/process.txt lldb process step
Then compare a good and bad run and see what differs.
>
>
> One other small thing, in POSIX/ProcessMonitor, it calls waitpid() and checks the return code,
>
> lldb::pid_t wpid;
> if ((wpid = waitpid(pid, &status, 0)) < 0)
> {
> args->m_error.SetErrorToErrno();
> goto FINISH;
> }
> else ...
>
> lldb::pid_t is a uint64, while waitpid returns an int32, with negative numbers used for error codes.
> This bug is repeated in a few places
This is bad. Please use native types (::pid_t) in these locations so that the error check works correctly.
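To show why the check fails, here is a small sketch that isolates the comparison. wide_pid_t is a stand-in for lldb::pid_t (described above as a uint64), and the two helper functions are hypothetical names used only for this illustration. The return value is passed in as a parameter instead of calling waitpid(), so the behaviour is deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/types.h>

// Stand-in for lldb::pid_t, assumed to be a 64-bit unsigned integer.
using wide_pid_t = std::uint64_t;

// Buggy pattern: the signed return value is converted to an unsigned type
// before the test, so -1 becomes a huge positive number and "< 0" is never true.
// (Compilers may warn that this comparison is always false.)
inline bool ErrorDetectedUnsigned(::pid_t waitpid_result) {
    wide_pid_t wpid = static_cast<wide_pid_t>(waitpid_result);
    return wpid < 0;
}

// Fixed pattern: keep the result in the native signed ::pid_t for the check.
inline bool ErrorDetectedNative(::pid_t waitpid_result) {
    ::pid_t wpid = waitpid_result;
    return wpid < 0;
}
```

With a simulated failure value of -1, ErrorDetectedUnsigned() reports no error while ErrorDetectedNative() reports one. The result can still be widened to lldb::pid_t after the error check has been done on the native type.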
So a few things regarding your race conditions:
1 - On Linux, does a process start running first, and then you quickly try to attach to it? If so, this could explain the differences you might be seeing when connecting to a process. On Darwin, our posix_spawn() has a non-portable flag that stops the process at the entry point with a SIGSTOP, so we are guaranteed not to have a race condition when we launch a process for debugging.
2 - The messages coming in out of order seem to be related to eStateLaunching and eStateStopped not being delivered in the correct order. In your first example they came through OK, and in the second case we got an eStateStopped first, followed by the eStateLaunching. I would take a look at who is sending these out of order. If you fix these out-of-order events, it might also fix the random stopping at a wrong location.
Greg
>
> Shawn.
>
> On Fri, Jun 20, 2014 at 3:34 PM, Greg Clayton <gclayton at apple.com> wrote:
>
> > On Jun 19, 2014, at 7:27 PM, Ed Maste <emaste at freebsd.org> wrote:
> >
> > Hi Greg,
> >
> > As far as I can tell what's happening here is just that
> > Process::Resume() completes and the next prompt is emitted (from the
> > main-thread?) before the IOHandler gets pushed in another thread.
> >
> > Output from "log enable -n lldb process" with an added log printf
> > where ::Resume returns:
> >
> > step
> > main-thread Process::Resume -- locking run lock
> > main-thread Process::PrivateResume() m_stop_id = 4, public state:
> > stopped private state: stopped
> > main-thread Process::SetPrivateState (running)
> > main-thread Process thinks the process has resumed.
> > internal-state(p Process::ShouldBroadcastEvent (0x80c410480) => new
> > state: running, last broadcast state: running - YES
> > main-thread Process::PrivateResume() returning
> > (lldb) internal-state(p Process::HandlePrivateEvent (pid = 15646)
> > broadcasting new state running (old state stopped) to public
> > wait4(pid=15646) MonitorChildProcessThreadFunction ::waitpid (pid =
> > 15646, &status, options = 0) => pid = -15646, status = 0x0000057f
> > (STOPPED), signal = 5, exit_state = 0
> > internal-state(p PushIOHandler
> > wait4(pid=15646) Process::SetPrivateState (stopped)
> >
> > As before, I don't see how we intend to enforce synchronization
> > between those two threads. It looks like my tiny usleep in
> > ::PrivateResume delays the next prompt just long enough for the other
> > IOHandler to be pushed.
>
> That will do it. It is tough because Process::Resume() might not succeed so we can't always push the ProcessIOHandler.
>
> I need to find a better way to coordinate the pushing of the ProcessIOHandler so it happens from the same thread that initiates the resume. Then we won't have this issue, but I need to carefully do this so it doesn't push it when the process won't be resumed (since it might already be resumed) or in other edge cases.
>
> Other ideas would be to have the Process::Resume() do some synchronization between the current thread and the internal-state thread so it waits for the internal-state thread to get to the running state before it returns from Process::Resume()...
>
> Greg
>
>
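The last idea in the quoted message, having Process::Resume() wait until the internal-state thread has reached the running state, could be sketched roughly as below. This is only an illustration under stated assumptions: StateGate and ResumeAndWait are hypothetical names, not LLDB code, and the "internal-state" thread is modelled by a plain std::thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

enum class State { Stopped, Running };

// Hypothetical helper (NOT part of LLDB): one thread publishes a state,
// another thread blocks until that state has been reached.
class StateGate {
public:
    void Set(State s) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_state = s;
        }
        m_cond.notify_all();
    }

    void WaitFor(State s) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // The predicate form handles spurious wakeups and the case where
        // the state was already set before we started waiting.
        m_cond.wait(lock, [&] { return m_state == s; });
    }

    State Get() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_state;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    State m_state = State::Stopped;
};

// Models a resume that does not return to the caller (and so cannot print
// the next prompt) until the other thread has published "running".
inline State ResumeAndWait(StateGate &gate) {
    std::thread internal_state([&gate] { gate.Set(State::Running); });
    gate.WaitFor(State::Running);
    internal_state.join();
    return gate.Get();
}
```

Because the resume might not succeed, a real implementation would presumably need a timeout or an error path (for example wait_for instead of wait) so the caller cannot block forever.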