[lldb-dev] LLDB/NetBSD extended set of tasks

Fri Mar 17 02:48:24 PDT 2017

On 16 March 2017 at 21:43, Kamil Rytarowski <n54 at gmx.com> wrote:
> On 16.03.2017 11:55, Pavel Labath wrote:
>> What kind of per-process events
>> are we talking about here?
>
> I'm mostly thinking about ResumeActions - to resume the whole process,
> while being able to single-step desired thread(s).
>
> (We also offer PT_SYSCALL feature, but it's not needed right now in LLDB).
>
>> Is there anything more here than a signal
>> directed at the whole process?
>
> single-stepping
> resume thread
> suspend thread
>
> I'm evaluating FreeBSD-like API PT_SETSTEP/PT_CLEARSTEP for NetBSD. It
> marks a thread for single-stepping. This code is needed to allow us to
> combine PT_SYSCALL & PT_STEP and PT_STEP & emit signal.
>
> I was thinking about ResumeActions marking which thread to
> resume/suspend/singlestep, whether to emit a signal (one per global
> PT_CONTINUE[/PT_SYSCALL]) and whether to resume the whole thread.
>
> To some certain point it might be kludged with single-thread model for
> basic debugging.
>
>
> I imagined a possible flow of ResumeAction calls like:
> [Generic/Native framework knows upfront the image of threads within
> debuggee]
>  - Resume Thread 2 (PT_RESUME)
>  - Suspend Thread 3 (PT_SUSPEND)
>  - Set single-step Thread 2 (PT_SETSTEP)
>  - Set single-step Thread 4 (PT_SETSTEP)
>  - Clear single-step Thread 5 (PT_CLEARSTEP)
>  - Resume & emit signal SIGIO (PT_CONTINUE)
>
> In other words: setting properties on threads and pushing the
> PT_CONTINUE button at the end.

None of this is really NetBSD-specific, except the whole-process signal at
the end (which I am going to ignore for now). I mean, the implementation of
it is different, but there is no reason why someone would not want to
perform the same set of actions on Linux, for instance. I think most of the
work here should be done on the client. Then, when the user issues the
final "continue", the client sends something like $vCont;s:2;s:4;c:5. Then
it's up to the server to figure out how to execute these actions. On NetBSD
it would execute the operations you mention above, while on Linux it would
do something like ptrace(PTRACE_SINGLESTEP, 2); ptrace(PTRACE_SINGLESTEP, 4);
ptrace(PTRACE_CONT, 5); (the Linux lldb-server already supports this,
actually, although you may have a hard time convincing the client to send a
packet like that).
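
To make the packet format concrete, here is a minimal Python sketch of how
a vCont packet splits into per-thread actions. It is not lldb-server code:
parse_vcont is a hypothetical helper, it ignores the multiprocess
"pPID.TID" thread-id syntax and the "r" (range step) action, and it does
no checksum validation.

```python
def parse_vcont(packet):
    """Split a vCont packet into (action, signal, thread_id) tuples.

    Hypothetical helper for illustration only; not part of LLDB.
    """
    body = packet.lstrip("$")
    # Drop an optional '#xx' checksum suffix without validating it.
    body = body.split("#", 1)[0]
    if not body.startswith("vCont"):
        raise ValueError("not a vCont packet: %r" % packet)

    actions = []
    for field in body[len("vCont"):].split(";"):
        if not field:
            continue
        action, _, tid = field.partition(":")
        if not action or action[0] not in "cCsSt":
            raise ValueError("unsupported action: %r" % field)
        kind = action[0]
        # 'C' and 'S' carry a hexadecimal signal number, e.g. C0a.
        signal = int(action[1:], 16) if kind in "CS" else None
        # Thread ids are hexadecimal; a missing id means the action
        # applies to every thread not named elsewhere in the packet.
        thread = int(tid, 16) if tid else None
        actions.append((kind, signal, thread))
    return actions


print(parse_vcont("$vCont;s:2;s:4;c:5"))
```

Each tuple can then be handed to whatever platform-specific code issues
the actual ptrace requests.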

So I don't believe there will be any sweeping changes necessary to support
this in the future. If I understand it correctly, you are working on the
server now. All you need to do there is to make sure you translate the set
of actions in the packet to the proper sequence of ptrace calls. You can
even write lldb-server-style tests for that. Then, we can discuss what
would be the best user-level interface to specify complex actions like
this, and teach the client to send these packets.
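
A rough sketch of that translation step for the NetBSD model, in Python
for brevity. It only builds a symbolic list of request names and does not
call ptrace. The request names are the ones from the quoted flow above
(PT_SETSTEP/PT_CLEARSTEP were still being evaluated for NetBSD at the
time). plan_netbsd_resume is a hypothetical name, and suspending every
thread that has no action is an assumption of this sketch.

```python
def plan_netbsd_resume(all_lwps, actions, signal=0):
    """Return an ordered list of (request_name, argument) pairs.

    all_lwps: iterable of thread (LWP) ids known in the debuggee.
    actions:  dict mapping LWP id to 'c' (continue) or 's' (step).
    signal:   signal number delivered by the final PT_CONTINUE (0 = none).
    Hypothetical illustration only; no ptrace call is made.
    """
    plan = []
    for lwp in sorted(all_lwps):
        action = actions.get(lwp)
        if action is None:
            # Assumption: threads without an action stay stopped.
            plan.append(("PT_SUSPEND", lwp))
            continue
        if action not in ("c", "s"):
            raise ValueError("unsupported action %r for LWP %d" % (action, lwp))
        plan.append(("PT_RESUME", lwp))
        # Mark or unmark the thread for single-stepping.
        plan.append(("PT_SETSTEP" if action == "s" else "PT_CLEARSTEP", lwp))
    # One process-wide request "pushes the button" at the end.
    plan.append(("PT_CONTINUE", signal))
    return plan


for step in plan_netbsd_resume({2, 3, 4, 5}, {2: "s", 4: "s", 5: "c"}):
    print(step)
```

Because the plan is plain data, a test can compare it against the expected
sequence without running a debuggee.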

>
>> AFAICT, most of the stop reasons
>> (breakpoint, watchpoint, single step, ...) are still linked to a
>> specific thread even in your process model. I think you could get to a
>> point where lldb is very useful even without getting these events
>> "correct".
>>
>
> I was thinking for example about this change (it's not following the
> real function name nor the prototype):
>
>   GetStoppedReason(Thread) -> GetStoppedReason(Process,Thread)
>
> The Linux code would easily route it to desired thread and (Net)BSD
> return immediately the requested data. The need to have these functions
> in NativeThread (enforced by the framework) is the only purpose I keep
> them there, while there is global stopped reason on NetBSD (per-process).

Ok, I think we can talk about tweaks like that once you have something
upstream. Right now it does not seem to me like that should pose a big
development obstacle.
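
For reference, the shape of the proposed tweak could look roughly like the
Python model below. All class and method names are hypothetical and do not
correspond to the real LLDB C++ interfaces; the point is only that the
query goes through the process object, which either routes it to a thread
or answers from process-wide state.

```python
class LinuxLikeProcess:
    """Stop reasons are stored per thread; the process routes the query."""

    def __init__(self):
        self.thread_reasons = {}

    def get_stop_reason(self, tid):
        return self.thread_reasons.get(tid)


class NetBSDLikeProcess:
    """One stop reason for the whole process; the thread id is ignored."""

    def __init__(self):
        self.process_reason = None

    def get_stop_reason(self, tid):
        return self.process_reason


linux = LinuxLikeProcess()
linux.thread_reasons[2] = "breakpoint"
netbsd = NetBSDLikeProcess()
netbsd.process_reason = "signal"
print(linux.get_stop_reason(2), linux.get_stop_reason(3))
print(netbsd.get_stop_reason(2), netbsd.get_stop_reason(3))
```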

> In my local code, I'm populating all threads within the tracee
> (NativeThread) with exactly the same stop reason - for the "whole
> process" case. I can see - on the client side - that it prints out the
> same message for each thread within the process as all of them captured
> a stop action.

Indeed, that can be a nuisance. The whole-process events are probably the
first thing we should look at after the port is operational. I think this
can be handled independently of the fancy resume actions we talked about
above, which, as Jim pointed out, would be very hard for users to comprehend
anyway.

> I'm evaluating it from the point of view of a tracee with 10.000 threads
> and getting efficient debugging experience. This is why I would ideally
> reduce NativeThread to a container that is a sorted, hashable box of
> integers (lwpid_t) and shut down stopped reason extension called for
> each stopped in debuggee.
>

I wouldn't worry too much about the performance of this part of the code.
If you get to the point where you debug a process with ten thousand
threads, I think you'll find that there are other things which are causing
performance problems.

On 16 March 2017 at 21:43, Kamil Rytarowski <n54 at gmx.com> wrote:

> On 16.03.2017 11:55, Pavel Labath wrote:
> > On 16 March 2017 at 00:43, Kamil Rytarowski <n54 at gmx.com> wrote:
> >>
> >> TODO:
> >>  - Fixing software breakpoints support,
>
> Fixed!
>
> 267->596 of succeeded tests out of 1200+ - please scroll for details.
>
> >>  - Special Registers (Floating Point..) reading/writing,
> >>  - Unless it will be too closely related to develop threads - Hardware
> >> watchpoints support in line with the Linux amd64 code,
> >>
> >>
> >> As of today the number of passing tests has degraded. This has been
> >> caused by the fact that LLDB endeavors to set breakpoints in every
> >> process (on the "_start" symbol) - this blocks tracing or simulating
> >> tracing of any process.
> > This is necessary so that we can read the list of shared libraries loaded
> > by the process and set any breakpoints in them. Note that currently
> > (at least on Linux) we are doing it actually too late -- at this point
> > the constructors in the shared libraries have already executed, so we
> > cannot set breakpoints or debug the initialization code. I haven't yet
> > investigated how to fix this.
> >
>
> I see.
>
> It's an interesting use case; right now I'm not sure how to address it
> properly.
>
> Thank you for your insight.
>
> > We will need to discuss this in detail. I am not sure removing the
> > NativeThreadNetBSD class completely is a worthwhile goal, but we
> > can certainly work towards making its parent class dumber, and remove
> > operations that don't make sense for all users. If e.g. your
> > watchpoints are per-process, then we can pipe watchpoint setting code
> > through NativeProcessProtocol, and NativeProcessNetBSD will implement
> > that directly, while the linux version will delegate to the thread.
> > However, even in your process model each thread has a separate set of
> > registers, so I think it makes sense to keep the register manipulation
> > code there.
> >
>
> I put all the threading potential challenges, each one will need to be
> discussed. Refactoring is by definition cost and should be reduced to
> minimum, while getting proper support on the platform. I think
>
> Our watchpoints (debug registers) are per-thread (LWP) only.
>
> >>  - Support in the current thread function "0" (or "-1" according to the
> >> GDB Remote protocol) to mark that the whole process was interrupted/no
> >> primary thread (from a tracer point of view)
> > Teaching all parts of the debugger (server is not enough, I think you
> > would have to make a lot of client changes as well) about
> > whole-process events might be a big task.
>
> I think long term this might be useful. I noted in the GDB Remote
> specification that this protocol is embeddable into simulators and
> low-level kernel APIs without regular threads, however it's not urgently
> needed to get aboard for standard user-level debugging facilities; while
> it will be useful in the general set of capabilities in future.
>
> > I am wondering whether you
> > wouldn't make more progress if you just fudged this and always
> > attributed these events to the primary thread. I think we would be in
> > a better position to design this properly once most of the debugger
> > functionality was operational for you.
>
> Agreed.
>
> This is why the initial goal of mine is to get as far as possible
> without touching the generic subsystems and get basic threading support.
>
> > What kind of per-process events
> > are we talking about here?
>
> I'm mostly thinking about ResumeActions - to resume the whole process,
> while being able to single-step desired thread(s).
>
> (We also offer PT_SYSCALL feature, but it's not needed right now in LLDB).
>
> > Is there anything more here than a signal
> > directed at the whole process?
>
> single-stepping
> resume thread
> suspend thread
>
> I'm evaluating FreeBSD-like API PT_SETSTEP/PT_CLEARSTEP for NetBSD. It
> marks a thread for single-stepping. This code is needed to allow us to
> combine PT_SYSCALL & PT_STEP and PT_STEP & emit signal.
>
> I was thinking about ResumeActions marking which thread to
> resume/suspend/singlestep, whether to emit a signal (one per global
> PT_CONTINUE[/PT_SYSCALL]) and whether to resume the whole thread.
>
> To some certain point it might be kludged with single-thread model for
> basic debugging.
>
>
> I imagined a possible flow of ResumeAction calls like:
> [Generic/Native framework knows upfront the image of threads within
> debuggee]
>  - Resume Thread 2 (PT_RESUME)
>  - Suspend Thread 3 (PT_SUSPEND)
>  - Set single-step Thread 2 (PT_SETSTEP)
>  - Set single-step Thread 4 (PT_SETSTEP)
>  - Clear single-step Thread 5 (PT_CLEARSTEP)
>  - Resume & emit signal SIGIO (PT_CONTINUE)
>
> In other words: setting properties on threads and pushing the
> PT_CONTINUE button at the end.
>
> > AFAICT, most of the stop reasons
> > (breakpoint, watchpoint, single step, ...) are still linked to a
> > specific thread even in your process model. I think you could get to a
> > point where lldb is very useful even without getting these events
> > "correct".
> >
>
> I was thinking for example about this change (it's not following the
> real function name nor the prototype):
>
>   GetStoppedReason(Thread) -> GetStoppedReason(Process,Thread)
>
> The Linux code would easily route it to desired thread and (Net)BSD
> return immediately the requested data. The need to have these functions
> in NativeThread (enforced by the framework) is the only purpose I keep
> them there, while there is global stopped reason on NetBSD (per-process).
>
> > cheers,
> > pl
> >
>
> Thank you for your response.
>
> Last but not least, after getting software breakpoints to work, the
> obligatory Test Summary diff between:
>
> http://netbsd.org/~kamil/lldb/check-lldb-r296360-2017-02-28.txt
>
> and
>
> http://netbsd.org/~kamil/lldb/check-lldb-r296360-2017-03-16.txt
> (pkgsrc-wip/lldb-netbsd git rev. 2c9c8e7b56d)
>
>  ===================
>  Test Result Summary
>  ===================
> -Test Methods:       1235
> -Reruns:                1
> -Success:             267
> +Test Methods:       1240
> +Reruns:                0
> +Success:             596
>  Expected Failure:     21
> -Failure:             332
> -Error:               167
> +Failure:              86
> +Error:                91
>  Exceptional Exit:      0
>  Unexpected Success:    1
>  Skip:                444
> -Timeout:               3
> +Timeout:               1
>  Expected Timeout:      0
>
>