<div dir="ltr"><br><br>On 16 March 2017 at 21:43, Kamil Rytarowski <<a href="mailto:n54@gmx.com">n54@gmx.com</a>> wrote:<br>> On 16.03.2017 11:55, Pavel Labath wrote:<br>>> What kind of per-process events<br>>> are we talking about here?<br>><br>> I'm mostly thinking about ResumeActions - to resume the whole process,<br>> while being able single-stepping desired thread(s).<br>><br>> (We also offer PT_SYSCALL feature, but it's not needed right now in LLDB).<br>><br>>> Is there anything more here than a signal<br>>> directed at the whole process?<br>><br>> single-stepping<br>> resume thread<br>> suspend thread<br>><br>> I'm evaluating FreeBSD-like API PT_SETSTEP/PT_CLEARSTEP for NetBSD. It<br>> marks a thread for single-stepping. This code is needed to allow us to<br>> combine PT_SYSCALL & PT_STEP and PT_STEP & emit signal.<br>><br>> I was thinking about ResumeActions marking which thread to<br>> resume/suspend/singlestep, whether to emit a signal (one per global<br>> PT_CONTINUE[/PT_SYSCALL]) and whether to resume the whole thread.<br>><br>> To some certain point it might be kludged with single-thread model for<br>> basic debugging.<br>><br>><br>> I imagined a possible flow of ResumeAction calls like:<br>> [Generic/Native framework knows upfront the image of threads within<br>> debuggee]<br>>  - Resume Thread 2 (PT_RESUME)<br>>  - Suspend Thread 3 (PT_SUSPEND)<br>>  - Set single-step Thread 2 (PT_SETSTEP)<br>>  - Set single-step Thread 4 (PT_SETSTEP)<br>>  - Clear single-step Thread 5 (PT_CLEARSTEP)<br>>  - Resume & emit signal SIGIO (PT_CONTINUE)<br>><br>> In other words: setting properties on threads and pushing the<br>> PT_CONTINUE button at the end.<br><br>None of this is really NetBSD-specific, except the whole-process signal at the end (which I am going to ignore for now). I mean, the implementation of it is different, but there is no reason why someone would not want to perform the same set of actions on Linux for instance. 
I think most of the work here should be done on the client. Then, when the user issues the final "continue", the client sends something like $vCont;s:2;s:4;c:5. Then it's up to the server to figure out how to execute these actions. On NetBSD it would execute the operations you mention above, while on Linux it would do something like ptrace(PTRACE_SINGLESTEP, 2); ptrace(PTRACE_SINGLESTEP, 4); ptrace(PTRACE_CONT, 5); (the Linux lldb-server already supports this, although you may have a hard time convincing the client to send a packet like that).<br><br>So I don't believe there will be any sweeping changes necessary to support this in the future. If I understand it correctly, you are working on the server now. All you need to do there is to make sure you translate the set of actions in the packet to the proper sequence of ptrace calls. You can even write lldb-server-style tests for that. Then, we can discuss what would be the best user-level interface to specify complex actions like this, and teach the client to send these packets.<br><br>><br>>> AFAICT, most of the stop reasons<br>>> (breakpoint, watchpoint, single step, ...) are still linked to a<br>>> specific thread even in your process model. I think you could get to a<br>>> point where lldb is very useful even without getting these events<br>>> "correct".<br>>><br>><br>> I was thinking for example about this change (it's not following the<br>> real function name nor the prototype):<br>><br>>   GetStoppedReason(Thread) -> GetStoppedReason(Process,Thread)<br>><br>> The Linux code would easily route it to desired thread and (Net)BSD<br>> return immediately the requested data. The need to have these functions<br>> in NativeThread (enforced by the framework) is the only purpose I keep<br>> them there, while there is global stopped reason on NetBSD (per-process).<br><br>Ok, I think we can talk about tweaks like that once you have something upstream. 
Right now it does not seem to me like that should pose a big development obstacle.<br><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>In my local code, I'm populating all threads within the tracee<br>(NativeThread) with exactly the same stop reason - for the "whole<br>process" case. I can see - on the client side - that it prints out the<br>same message for each thread within the process as all of them captured<br>a stop action.</blockquote><div><br></div><div>Indeed, that can be a nuisance. Whole-process events are probably the first thing we should look at after the port is operational. I think this can be handled independently of the fancy resume actions we talked about above, which, as Jim pointed out, would be very hard for users to comprehend anyway.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-size:12.8px">I'm evaluating it from the point of view of a tracee with 10.000 threads<br></span><span style="font-size:12.8px">and getting efficient debugging experience. This is why I would ideally<br></span><span style="font-size:12.8px">reduce NativeThread to a container that is sorted, hashable box of<br></span><span style="font-size:12.8px">integers (lwpid_t) and shut down stopped reason extension called for<br></span><span style="font-size:12.8px">each stopped in debuggee.<br></span> </blockquote><div><br></div><div>I wouldn't worry too much about the performance of this part of the code. 
If you get to the point where you debug a process with ten thousand threads, I think you'll find that there are other things which are causing performance problems.</div><div> </div></div><div class="gmail_extra"><br><div class="gmail_quote">On 16 March 2017 at 21:43, Kamil Rytarowski <span dir="ltr"><<a href="mailto:n54@gmx.com" target="_blank">n54@gmx.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 16.03.2017 11:55, Pavel Labath wrote:<br>

> On 16 March 2017 at 00:43, Kamil Rytarowski <<a href="mailto:n54@gmx.com">n54@gmx.com</a>> wrote:<br>

>><br>

</span><span class="">>> TODO:<br>

>>  - Fixing software breakpoints support,<br>

<br>

</span>Fixed!<br>

<br>

The number of passing tests went from 267 to 596 out of 1200+ - please scroll for details.<br>

<span class=""><br>

>>  - Special Registers (Floating Point..) reading/writing,<br>

>>  - Unless it will be too closely related to develop threads - Hardware<br>

>> watchpoints support in line with the Linux amd64 code,<br>

>><br>

>><br>

>> As of today the number of passing tests has been degraded. This has been<br>

>> caused by the fact that LLDB endeavors to set breakpoints in every<br>

>> process (on the "_start" symbol) - this blocks tracing or simulating<br>

>> tracing of any process.<br>

> This is necessary so that we can read the list of shared libraries loaded<br>

> by the process and set any breakpoints in them. Note that currently<br>

> (at least on Linux) we are doing it actually too late -- at this point<br>

> the constructors in the shared libraries have already executed, so we<br>

> cannot set breakpoints or debug the initialization code. I haven't yet<br>

> investigated how to fix this.<br>

><br>

<br>

</span>I see.<br>

<br>

It's an interesting use case; right now I'm not sure how to properly address it.<br>

<br>

Thank you for your insight.<br>

<span class=""><br>

> We will need to discuss this in detail. I am not sure removing the<br>

> NativeThreadNetBSD class completely is a worthwhile goal, but we<br>

> can certainly work towards making its parent class dumber, and remove<br>

> operations that don't make sense for all users. If e.g. your<br>

> watchpoints are per-process, then we can pipe watchpoint setting code<br>

> through NativeProcessProtocol, and NativeProcessNetBSD will implement<br>

> that directly, while the linux version will delegate to the thread.<br>

> However, even in your process model each thread has a separate set of<br>

> registers, so I think it makes sense to keep the register manipulation<br>

> code there.<br>

><br>

<br>

</span>I listed all the potential threading challenges; each one will need to be<br>

discussed. Refactoring is by definition a cost and should be reduced to a<br>

minimum, while getting proper support on the platform. I think<br>

<br>

Our watchpoints (debug registers) are per-thread (LWP) only.<br>

<span class=""><br>

>>  - Support in the current thread function "0" (or "-1" according to the<br>

>> GDB Remote protocol) to mark that the whole process was interrupted/no<br>

>> primary thread (from a tracer point of view)<br>

> Teaching all parts of the debugger (server is not enough, I think you<br>

> would have to make a lot of client changes as well) about<br>

> whole-process events might be a big task.<br>

<br>

</span>I think long term this might be useful. I noted in the GDB Remote<br>

specification that this protocol is embeddable into simulators and<br>

low-level kernel APIs without regular threads, however it's not urgently<br>

needed to get aboard for standard user-level debugging facilities; while<br>

it will be useful in the general set of capabilities in future.<br>

<span class=""><br>

> I am wondering whether you<br>

> wouldn't make more progress if you just fudged this and always<br>

> attributed these events to the primary thread. I think we would be in<br>

> a better position to design this properly once most of the debugger<br>

> functionality was operational for you.<br>

<br>

</span>Agreed.<br>

<br>

This is why the initial goal of mine is to get as far as possible<br>

without touching the generic subsystems and get basic threading support.<br>

<span class=""><br>

> What kind of per-process events<br>

> are we talking about here?<br>

<br>

</span>I'm mostly thinking about ResumeActions - to resume the whole process,<br>

while being able single-stepping desired thread(s).<br>

<br>

(We also offer PT_SYSCALL feature, but it's not needed right now in LLDB).<br>

<span class=""><br>

> Is there anything more here than a signal<br>

> directed at the whole process?<br>

<br>

</span>single-stepping<br>

resume thread<br>

suspend thread<br>

<br>

I'm evaluating FreeBSD-like API PT_SETSTEP/PT_CLEARSTEP for NetBSD. It<br>

marks a thread for single-stepping. This code is needed to allow us to<br>

combine PT_SYSCALL & PT_STEP and PT_STEP & emit signal.<br>

<br>

I was thinking about ResumeActions marking which thread to<br>

resume/suspend/singlestep, whether to emit a signal (one per global<br>

PT_CONTINUE[/PT_SYSCALL]) and whether to resume the whole thread.<br>

<br>

To some certain point it might be kludged with single-thread model for<br>

basic debugging.<br>

<br>

<br>

I imagined a possible flow of ResumeAction calls like:<br>

[Generic/Native framework knows upfront the image of threads within<br>

debuggee]<br>

 - Resume Thread 2 (PT_RESUME)<br>

 - Suspend Thread 3 (PT_SUSPEND)<br>

 - Set single-step Thread 2 (PT_SETSTEP)<br>

 - Set single-step Thread 4 (PT_SETSTEP)<br>

 - Clear single-step Thread 5 (PT_CLEARSTEP)<br>

 - Resume & emit signal SIGIO (PT_CONTINUE)<br>

<br>

In other words: setting properties on threads and pushing the<br>

PT_CONTINUE button at the end.<br>

<span class=""><br>

> AFAICT, most of the stop reasons<br>

> (breakpoint, watchpoint, single step, ...) are still linked to a<br>

> specific thread even in your process model. I think you could get to a<br>

> point where lldb is very useful even without getting these events<br>

> "correct".<br>

><br>

<br>

</span>I was thinking for example about this change (it's not following the<br>

real function name nor the prototype):<br>

<br>

  GetStoppedReason(Thread) -> GetStoppedReason(Process,<wbr>Thread)<br>

<br>

The Linux code would easily route it to desired thread and (Net)BSD<br>

return immediately the requested data. The need to have these functions<br>

in NativeThread (enforced by the framework) is the only purpose I keep<br>

them there, while there is global stopped reason on NetBSD (per-process).<br>

<br>

> cheers,<br>

> pl<br>

><br>

<br>

Thank you for your response.<br>

<br>

Last but not least, after getting software breakpoints to work, the<br>

obligatory Test Summary diff between:<br>

<br>

<a href="http://netbsd.org/~kamil/lldb/check-lldb-r296360-2017-02-28.txt" rel="noreferrer" target="_blank">http://netbsd.org/~kamil/lldb/<wbr>check-lldb-r296360-2017-02-28.<wbr>txt</a><br>

<br>

and<br>

<br>

<a href="http://netbsd.org/~kamil/lldb/check-lldb-r296360-2017-03-16.txt" rel="noreferrer" target="_blank">http://netbsd.org/~kamil/lldb/<wbr>check-lldb-r296360-2017-03-16.<wbr>txt</a><br>

(pkgsrc-wip/lldb-netbsd git rev. 2c9c8e7b56d)<br>

<span class=""><br>

 ===================<br>

 Test Result Summary<br>

 ===================<br>

</span>-Test Methods:       1235<br>

-Reruns:                1<br>

-Success:             267<br>

+Test Methods:       1240<br>

+Reruns:                0<br>

+Success:             596<br>

 Expected Failure:     21<br>

-Failure:             332<br>

-Error:               167<br>

+Failure:              86<br>

+Error:                91<br>

<span class=""> Exceptional Exit:      0<br>

 Unexpected Success:    1<br>

 Skip:                444<br>

</span>-Timeout:               3<br>

+Timeout:               1<br>

 Expected Timeout:      0<br>

<br>

</blockquote></div><br></div>
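<div>The reply above suggests that the server only has to translate the actions in a packet like $vCont;s:2;s:4;c:5 into the proper sequence of ptrace calls. A minimal sketch of the parsing half of that step follows. It is not lldb-server code: the names ResumeAction and parse_vcont are hypothetical, and the parser assumes a simplified vCont form (actions c, s, t, and C/S with a two-digit hex signal; plain hex thread ids; no multiprocess "p" syntax, no "r" range stepping, and no packet checksum).</div>

```python
from typing import List, NamedTuple, Optional


class ResumeAction(NamedTuple):
    """One per-thread action from a vCont packet (hypothetical helper type)."""
    kind: str               # "c" = continue, "s" = single-step, "t" = stop
    signal: Optional[int]   # signal to deliver for C/S actions, otherwise None
    tid: Optional[int]      # None = default action for threads not named elsewhere


def parse_vcont(packet: str) -> List[ResumeAction]:
    """Parse a simplified vCont packet such as '$vCont;s:2;s:4;c:5'."""
    if packet.startswith("$"):
        packet = packet[1:]
    head, *fields = packet.split(";")
    if head != "vCont" or not fields:
        raise ValueError(f"not a vCont packet with actions: {packet!r}")

    actions = []
    for field in fields:
        action, sep, tid_text = field.partition(":")
        # Thread ids are hexadecimal in the GDB remote protocol.
        tid = int(tid_text, 16) if sep else None
        if action in ("c", "s", "t"):
            actions.append(ResumeAction(action, None, tid))
        elif action[:1] in ("C", "S") and len(action) == 3:
            # "C1e" means "continue and deliver signal 0x1e".
            actions.append(ResumeAction(action[0].lower(), int(action[1:], 16), tid))
        else:
            raise ValueError(f"unsupported action: {field!r}")
    return actions


if __name__ == "__main__":
    for act in parse_vcont("$vCont;s:2;s:4;c:5"):
        print(act)
```

<div>A platform backend could then walk the resulting list and issue its own requests per entry, for example PT_SETSTEP/PT_RESUME followed by one PT_CONTINUE on NetBSD, or one PTRACE_SINGLESTEP/PTRACE_CONT per thread on Linux, as described in the thread.</div>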