[Lldb-commits] [PATCH] D75711: [NFC] Have ThreadPlans hold onto the Process & TID, rather than the Thread

Jim Ingham via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Fri Mar 6 11:35:45 PST 2020


jingham added a comment.

In D75711#1909055 <https://reviews.llvm.org/D75711#1909055>, @labath wrote:

> I am somewhat worried about the direction this is taking. Maybe vanishing threads are fine for os-level debugging (because there the "real" threads are the cpu cores, and the os threads are somewhat imaginary), but I definitely wouldn't want to do it for regular debugging. If my process has 1000 threads (not completely uncommon) then I don't think my debugger would be doing me a service showing me just the three it thinks are interesting and completely disregarding the others. Having a mode (it could even be default) which only highlights the interesting ones -- yes, that would definitely be useful. Making sure it doesn't touch the other threads when it doesn't need to -- absolutely. But I don't think it should pretend the other threads don't exist at all, because I may still have a reason to inspect those threads (or just to know that they're there). In fact, I wouldn't be surprised if this happened to the said kernel folks too, and they end up adding a custom command, which would enumerate all "threads".


First off, I also think that hiding some threads from users, or forcing people to say "thread list --force" or something like that to see them all, would not be a user-friendly design.  I'm not suggesting imposing that on general debugging.  But if an OS plugin, for instance, decides it is too expensive to enumerate all of its threads except as an explicit gesture, we need to support that decision.

But lldb stops many times during the course of a step where it doesn't return control to the user - all the stepping into code with no debug info and back out again, stepping across dynamic linker stubs, or stepping through the ObjC message dispatch dances we do involves many private stops per public stop...  If a thread has not stopped "for a reason" at one of those stops, it takes no part in the "should stop" and "will resume" negotiations, and reconstructing it for that stop really does serve no purpose.  There's no way that the user will get a chance to see it.
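
To make the "takes no part in the negotiation" point concrete, here's a rough, self-contained sketch of the filtering idea.  The types and names are simplified stand-ins of my own, not lldb's actual should-stop machinery:

    // Illustrative sketch only: MockThread stands in for lldb's Thread.
    #include <cstdint>
    #include <vector>

    struct MockThread {
      uint64_t tid;
      bool stopped_for_a_reason; // breakpoint hit, signal, plan completion, ...
      bool ShouldStop() { return true; } // stand-in for running the plan stack
    };

    // At a private stop, only threads that stopped "for a reason" get a vote
    // in the should-stop negotiation; the rest never need to be realized.
    bool AnyThreadWantsToStop(std::vector<MockThread> &threads) {
      bool should_stop = false;
      for (auto &thread : threads) {
        if (!thread.stopped_for_a_reason)
          continue; // no stop reason => no vote, no need to reconstruct it
        should_stop |= thread.ShouldStop();
      }
      return should_stop;
    }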

So if we could avoid reading them in, that might make stepping go faster in the presence of lots of threads.  BTW, with the accelerated stop packets (lldb's jThreadsInfo, for example) where we get all the threads and their stop reasons on stop, I'm not sure this would really save all that much time.  But when talking to less capable stubs, it really can make a difference.  Greg did a bunch of work to make stepping stop as few times as possible because the Facebook gdb-remote stub didn't support these accelerated packets, and getting all the thread info on each private stop was making stepping go really slowly.  I didn't understand the motivation at first, because for debugserver the changes didn't really make any difference.

Also, I don't think these two desires - not fetching threads when we don't need them (e.g. at private stops) and showing them all to the user whenever they ask - are contradictory.  After all, we still have the gate of UpdateThreadListIfNeeded, and all the commands that touch the thread list go through it.  So even if we don't gather all the threads when we stop, we already have a mechanism in place whereby any command that actually touches the thread list can fetch them on demand.
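
Here's a minimal sketch of that gate, with simplified stand-ins of my own for the process and thread list (in lldb proper the gate is Process::UpdateThreadListIfNeeded(), reached from Process::GetThreadList()):

    #include <cstdint>
    #include <vector>

    struct MockThreadList {
      std::vector<uint64_t> tids;
    };

    class MockProcess {
    public:
      // Anything that walks the thread list goes through this accessor, so
      // the expensive fetch happens at most once per stop, and only if asked.
      MockThreadList &GetThreadList() {
        UpdateThreadListIfNeeded();
        return m_threads;
      }

    private:
      void UpdateThreadListIfNeeded() {
        if (m_thread_list_stop_id == m_stop_id)
          return;                      // already current for this stop
        m_threads = FetchAllThreads(); // the expensive part: stub or OS plugin
        m_thread_list_stop_id = m_stop_id;
      }
      MockThreadList FetchAllThreads() { return {}; } // stand-in for the fetch
      MockThreadList m_threads;
      uint32_t m_stop_id = 1;
      uint32_t m_thread_list_stop_id = 0;
    };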

> What was actually the slow part here? Was it the os plugin reading the thread lists from the kernel memory? Or the fact that lldb is slow when it is confronted with them?
> 
> If it is the latter, then maybe we should fix that. For example, I think I remember lldb likes to check the PC of every thread before it does any kind of a resume -- I am not convinced that is really needed.
> 
> If fetching the threads is slow, then perhaps we could have an api, where the os threads are produced incrementally, and we ensure that lldb does not query for os threads needlessly during the hot paths (e.g. internal stops for servicing thread plans).

The thing that is slow is gathering all the threads.  On Darwin, there's a kernel activation per user space thread and then a whole bunch of kernel threads.  On a busy system, that ends up being a whole lot of threads.  Producing them through the OS plugin interface involves a bunch of memory reads to fetch the thread data, and following pointers around to get the register state for the threads, etc.

Note also that, the way the OS Plugins currently work, when the process stops the OS Plugin is only asked to produce the threads backing the "on core" threads.  That's all UpdateThreadListIfNeeded will fetch, and there's no lldb-level way to get the OS Plugin to fetch all the threads it can produce.  The only way to do that with the Darwin kernel is to run one of the Python commands provided by the kernel team, which either fetch a non-current thread by ID or force fetching all the threads.
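
To show what I mean by "that's all it will fetch", here is a simplified stand-in of my own for the plugin-side update (the real interface hands the plugin the old and the "real" thread lists and asks for a new one, but the shape of the point is the same):

    #include <cstdint>
    #include <vector>

    struct CoreThread {
      uint64_t tid;
    };

    struct MemoryThread {
      uint64_t tid;
      uint64_t thread_struct_addr; // where the plugin found the thread's state
    };

    // The plugin is handed the "on core" threads and only produces memory
    // threads backing those; nothing in the normal stop path asks it for
    // every thread it *could* produce.
    std::vector<MemoryThread>
    UpdateThreadList(const std::vector<CoreThread> &on_core) {
      std::vector<MemoryThread> result;
      for (const auto &core : on_core) {
        // In a real plugin this is where the memory reads happen: fetch the
        // kernel's thread struct, chase pointers for register state, etc.
        result.push_back({core.tid, /*thread_struct_addr=*/0});
      }
      return result; // threads not on a core are never materialized here
    }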

So the current state when working with OS Plugins violates your principle given above, but I think it fits the kernel folks' workflow, since the actual number of threads in the kernel is overwhelming and not terribly useful.  But you are right: if we wanted to delay getting threads on private stops, but make UpdateThreadListIfNeeded force a full fetch, then we would have to add some API to allow us to fetch only the "on core" threads along the code path that handles internal stops.

In any case, in order to support having internal stops not fetch all threads - which we are currently doing with OS Plugins - we need to be able to persist the stateful data lldb holds for a thread (most importantly, its ThreadPlans) across stops where we don't realize the Thread object.  That is the sole ambition of this set of patches.
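
For anyone skimming, the pattern is roughly the following - simplified stand-ins of my own, not the patch itself, just the idea of holding a process and a TID and resolving the Thread lazily:

    #include <cstdint>
    #include <unordered_map>

    struct MockThread {
      uint64_t tid;
    };

    struct MockProcess {
      // Stand-in for looking a thread up by ID in the process's thread list.
      MockThread *FindThreadByID(uint64_t tid) {
        auto pos = threads.find(tid);
        return pos == threads.end() ? nullptr : &pos->second;
      }
      std::unordered_map<uint64_t, MockThread> threads;
    };

    class MockThreadPlan {
    public:
      MockThreadPlan(MockProcess &process, uint64_t tid)
          : m_process(process), m_tid(tid) {}

      // The Thread is looked up only when the plan actually needs it, so the
      // plan's state survives stops where the Thread object was never
      // realized, and nothing dangles when Thread objects go away.
      MockThread *GetThread() { return m_process.FindThreadByID(m_tid); }

    private:
      MockProcess &m_process; // held instead of a Thread reference
      uint64_t m_tid;
    };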


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75711/new/

https://reviews.llvm.org/D75711




