[Lldb-commits] [PATCH] D122975: parallelize module loading in DynamicLoaderPOSIXDYLD()

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Mon Apr 4 09:02:44 PDT 2022


labath requested changes to this revision.
labath added a comment.
This revision now requires changes to proceed.

This is an interesting area for optimisation, but I don't think it's going to be that easy. A couple of years ago, I experimented with a similar approach, but I could not come up with a satisfactory solution in a reasonable amount of time. This is how the `PrefetchModuleSpecs` call came into being. I'm not entirely proud of it, but it was sufficient for my use case, and it is single threaded (which is always nice).
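To make the comparison concrete, this is roughly the shape that the prefetch pattern takes today in `DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` -- just a sketch, and the `Process::PrefetchModuleSpecs` signature (a batch of FileSpecs plus a triple) is from memory, so treat it as an assumption rather than gospel:

```
// Sketch of the existing single-threaded prefetch pattern. The
// Process::PrefetchModuleSpecs signature (FileSpec batch + triple) is
// assumed; the helper name is made up for illustration.
#include <vector>

#include "lldb/Target/Process.h"
#include "lldb/Target/Target.h"
#include "lldb/Utility/FileSpec.h"

using namespace lldb_private;

static void PrefetchRendezvousModules(Process &process,
                                      const std::vector<FileSpec> &files) {
  // One batched query (for gdb-remote this can be a single jModulesInfo
  // round trip) warms the module spec cache, so the later per-module
  // Target::GetOrCreateModule calls stay cheap without spawning any threads.
  process.PrefetchModuleSpecs(
      files, process.GetTarget().GetArchitecture().GetTriple());
}
```

The point is that the expensive part (talking to the remote platform) gets batched, while the target-mutating part stays serial.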

To move forward with this we'd basically need to resolve all of the issues that you mention in the patch description:

- threadpool-within-threadpool is a real problem. We shouldn't have n^2 threads running in the system: that will cause thrashing, possible OOMs, and just make it generally hard to figure out what's going on. I'm not sure what you meant by the semaphore idea, but I don't think it would be sufficient to grab a semaphore in some thread before starting work -- that would still leave us with n^2 threads, most of them blocked. We should prevent that many threads from being spawned in the first place (see the first sketch after this list).
- the thread safety claim is extremely bold. There's a big difference between running something in one thread (it doesn't matter that it's not the application's main thread -- this is running on a thread which is dedicated to serving one particular process) and spawning a bunch of threads to do something in parallel. It took us several years to fix all the corner cases relating to the DWARF index pooling, and that code was much more self-contained than this. The `Target::GetOrCreateModule` function (the core of this functionality) is pretty complex, consisting of multiple levels of retries and lookups, and I don't think it would produce reasonable results if it was executed concurrently. I believe that in the normal case (where all of the modules are distinct) it would do the right thing, but I'm pretty sure it could do something strange and unexpected if the list contained (e.g.) the same module twice. The obvious fix (adding a lock to guard the manipulation of the target's module list) would most likely defeat the main purpose of this patch. So, I think we would need to implement this function differently, or at least provide some proof that the current implementation is correct (the second sketch after this list shows the kind of split I have in mind).
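For the first bullet, the direction I would look at is a single shared pool instead of a pool per call. A minimal sketch, assuming we funnel everything through one `llvm::ThreadPool` (`GetSharedThreadPool` and `LoadOneModule` are made-up names):

```
// Sketch: one bounded, shared llvm::ThreadPool for all parallel work, so the
// worker-thread count stays fixed no matter how many callers there are.
// GetSharedThreadPool and LoadOneModule are hypothetical names.
#include <functional>
#include <future>
#include <string>
#include <vector>

#include "llvm/Support/ThreadPool.h"
#include "llvm/Support/Threading.h"

static llvm::ThreadPool &GetSharedThreadPool() {
  // Sized to the hardware once; every parallel consumer shares these workers.
  static llvm::ThreadPool Pool(llvm::hardware_concurrency());
  return Pool;
}

void LoadModulesInParallel(
    const std::vector<std::string> &Paths,
    const std::function<void(const std::string &)> &LoadOneModule) {
  llvm::ThreadPool &Pool = GetSharedThreadPool();
  std::vector<std::shared_future<void>> Futures;
  Futures.reserve(Paths.size());
  for (const std::string &Path : Paths)
    Futures.push_back(Pool.async([&LoadOneModule, Path] { LoadOneModule(Path); }));
  // Wait only on our own tasks; Pool.wait() would also block on unrelated
  // work queued by other users of the shared pool.
  for (auto &F : Futures)
    F.wait();
}
```

Note that even a bounded shared pool can deadlock if a task blocks waiting on other tasks queued in the same pool, which is why the nesting itself has to go rather than merely being throttled.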
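For the second bullet, if we do end up going parallel, I'd want the work split so that only side-effect-free work runs off-thread, with all manipulation of the target's module list staying on one thread and the input de-duplicated up front. A rough sketch of that split (all names here are hypothetical, and I'm using plain std::async only to keep the sketch short -- in reality this would go through the shared pool above):

```
// Sketch of a "parallel compute, serial commit" split. ModuleStub,
// CreateModuleNoSideEffects and AppendToTargetImages are stand-ins, not
// existing LLDB API.
#include <functional>
#include <future>
#include <memory>
#include <set>
#include <string>
#include <vector>

struct ModuleStub {}; // stand-in for lldb_private::Module
using ModuleSP = std::shared_ptr<ModuleStub>;

static ModuleSP CreateModuleNoSideEffects(const std::string &Path) {
  // Parallel-safe part: locate the file, parse the object file, compute the
  // UUID. Nothing in here is allowed to touch the Target.
  (void)Path;
  return std::make_shared<ModuleStub>();
}

void LoadAll(const std::vector<std::string> &Paths,
             const std::function<void(ModuleSP)> &AppendToTargetImages) {
  // De-duplicate first; feeding the same module into a concurrent
  // GetOrCreateModule twice is exactly the case that is hard to reason about.
  std::set<std::string> Unique(Paths.begin(), Paths.end());

  std::vector<std::future<ModuleSP>> Futures;
  for (const std::string &Path : Unique)
    Futures.push_back(
        std::async(std::launch::async, CreateModuleNoSideEffects, Path));

  // Serial part: the target's module list is only ever touched from this
  // thread, so Target::GetOrCreateModule itself needs no new locking.
  for (auto &F : Futures)
    AppendToTargetImages(F.get());
}
```

That way the parallelism buys us the expensive file/symbol work without having to prove anything new about the concurrency behaviour of `Target::GetOrCreateModule`.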

But before you go and try to do all of that (if this wasn't enough to discourage you :P), I'd like to understand precisely what it is that you're trying to optimise. If there is a single hot piece of code that we want to speed up, then we might be able to come up with a different approach (à la `PrefetchModuleSpecs`) to achieve the same thing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122975/new/

https://reviews.llvm.org/D122975


