[lldb-dev] Parallelizing loading of shared libraries

Scott Smith via lldb-dev lldb-dev at lists.llvm.org
Fri Apr 28 08:04:54 PDT 2017


Hmmm, OK, I don't like hard-coding pools.  Your idea about limiting the
number of high-level threads gave me an idea (rough sketch after the
list):

1. The system has one high-level TaskPool.
2. TaskPools have up to one child and one parent (the high-level
TaskPool's parent is nullptr).
3. When a worker starts up for a given TaskPool, it ensures a single
child exists.
4. A thread-local variable indicates which TaskPool a thread enqueues
into (via AddTask).  If that variable is nullptr, it means the
high-level TaskPool; threads that are not workers enqueue there.  If the
thread is a worker thread, the variable points to the worker's child.
5. When creating a thread in a TaskPool, its thread count AND the thread
counts of the parent, grandparent, etc. are incremented.
6. In the main worker loop, if there is no more work to do, OR the
thread count is too high, the worker "promotes" itself.  Promotion means:
a. decrement the thread count for the current task pool
b. if there is no parent, exit; otherwise, become a worker for the
parent task pool (and update the thread-local TaskPool enqueue pointer).
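
To make that concrete, here is a rough C++ sketch of the worker loop and
the promotion step.  Everything in it is hypothetical (it borrows the
TaskPool/AddTask names only for familiarity, not from the existing lldb
code), and it cuts corners: workers are spawned per task, detached, and
there is no shutdown path.

#include <atomic>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

class TaskPool {
public:
  explicit TaskPool(TaskPool *Parent = nullptr) : Parent(Parent) {}

  // Point 4: the pool the current thread enqueues into.  nullptr means
  // "the high-level pool"; workers point it at their pool's child.
  static thread_local TaskPool *EnqueuePool;

  static void AddToCurrentPool(TaskPool &HighLevel,
                               std::function<void()> Task) {
    (EnqueuePool ? EnqueuePool : &HighLevel)->AddTask(std::move(Task));
  }

  void AddTask(std::function<void()> Task) {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      Tasks.push_back(std::move(Task));
    }
    if (ThreadCount < MaxThreads) {
      // Point 5: a new worker counts against this pool and every
      // ancestor, so parents can see total downstream thread usage.
      for (TaskPool *P = this; P; P = P->Parent)
        ++P->ThreadCount;
      std::thread(&TaskPool::Worker, this).detach();
    }
  }

private:
  void Worker() {
    TaskPool *Pool = this;
    for (;;) {
      {
        std::lock_guard<std::mutex> Lock(Pool->Mutex);
        // Point 3: make sure a child exists for downstream AddTasks.
        if (!Pool->Child)
          Pool->Child.reset(new TaskPool(Pool));
      }
      EnqueuePool = Pool->Child.get();

      std::function<void()> Task;
      {
        std::lock_guard<std::mutex> Lock(Pool->Mutex);
        if (!Pool->Tasks.empty() && Pool->ThreadCount <= Pool->MaxThreads) {
          Task = std::move(Pool->Tasks.front());
          Pool->Tasks.pop_front();
        }
      }
      if (Task) {
        Task();
        continue;
      }
      // Point 6: no work, or too many threads at this level.  Promote.
      --Pool->ThreadCount; // 6a: ancestors still count this thread
      if (!Pool->Parent)   // 6b: top-level workers just exit...
        return;
      Pool = Pool->Parent; // ...everyone else serves the parent next
    }
  }

  TaskPool *const Parent;
  std::unique_ptr<TaskPool> Child;
  std::atomic<unsigned> ThreadCount{0};
  const unsigned MaxThreads = std::thread::hardware_concurrency();
  std::mutex Mutex;
  std::deque<std::function<void()>> Tasks;
};

thread_local TaskPool *TaskPool::EnqueuePool = nullptr;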

The main points are:
1. We don't hard-code the number of task pools; the code automatically
uses the fewest task pools needed, regardless of how many places in the
code want one.
2. When the child task pools are busy, parent task pools reduce their
number of workers over time to reduce oversubscription.

You can fiddle with the number of allowed threads per level; for
example, if you take into account the height of the pool and the number
of child threads, you could allocate each level half as many threads as
the level below it, unless the level below isn't using all of its
threads.  Then the steady state would be 2 * cores rather than height *
cores.  I think that is probably overkill, though.
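
For concreteness, that budget as a tiny helper (hypothetical, and it
ignores the "unless the level below is idle" adjustment), with height
measured above the lowest pool:

#include <algorithm>
#include <thread>

// The lowest pool gets one thread per core; every level above it gets
// half as many threads as the level below.
unsigned MaxThreadsForHeight(unsigned HeightAboveLowest) {
  unsigned Cores = std::thread::hardware_concurrency();
  return std::max(1u, Cores >> HeightAboveLowest);
}

// Summed over all levels this is at most
// cores * (1 + 1/2 + 1/4 + ...) < 2 * cores,
// which is where "2 * cores rather than height * cores" comes from.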


On Fri, Apr 28, 2017 at 4:37 AM, Pavel Labath <labath at google.com> wrote:

> On 27 April 2017 at 00:12, Scott Smith via lldb-dev
> <lldb-dev at lists.llvm.org> wrote:
> > After dealing with a bunch of micro-optimizations, I'm back to
> > parallelizing loading of shared modules.  My naive approach was to
> > just create a new thread per shared library.  I have a feeling some
> > users may not like that; I think I read an email from someone who
> > has thousands of shared libraries.  That's a lot of threads :-)
> >
> > The problem is that loading a shared library can cause downstream
> > parallelization through TaskPool.  I can't then also have the
> > loading of a shared library itself go through TaskPool, as that
> > could cause a deadlock - if all the worker threads are waiting on
> > work that TaskPool needs to run on a worker thread... then nothing
> > will happen.
> >
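
A self-contained illustration of that deadlock, for the archives.  The
queue here is hypothetical, a stand-in rather than the real lldb
TaskPool, and a single worker plays the role of "all workers are busy";
the program never terminates, which is the point:

#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>

std::deque<std::function<void()>> Queue;
std::mutex M;
std::condition_variable CV;

void Enqueue(std::function<void()> F) {
  std::lock_guard<std::mutex> L(M);
  Queue.push_back(std::move(F));
  CV.notify_one();
}

int main() {
  std::thread Worker([] { // the pool's only worker thread
    for (;;) {
      std::unique_lock<std::mutex> L(M);
      CV.wait(L, [] { return !Queue.empty(); });
      auto F = std::move(Queue.front());
      Queue.pop_front();
      L.unlock();
      F();
    }
  });
  std::promise<void> Done;
  Enqueue([&] {                         // "load a shared library"
    Enqueue([&] { Done.set_value(); }); // its downstream subtask
    Done.get_future().wait();           // blocks the only worker, so
  });                                   // the subtask can never run
  Worker.join();                        // never returns
}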
> > Three possible solutions:
> >
> > 1. Remove the notion of a single global TaskPool, but instead have a
> > static pool at each callsite that wants it.  That way multiple paths
> > into the same code would share the same pool, but different places
> > in the code would have their own pool.
> >
> I looked at this option in the past and this was my preferred
> solution.  My suggestion would be to have two task pools.  One for
> low-level parallelism, which spawns
> std::thread::hardware_concurrency() threads, and another one for
> higher-level tasks, which can only spawn a smaller number of threads
> (the algorithm for the exact number TBD).  The high-level threads can
> access the low-level ones, but not the other way around, which
> guarantees progress.
>
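
Here's a sketch of that layering.  FixedTaskPool and both accessors are
invented names, the thread counts are placeholders, and the pool never
shuts down; the only point is the one-way rule that guarantees progress:

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class FixedTaskPool {
public:
  explicit FixedTaskPool(unsigned NumThreads) {
    for (unsigned I = 0; I != NumThreads; ++I)
      Workers.emplace_back([this] {
        for (;;) {
          std::unique_lock<std::mutex> L(M);
          CV.wait(L, [this] { return !Tasks.empty(); });
          auto T = std::move(Tasks.front());
          Tasks.pop_front();
          L.unlock();
          T();
        }
      });
  }
  void AddTask(std::function<void()> T) {
    std::lock_guard<std::mutex> L(M);
    Tasks.push_back(std::move(T));
    CV.notify_one();
  }

private:
  std::deque<std::function<void()>> Tasks;
  std::mutex M;
  std::condition_variable CV;
  std::vector<std::thread> Workers; // last: workers use the above
};

// One thread per core.  Tasks run here must never enqueue into (or
// wait on) either pool, so these workers always make progress.
FixedTaskPool &LowLevelPool() {
  static FixedTaskPool Pool(std::thread::hardware_concurrency());
  return Pool;
}

// Deliberately smaller (the exact count is the TBD part).  Tasks run
// here may hand work to LowLevelPool() and block on it, which is safe
// because the low-level workers can never block on pool work.
FixedTaskPool &HighLevelPool() {
  static FixedTaskPool Pool(4); // placeholder count
  return Pool;
}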
> I propose to hardcode 2 pools, as I don't want to make it easy for
> people to create additional ones -- I think we should be having this
> discussion every time someone tries to add one, and have a very good
> justification for it (FWIW, I think your justification is good in this
> case, and I am grateful that you are pursuing this).
>
> pl
>