[lldb-dev] Parallelizing loading of shared libraries

Scott Smith via lldb-dev lldb-dev at lists.llvm.org
Tue May 2 09:22:31 PDT 2017


LLDB has TaskRunner and TaskPool.  TaskPool is nearly the same as
llvm::ThreadPool.  TaskRunner itself is a layer on top, though, and doesn't
seem to have an analog in llvm.  Not that I'm defending TaskRunner....

I have written a new one called TaskMap.  The idea is that if all you want
is to call a lambda over the values 0 .. N-1, then it's more efficient to
use std::atomic<size_t> rather than creating various std::function,
std::future, std::bind, and std::..... objects for each work item.  It is
also a layer on top of TaskPool, so it'd be easy to port to llvm::ThreadPool
if that's how we end up going.  It ends up reducing lock contention within
TaskPool without needing to fall back on a lock-free queue.
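
As a rough illustration of the idea (a minimal sketch, not the actual TaskMap
patch): the name task_map and its parameters are made up for this example,
plain std::thread stands in for TaskPool, and it assumes the shared
std::atomic<size_t> acts as a "next index" counter that every worker bumps to
claim its next item.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not LLDB's TaskMap): calls fn(i) exactly once for
// every i in [0, n), spreading the calls over up to num_workers threads.
template <typename Fn>
void task_map(std::size_t n, std::size_t num_workers, Fn fn) {
  // The only shared state: the next index that has not been claimed yet.
  std::atomic<std::size_t> next{0};

  auto worker = [&] {
    for (;;) {
      // Claim one index; no mutex, no per-item std::function or std::future.
      std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
      if (i >= n)
        return;
      fn(i);
    }
  };

  // Never start more workers than there are items, and always at least one.
  num_workers = std::max<std::size_t>(1, std::min(num_workers, n));

  // Assumption: raw threads are used here only to keep the sketch
  // self-contained; the real thing would enqueue `worker` on a pool.
  std::vector<std::thread> threads;
  threads.reserve(num_workers - 1);
  for (std::size_t t = 1; t < num_workers; ++t)
    threads.emplace_back(worker);

  worker();  // The calling thread takes part instead of only waiting.
  for (auto &th : threads)
    th.join();
}
```

With this shape, the pool only ever sees one queued item per worker rather
than one per work item, which is where the reduced contention would come from.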

On Tue, May 2, 2017 at 6:44 AM, Zachary Turner <zturner at google.com> wrote:

> Fwiw I haven't even followed the discussion closely enough to know what
> the issues with the lldb task runner even are.
>
> My motivation is simple though: don't reinvent the wheel.
>
> Iirc LLDB task runner was added before llvm's thread pool existed (I
> haven't checked, so I may be wrong about this). If that's the case, I would
> just as soon replace all existing users of lldb task runner with llvm's as
> well and delete lldb's.
>
> Regarding the issue with making debugging harder, llvm has functions to
> set thread name now. We could name all threadpool threads.
>
> On Tue, May 2, 2017 at 3:05 AM Pavel Labath via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>
>> On 1 May 2017 at 22:58, Scott Smith <scott.smith at purestorage.com> wrote:
>> > On Mon, May 1, 2017 at 2:42 PM, Pavel Labath <labath at google.com> wrote:
>> >>
>> >> Besides, hardcoding the nesting logic into "add" is kinda wrong.
>> >> Adding a task is not the problematic operation, waiting for the result
>> >> of one is. Granted, generally these happen on the same thread, but
>> >> they don't have to be -- you can write a continuation-style
>> >> computation, where you do a bit of work, and then enqueue a task to do
>> >> the rest. This would create an infinite pool depth here.
>> >
>> >
>> > True, but that doesn't seem to be the style of code here.  If it were
>> > you wouldn't need multiple pools, since you'd just wait for the callback
>> > that your work was done.
>> >
>> >>
>> >>
>> >> Btw, are we sure it's not possible to solve this with just one thread
>> >> pool? What would happen if we changed the implementation of "wait" so
>> >> that if the target task is not scheduled yet, we just go ahead and
>> >> compute it on our thread? I haven't thought through all the details,
>> >> but it sounds like this could actually give better performance in some
>> >> scenarios...
>> >
>> >
>> > My initial reaction was "that wouldn't work, what if you ran another
>> > posix dl load?"  But then I suppose *it* would run more work, and
>> > eventually you'd run a leaf task and finish something.
>> >
>> > You'd have to make sure your work could be run regardless of what
>> > mutexes the caller already had (since you may be running work for
>> > another subsystem), but that's probably not too onerous, esp given how
>> > many recursive mutexes lldb uses..
>>
>> Is it any worse than if the thread got stuck in the "wait" call? Even
>> with a dead-lock-free thread pool the task at hand still would not be
>> able to make progress, as the waiter would hold the mutex even while
>> blocked (and recursiveness will not save you here).
>>
>> >
>> > I think that's all the more reason we *should* work on getting
>> > something into LLVM first.  Anything we already have in LLDB, or any
>> > modifications we make will likely not be pushed up to LLVM, especially
>> > since LLVM already has a ThreadPool, so any changes you make to LLDB's
>> > thread pool will likely have to be re-written when trying to get it to
>> > LLVM.  And since, as you said, more projects depend on LLVM than LLDB,
>> > there's a good chance that the baseline you'd be starting from when
>> > making improvements is more easily adaptable to what you want to do.
>> > LLDB has a long history of being shy of making changes in LLVM where
>> > appropriate, and myself and others have started pushing back on that
>> > more and more, because it accumulates long term technical debt.
>> >
>> > In my experience, "let's just get it into LLDB first and then work on
>> > getting it up to LLVM later" ends up being "well, it's in LLDB now, so
>> > since my immediate problem is solved I may or may not have time to
>> > revisit this in the future"  (even if the original intent is sincere).
>> >
>> > If there is some resistance getting changes into LLVM, feel free to add
>> > me as a reviewer, and I can find the right people to move it along.  I'd
>> > still like to at least hear a strong argument for why the existing
>> > implementation in LLVM is unacceptable for what we need.  I'm ok with
>> > "non optimal".  Unless it's "unsuitable", we should start there and make
>> > incremental improvements.
>>
>> I think we could solve our current problem by just having two global
>> instances of llvm::ThreadPool. The only issue I have with that is that
>> I will then have 100 threads around constantly, which will make
>> debugging lldb harder (although even that can be viewed as an
>> incentive to make debugging threaded programs easier :) ).
>>
>> The reason I am not overly supportive of doing the design in llvm is
>> that I think we are trying to come up with a solution that will work
>> around issues with the lldb design, and I am not sure if that's the
>> right motivation. I am not against that either, though...