[lldb-dev] Parallelizing loading of shared libraries

Scott Smith via lldb-dev lldb-dev at lists.llvm.org
Thu Apr 27 11:54:00 PDT 2017

Hmm, turns out I was wrong about delayed symbol loading not working under
Linux.  I've added timings to the review.

On Thu, Apr 27, 2017 at 11:12 AM, Jim Ingham <jingham at apple.com> wrote:

> Interesting.  Do you have to catch this information as the JIT modules get
> loaded, or can you recover the data after-the-fact?  For most uses, I don't
> think you need to track JIT modules as they are loaded, but it would be
> good enough to refresh the list on stop.
> Jim
> > On Apr 27, 2017, at 10:51 AM, Pavel Labath <labath at google.com> wrote:
> >
> > It's the gdb jit interface breakpoint. I don't think there is a good
> > way to scope that to a library, as that symbol can be anywhere...
> >
> >
> > On 27 April 2017 at 18:35, Jim Ingham via lldb-dev
> > <lldb-dev at lists.llvm.org> wrote:
> >> Somebody is probably setting an internal breakpoint for some purpose
> w/o scoping it to the shared library it's to be found in.  Either that or
> somebody has broken lazy loading altogether.  But that's not intended
> behavior.
> >>
> >> Jim
> >>
> >>> On Apr 27, 2017, at 7:02 AM, Scott Smith <scott.smith at purestorage.com>
> wrote:
> >>>
> >>> So as it turns out, at least on my platform (Ubuntu 14.04), the
> symbols are loaded regardless.  I changed my test so:
> >>> 1. main() just returns right away
> >>> 2. cmdline is: lldb -b -o run /path/to/my/binary
> >>>
> >>> and it takes the same amount of time as setting a breakpoint.
> >>>
> >>> On Wed, Apr 26, 2017 at 5:00 PM, Jim Ingham <jingham at apple.com> wrote:
> >>>
> >>> We started out with the philosophy that lldb wouldn't touch any more
> information in a shared library than we actually needed.  So when a library
> gets loaded we might need to read in and resolve its section list, but we
> won't read in any symbols if we don't need to look at them.  The idea was
> that if you did "load a binary, and run it" until the binary stops for some
> reason, we haven't done any unnecessary work.  Similarly, if all the
> breakpoints the user sets are scoped to a shared library then there's no
> need for us to read any symbols for any other shared libraries.  I think
> that is a good goal, it allows the debugger to be used in special purpose
> analysis tools w/o forcing it to pay costs that a more general purpose
> debug session might require.
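To make the lazy-loading goal above concrete, here is a minimal sketch in Python. It is not LLDB code: Module, set_breakpoint and symbol_parses are hypothetical names, and the hard-coded symbol table stands in for the expensive parse. It only shows sections being read eagerly while symbols are read on first use, and only for the modules a breakpoint is scoped to.

```python
from functools import cached_property


class Module:
    """Hypothetical stand-in for a loaded shared library."""

    def __init__(self, path):
        self.path = path
        self.symbol_parses = 0  # counts how often the expensive step ran
        # Cheap work done eagerly at load time: the section list.
        self.sections = [".text", ".data"]

    @cached_property
    def symbols(self):
        # Expensive work, deferred until something actually asks for symbols.
        # The result is cached, so it runs at most once per module.
        self.symbol_parses += 1
        return {"main": 0x1000}


def set_breakpoint(name, modules):
    # Only the modules passed in (the breakpoint's scope) get their symbols read.
    return [(m.path, m.symbols[name]) for m in modules if name in m.symbols]
```

With this shape, a breakpoint scoped to one module never touches the symbols of any other module, and a plain "load and run" touches none at all.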
> >>>
> >>> I think it would be hard to convert all the usages of modules from
> "do something with a shared library" mode to "tell me you are interested in
> a shared library and give me a callback" so that the module reading could
> be parallelized on demand.  But at the very least we need to allow a mode
> where symbol reading is done lazily.
> >>>
> >>> The other concern is that lldb keeps the modules it reads in a global
> cache, shared by all debuggers & targets.  It is very possible that you
> could have two targets or two debuggers each with one target that are
> reading in shared libraries simultaneously, and adding them to the global
> cache.  In some of the uses that lldb has under Xcode this is actually very
> common.  So the task pool will have to be built up as things are added to
> the global shared module cache, not at the level of individual targets
> noticing the read-in of a shared library.
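One way to picture the shared-cache concern above is a process-wide cache in which concurrent requests for the same library share a single load. This is a hedged Python sketch, not how LLDB's module cache is implemented: get_shared_module, _read_module and loads are hypothetical names, and the dict returned by _read_module stands in for a real module object.

```python
import threading
from concurrent.futures import Future

_cache_lock = threading.Lock()
_cache = {}   # path -> Future that will hold the loaded module
loads = []    # records every real read, to show each path is read once


def _read_module(path):
    # Stand-in for the expensive read of a shared library.
    loads.append(path)
    return {"path": path}


def get_shared_module(path):
    # The first caller for a path becomes its owner and performs the read.
    with _cache_lock:
        fut = _cache.get(path)
        owner = fut is None
        if owner:
            fut = _cache[path] = Future()
    if owner:
        try:
            fut.set_result(_read_module(path))
        except BaseException as exc:
            fut.set_exception(exc)
    # Every other caller, from any target or debugger, waits on the same Future.
    return fut.result()
```

Because deduplication happens at the cache, any parallelism attached to the load is decided once per library rather than once per target that notices it.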
> >>>
> >>> Jim
> >>>
> >>>
> >>>
> >>>> On Apr 26, 2017, at 4:12 PM, Scott Smith via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >>>>
> >>>> After dealing with a bunch of micro-optimizations, I'm back to
> parallelizing loading of shared modules.  My naive approach was to just
> create a new thread per shared library.  I have a feeling some users may
> not like that; I think I read an email from someone who has thousands of
> shared libraries.  That's a lot of threads :-)
> >>>>
> >>>> The problem is loading a shared library can cause downstream
> parallelization through TaskPool.  I can't then also have the loading of a
> shared library itself go through TaskPool, as that could cause a deadlock -
> if all the worker threads are waiting on work that TaskPool needs to run on
> a worker thread.... then nothing will happen.
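The starvation described above can be reproduced in miniature. This Python sketch uses the standard ThreadPoolExecutor as a stand-in for TaskPool (an assumption; it is not LLDB's implementation), with a single worker to model a pool whose workers are all busy. load_library and parse_symbols are hypothetical names, and the timeout exists only so the example terminates instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

pool = ThreadPoolExecutor(max_workers=1)  # stand-in for a fully busy global pool


def parse_symbols(lib):
    return f"{lib}: symbols parsed"


def load_library(lib):
    # Runs on the pool's only worker, then queues more work on the same pool.
    inner = pool.submit(parse_symbols, lib)
    try:
        # Without the timeout this wait would never return: the only worker
        # is this thread, and it is blocked right here.
        return inner.result(timeout=0.2)
    except FutureTimeout:
        return f"{lib}: starved"


outcome = pool.submit(load_library, "libfoo.so").result()
pool.shutdown(wait=True)  # the queued inner task runs once the worker is free
```

Here outcome ends up as "libfoo.so: starved", because the inner task cannot start until the outer one gives up its worker.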
> >>>>
> >>>> Three possible solutions:
> >>>>
> >>>> 1. Remove the notion of a single global TaskPool, but instead have a
> static pool at each callsite that wants it.  That way multiple paths into
> the same code would share the same pool, but different places in the code
> would have their own pool.
> >>>>
> >>>> 2. Change the wait code for TaskRunner to note whether it is already
> on a TaskPool thread, and if so, spawn another one.  However, I don't think
> that fully solves the issue of having too many threads loading shared
> libraries, as there is no guarantee the new worker would work on the
> "deepest" work.  I suppose each task would be annotated with depth, and the
> work could be sorted in TaskPool though...
> >>>>
> >>>> 3. Leave a separate thread per shared library.
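As a rough illustration of option 1 only, here is a Python sketch with one pool per call site. ThreadPoolExecutor stands in for TaskPool, and library_pool, symbol_pool, load_library and parse_symbols are hypothetical names. Each pool is kept at a single worker to show that waiting across pools does not starve, provided the dependency between pools runs in one direction only.

```python
from concurrent.futures import ThreadPoolExecutor

# One pool per call site, each deliberately tiny.
library_pool = ThreadPoolExecutor(max_workers=1)  # loads whole libraries
symbol_pool = ThreadPoolExecutor(max_workers=1)   # downstream per-library work


def parse_symbols(lib):
    return f"{lib}: symbols parsed"


def load_library(lib):
    # Waiting here blocks a library_pool worker, but the work being waited on
    # runs on symbol_pool, whose workers never wait on library_pool.
    return symbol_pool.submit(parse_symbols, lib).result()


libs = ["libfoo.so", "libbar.so", "libbaz.so"]
results = list(library_pool.map(load_library, libs))
library_pool.shutdown()
symbol_pool.shutdown()
```

The same structure would still deadlock if symbol_pool tasks ever waited on library_pool, so this sketch says nothing about options 2 and 3 or about bounding the total thread count.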
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> _______________________________________________
> >>>> lldb-dev mailing list
> >>>> lldb-dev at lists.llvm.org
> >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >>>
> >>>