[lldb-dev] Breakpoint + callback performance ... Can it be faster?

Tue Aug 16 13:43:10 PDT 2016

On Tue, Aug 16, 2016 at 11:06 AM, Jim Ingham <jingham at apple.com> wrote:

>
> > On Aug 16, 2016, at 10:42 AM, Benjamin Dicken <
> bddicken at datawareventures.com> wrote:
> >
> > Thanks for the quick reply.
> >
> > > Are you sure the actual handling of the breakpoint & callback in lldb
> is what is taking most of the time?
> >
> > I'm not positive. I did collect some callgrind profiles to take a look
> at where most of the time is being spent, but I'm not very familiar with
> lldb internals so the results were hard to interpret. I did notice that
> there was a lot of packet/network business when using lldb to profile a
> program (which I assumed was communication between my program and
> lldb-server). I was not sure how this affected the performance, so perhaps
> this is the real bottleneck.
>
> I would be pretty surprised if it was not.  We had some bugs in breakpoint
> handling - mostly related to having very very many breakpoints.  But other
> than that the dispatching of the breakpoint StopInfo is a pretty simple,
> straightforward bit of work.
>
> >
> > > Greg just switched to using a unix-domain socket for this
> communication for platforms that support it.  This speeds up the packet
> traffic side of things.
> >
> > In what version of lldb was this introduced? I'm running 3.7.1. I'm also
> on ubuntu 14.04, is that a supported platform?
>
> It is just in TOT lldb, he just added it last week.  It is currently only
> turned on for OS X.
>

Good to know, thanks.

>
> >
> > > One of the original motivations of having lldb-server be based on lldb
> classes - as opposed to the MacOS X version of debugserver which is an
> independent construct - was that you could re-use the server code to create
> an in-process Process plugin, eliminating a lot of this traffic & context
> switching when you needed maximum speed.
> >
> > That sounds very interesting. Is there an example of this implementation
> you could point me to?
> >
>
> FreeBSD & Windows still have native Process plugins.  But they aren't used
> for the lldb-server implementation so far as I can tell (I've mostly worked
> on the OS X side.)  I think this was more of a design intent that hasn't
> actually been used anywhere yet.  But the Linux/Android folks will know
> better.
>

If any of the Linux/Android folks do know, please get in touch with me.
Thanks,

> Jim
>
>
> >
> >
> > On Tue, Aug 16, 2016 at 10:20 AM, Jim Ingham <jingham at apple.com> wrote:
> > Are you sure the actual handling of the breakpoint & callback in lldb is
> what is taking most of the time?  The last time we looked at this, the
> majority of the work was in communicating with debugserver to get the stop
> notification and restart.  Note, besides all the packet code, this involves
> context switches from process->lldbserver->lldb and back, which is also
> pretty expensive.
> >
> > Greg just switched to using a unix-domain socket for this communication
> for platforms that support it.  This speeds up the packet traffic side of
> things.
> >
> > One of the original motivations of having lldb-server be based on lldb
> classes - as opposed to the MacOS X version of debugserver which is an
> independent construct - was that you could re-use the server code to create
> an in-process Process plugin, eliminating a lot of this traffic & context
> switching when you needed maximum speed.  The original Mac OS X lldb port
> actually had a process plugin wholly in-process with lldb as well as the
> debugserver based one, but there wasn't enough motivation to justify
> maintaining the two different implementations of the same code.  I don't
> know whether the Linux port takes advantage of this possibility.  That
> would be something to look into, however.
> >
> > Once we actually figure out about the stop, figuring out the breakpoint
> and getting to its callback is pretty simple...  I doubt making "lighter
> weight breakpoints" in particular will recover the performance you need,
> though if your sampling turns up inefficient algorithms that have crept in,
> it would be great to fix that.
> >
> > Another option we've toyed with on and off is something like the gdb
> "tracepoints" where you can upload instructions to perform "experiments"
> when a breakpoint is hit to the lldb-server instance.  The work to perform
> the experiment and the results would all be kept in the lldb-server
> instance till a real breakpoint is hit, at which point lldb can download
> all the results and present them to the user.  This would eliminate some of
> the context-switches and packet traffic while you were running in the hot
> parts of your code.  This is a decent chunk of work, however.
> >
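
To make the batching idea above concrete, here is a rough sketch of the
bookkeeping only. It is a toy model, not lldb or lldb-server code: the
names ExperimentBuffer, RecordHit, DownloadAtStop and
RoundTripsPerHitReporting are all made up for illustration, and a "round
trip" here just stands in for one stop notification plus restart.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model (not lldb or lldb-server code): results gathered at
// each hit stay on the server side until a real stop occurs.
struct ExperimentBuffer {
  std::vector<long> samples;    // values recorded at each hit
  std::size_t round_trips = 0;  // simulated client/server exchanges

  // Record a sample on the server side without notifying the client.
  void RecordHit(long value) { samples.push_back(value); }

  // A real stop: the client downloads everything in a single exchange
  // and the server-side buffer is left empty.
  std::vector<long> DownloadAtStop() {
    ++round_trips;
    std::vector<long> out;
    out.swap(samples);
    return out;
  }
};

// For comparison: if every hit is reported as a full stop-and-continue,
// the number of exchanges grows linearly with the number of hits.
std::size_t RoundTripsPerHitReporting(std::size_t hits) { return hits; }
```

Under this model, 1000 hits followed by one real stop cost a single
exchange instead of 1000. How much wall-clock time that would save in
practice depends on how expensive the server-side experiment itself is.
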
> > Jim
> >
> >
> > > On Aug 16, 2016, at 9:57 AM, Benjamin Dicken via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> > >
> > > I recently started using lldb to write a basic instrumentation tool
> for tracking the values of variables at various code-points in a program.
> I've been working with lldb for less than two weeks, so I am pretty new,
> though I have used and written llvm passes in the past, so I'm familiar
> with the clang/llvm/lldb ecosystem.
> > >
> > > I have a very early prototype of the tool up and running, using the
> C++ API. The user can specify either an executable to run or an
> already-running PID to attach to. The user also supplies a file+line_number
> at which a breakpoint (with a callback) is placed. For testing/prototyping
> purposes, the breakpoint callback just increments a counter and then
> immediately returns false. Eventually, more interesting things will happen
> in this callback.
> > >
> > > I've noticed that just the action of hitting a breakpoint and invoking
> the callback is very expensive. I did some instruction-count collection by
> running this lldb tool on a simple test program, and placing the
> breakpoint+callback at different points in the program, causing it to get
> triggered different numbers of times. I used `perf stat -e instructions
> ...` to gather instruction exec counts for each run. After doing a little
> math, it appears that I'm incurring 1.0 - 1.1 million instruction execs per
> breakpoint.
> > >
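
The "little math" above amounts to subtracting a baseline instruction
count from the count of a run with breakpoints and dividing by the number
of hits. A minimal sketch, assuming the two totals come from separate
`perf stat -e instructions` runs; the function name InstructionsPerHit is
made up for illustration and no real measurements are embedded here:

```cpp
#include <cstdint>

// Estimate the extra instructions attributable to each breakpoint hit.
//   with_breakpoints: total instructions for a run with the breakpoint set
//   baseline:         total instructions for the same run without it
//   hit_count:        number of times the callback fired
// Returns 0.0 when the inputs cannot give a meaningful estimate.
double InstructionsPerHit(std::uint64_t with_breakpoints,
                          std::uint64_t baseline,
                          std::uint64_t hit_count) {
  if (hit_count == 0 || with_breakpoints < baseline) return 0.0;
  return static_cast<double>(with_breakpoints - baseline) /
         static_cast<double>(hit_count);
}
```

For instance, with made-up totals of 1,100,000,000 and 50,000,000
instructions and 1000 hits, the estimate is 1,050,000 instructions per
hit. Repeating this for several hit counts helps check that the overhead
is roughly constant per hit rather than a fixed startup cost.
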
> > > This amount of slowdown is prohibitively expensive for my needs,
> because I want to place callbacks in hot portions of the "inferior" program.
> > >
> > > Is there a way to make this faster? Is it possible to create
> "lighter-weight" breakpoints? I really like the lldb API (though the
> documentation is lacking in some places), but if this performance hit can't
> be mitigated, it may be unusable for me.
> > >
> > > For reference, this is the callback function:
> > >
> > > ```
> > > static int cb_count = 0;
> > > bool SimpleCallback (
> > >     void *baton,
> > >     lldb::SBProcess &process,
> > >     lldb::SBThread &thread,
> > >     lldb::SBBreakpointLocation &location) {
> > >   //TODO: Eventually do more interesting things...
> > >   cb_count++;
> > >   return false;
> > > }
> > > ```
> > >
> > > And here is how I set it up to be called back:
> > >
> > > ```
> > > lldb::SBBreakpoint bp1 =
> > >     debugger_data->target.BreakpointCreateByLocation(file_name, line_no);
> > > if (!bp1.IsValid()) std::cerr << "invalid breakpoint\n";
> > > // No baton is needed yet, so pass a null pointer.
> > > bp1.SetCallback(SimpleCallback, nullptr);
> > > ```
> > >
> > > -Benjamin
> > > _______________________________________________
> > > lldb-dev mailing list
> > > lldb-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >
> >
> >
> >
> > --
> > Ben
>
>

-- 
Ben