[lldb-dev] Dlopen extremely slow while LLDB is attached

Mon Apr 30 12:13:32 PDT 2018

I messed up my test setup and incorrectly attributed the additional time.
Parsing the DWARF sections does add additional time, but only increases by
a small percentage.

By pure chance I noticed that the latency between the client and server had
a huge impact. I did some tests against lldb_server running on a machine
with a RTT of 18ms to the client. The difference in load time in my real
world example (initializing Vulkan) was around 16x slower (~3.5s vs 55s). I
did some more digging and it looks like there are certain operations that
perform a number of synchronous requests to the
server (DYLDRendezvous::TakeSnapshot - updating SO entries and
ThreadList::WillResume - grabbing SP and FP for every thread). Since all
the requests are synchronous they are extremely sensitive to increased
latency.

Is this a limitation of the gdb-server (can't handle parallel requests)?
Or is this not a common use case, and is not a known issue?

> Hmm... that is strange. Could you also enable the "gdb-remote packets" log
> so we can see what is timing out.

I enabled "gdb-remote all" logging, and searched for all all instances
of '0x00005652ACF3F120'
(the address of the connection object that is reporting the timeout?).
Seems to be a pretty good corelation between the timeouts and
'Communication::SyncronizeWithReadThread',
unfortunately I haven't had time to investigate further.

1525110345.916504145 0x5652acefcbb0
Broadcaster("lldb.process")::RestoreBroadcaster (about to pop
listener("lldb.PlatformLinux.DebugProcess.hijack")=0x5652acf079a0)
1525110345.916502953 this = 0x00005652ACF3F120, timeout = 5000000 us
--
1525110345.919557333 0x7f0868000940
'Communication::SyncronizeWithReadThread'
Listener::FindNextEventInternal(broadcaster=(nil),
broadcaster_names=(nil)[0], event_type_mask=0x00000000, remove=1) event
0x7f086c0008c0
525110345.919566154 this = 0x00005652ACF3F120, timeout = 5000000 us
--
1525110346.123922110 0x7f0868000d10
'Communication::SyncronizeWithReadThread'
Listener::FindNextEventInternal(broadcaster=(nil),
broadcaster_names=(nil)[0], event_type_mask=0x00000000, remove=1) event
0x7f086c0008c0
1525110346.123931408 this = 0x00005652ACF3F120, timeout = 5000000 us
--
1525110346.152676821 0x7f0868006710
'Communication::SyncronizeWithReadThread'
Listener::FindNextEventInternal(broadcaster=(nil),
broadcaster_names=(nil)[0], event_type_mask=0x00000000, remove=1) event
0x7f086c0008c0
1525110346.152685642 this = 0x00005652ACF3F120, timeout = 5000000 us
--
1525110346.167683363 0x7f08682b2fe0
'Communication::SyncronizeWithReadThread'
Listener::FindNextEventInternal(broadcaster=(nil),
broadcaster_names=(nil)[0], event_type_mask=0x00000000, remove=1) event
0x7f086c0008c0
1525110346.167692184 this = 0x00005652ACF3F120, timeout = 5000000 us
--
1525110351.172777176 error: timed out, status = timed out, uri =
1525110351.172847271 this = 0x00005652ACF3F120, timeout = 5000000 us
1525110356.173308611 error: timed out, status = timed out, uri =
1525110356.173368216 this = 0x00005652ACF3F120, timeout = 5000000 us
1525110361.175591230 error: timed out, status = timed out, uri =
1525110361.175647497 this = 0x00005652ACF3F120, timeout = 5000000 us
1525110366.180710316 error: timed out, status = timed out, uri =
1525110366.180769205 this = 0x00005652ACF3F120, timeout = 5000000 us

On Wed, Apr 25, 2018 at 3:19 PM Pavel Labath <labath at google.com> wrote:

> On Wed, 25 Apr 2018 at 19:59, Scott Funkenhauser via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>
> > Thanks Greg and Jason for pointing me in the right direction!
>
> > The results from my original tests were due to a timeout being triggered:
>
> > 1524681484.092825651 (x86_64) /tmp/liblldb.so.7.0.0: Reading EH frame
> info
> > 1524681489.783293724 error: timed out, status = timed out
> > 1524681489.783379793 this = 0x00005555573DDC80, timeout = 5000000 us
> > 1524681489.840902615 Breakpoint::ModulesChanged: num_modules: 1 load: 1
> delete_locations: 0
>
> Hmm... that is strange. Could you also enable the "gdb-remote packets" log
> so we can see what is timing out.
>
>
> > I ran the exact same setup on a few other machines, and none of them hit
> the timeout (so not sure what happened with my first test, more
> investigation is required). Without the timeout, loading times of a
> stripped shared library with LLDB attached was much more reasonable.
>
> > I then ran another test on a shared library that had DWARF sections. I
> tracked down the majority of the extra time to be taking place inside
> SymbolFileDWARF::Index(), which confirms your theory of the additoinal time
> being due to indexing the DWARF sections.
>
> > I found this message (
> http://lists.llvm.org/pipermail/llvm-dev/2018-January/120509.html) which
> mentioned adding DWARF5 accelerator table support. Is it reasonable to
> assume once that is implemented the time it takes to index DWARF sections
> should be drastically reduced?
>
> Yes, that is the idea. I got a bit side-tracked the last two weeks, but I
> should be back on that project soon. The current state is that the compiler
> support is mostly in place, and now I need to add debugger support for
> this. (Maybe it's obvious, but you will only get the speed up if you use a
> compiler which supports this feature, which will be clang-7 at the
> earliest, and you will probably need to pass an extra flag to enable it)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20180430/03da10ec/attachment.html>