[lldb-dev] [llvm-dev] lldb stops on every call to dlopen

Pavel Labath via lldb-dev lldb-dev at lists.llvm.org
Wed Apr 18 04:14:39 PDT 2018


+Eugene, who wrote the signal-skipping code

Cool, I'm glad we got to the bottom of this.

At this point, I *think* we should be able to reproduce the issue
ourselves, but a self-contained reproducer would definitely make it easier
for us and would be appreciated, particularly as there are still a few
things I don't understand here (e.g., I think this should only matter only
after an instruction-step, but your last log seems to show it happens after
a "continue").

In any case, I do have a couple of follow-up questions:
- How are you sending the SIGALRM and with what frequency?
- Are you using SIGALRM on macOS as well (implying we handle it correctly
there)?

Also, you may know this already, but I feel obligated to point out that
setting "--pass false" will change the behavior of your program.

cheers,
pavel
On Tue, 17 Apr 2018 at 20:59, Steve Ravet <steve.ravet at apple.com> wrote:

> Ding!  That’s it.  This program does use SIGALRM.  I wouldn’t have
thought it would be enabled at this point, but apparently it is because I
got lots of sigalarm halts during the .so loading.

> Further, if I run in lldb with

> process handle -n false -p false -s false SIGALRM

> then the debugger seems to run fine without stopping during dlopen().
The SIGALRM itself isn’t important to the operation of the program.

> With this knowledge I can probably create a simple testcase.  Would this
be considered an lldb bug?  If so should I submit a testcase in some way?

> thanks!
> --steve

>  The lover of nature is he whose inward and outward senses are still
truly adjusted to each other; who has retained the spirit of infancy even
into the era of manhood.



> On Apr 17, 2018, at 1:12 PM, Pavel Labath <labath at google.com> wrote:

> It's a bit of a wild guess, but is it possible that you (or one of the
> libraries you use) are doing anything with signals (SIGALRM or such?). I
> think I remember looking at the code handling the server-side ignored
> signal handling and thinking that it could go wrong if you get a signal
> while doing a instruction-step. I am not sure it fully applies here as the
> last command that lldb client did was a "continue", but i think it has to
> have something to do with signals, as you end up stopped in a signal
> handler.

> Could you try the following sequence of commands?
> (lldb) process launch --stop-at-entry-point
> (lldb) process handle --notify true --stop true #Stop on all signals
> (lldb) continue

> and let us know if you see any extra stops due to signals. If that doesn't
> find anything then I think we'll have start pulling logs from the
> lldb-server side, as there doesn't seem to be anything wrong with the
> client. The easiest way to achieve that is to do a
> export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets"
> export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log
> before launching lldb.
> On Tue, 17 Apr 2018 at 18:28, Steve Ravet <steve.ravet at apple.com> wrote:

> Pavel asked for a dump of gdb-remote commands.  I got that and ran it

> through the gdbremote decoder, and trimmed to include what looks like the
> last successful continue after breakpoint and then the halt on dlopen.
> Both cases stop on signal 5.

> After the stop message the debugger issues two binary reads and then

> apparently makes the decision that it should stop rather than continue.
> The stopping case is missing the equivalent of "Element 1: Single stepping
> past breakpoint site 2 at 0x2aaaaaab9eb0” which is in the continuing case.
> I’ve attached the file here:


> thanks,
> --steve



>  From the ashes of disaster grow the roses of success.




> On Apr 17, 2018, at 11:27 AM, Jim Ingham <jingham at apple.com> wrote:


> It is interesting that the stop reason on the thread that stopped is

> "trace".  That's what you would expect returning from the single-step to
> step over the breakpoint.  But it looks like we got a signal while
> single-stepping, but the stop reason was misreported by somebody.

> Jim



> On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev <

> lldb-dev at lists.llvm.org> wrote:

> [+lldb-dev]


> Hello Steve,


> thanks for the report.


> The fact that you see the rendezvous breakpoint being hit many times is

> not

> surprising. We get those every time the library is loaded (we need that to
> load relevant debug info and set potential breakpoints). However, they
> should generally not be surfaced to the user (unless you have the
> stop-on-sharedlibrary-events setting set, which you don't).


> The part that is suspicious to me is that __restore_rt shows up on the top
> of the backtrace. This is a trampoline used to return from signal

> handlers,

> and it would seem to indicate that you got some sort of a signal while
> loading the libraries. I don't know why this would happen, but it could be
> that this is confusing lldb's auto-resume logic.


> The interesting part to see here is what lldb thinks are the stop reasons
> for individual threads in the process (is the process multi-threaded?) for
> the last couple of stops. The "lldb step" and "gdb-remote packets" log
> categories are the most interesting to observe here. If you are able to
> send me the log traces, I can help you interpret them.


> regards,
> pavel






> On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:


> Hello lldb developers, I am running into a problem with lldb on Linux.  I


> am currently running llvm 6.0.0.


> I have an executable that dynamically loads a large number of shared


> libraries at runtime.  These are explicitly loaded via dlopen (they are
> specified in a configuration file), and after loading a few (typically a
> dozen or so, but the number varies) lldb will halt during dlopen.  If I
> continue, it will load a few more then halt again, which makes debugging
> from startup impractical since there are so many libraries to be loaded
> (more than a hundred of them).


> When I build and debug this same C++ on macOS, the debugger works fine.


> I have verified that target.process.stop-on-sharedlibrary-events is false.
> I turned on dyld logging and I see lots of log messages about
> RendezvousBreakpoint being hit, but I don’t see anything that sheds light
> on why some libraries load without stopping but others don’t.


> I have tried to recreate this in a trivial program that calls dlopen in a


> loop, but haven’t been able to reproduce.


> Can your offer any suggestions for further debugging this?  More


> supporting evidence follows.


> Here is the message when the debugger stops:



> Process 120004 stopped
> * thread #1, name = ‘xxxxxxxx', stop reason = trace
>     frame #0: 0x00002aaaacfca6a0 libc.so.6`__restore_rt
> libc.so.6`__restore_rt:
> ->  0x2aaaacfca6a0 <+0>: movq   $0xf, %rax
>     0x2aaaacfca6a7 <+7>: syscall
>     0x2aaaacfca6a9 <+9>: nopl   (%rax)



> libc.so.6`__libc_sigaction:
>     0x2aaaacfca6b0 <+0>: subq   $0xd0, %rsp



> I do not have the stop on shared library events setting enabled:



> (lldb) settings show target.process.stop-on-sharedlibrary-events
> target.process.stop-on-sharedlibrary-events (boolean) = false





> The backtrace goes back to dlopen:



> (lldb) bt
> * thread #1, name = ‘xxxxx', stop reason = trace
>   * frame #0: 0x00002aaaacfca6a0 libc.so.6`__restore_rt
>     frame #1: 0x00002aaaaaab9eb0 ld-linux-x86-64.so.2
>     frame #2: 0x00002aaaaaabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499
>     frame #3: 0x00002aaaaaab9286 ld-linux-x86-64.so.2`_dl_catch_error +


> 102


>     frame #4: 0x00002aaaaaabd63a ld-linux-x86-64.so.2`_dl_open + 186
>     frame #5: 0x00002aaaac39df66 libdl.so.2`dlopen_doit + 102
>     frame #6: 0x00002aaaaaab9286 ld-linux-x86-64.so.2`_dl_catch_error +


> 102


>     frame #7: 0x00002aaaac39e29c libdl.so.2`_dlerror_run + 124
>     frame #8: 0x00002aaaac39dee1 libdl.so.2`__dlopen_check + 49



> the dyld debug log has a lot of this:
> 209 intern-state     DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid


> 153501 stop_when_images_change=false


> 210 intern-state     DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit


> called for pid 153501


> 211 intern-state     DYLDRendezvous::Resolve address size: 8, padding 4
> 212 intern-state     DYLDRendezvous::Resolve cursor = 0x2aaaaaccc160
> 213 intern-state     DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid


> 153501 stop_when_images_change=false


> 214 intern-state     DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit


> called for pid 153501


> 215 intern-state     DYLDRendezvous::Resolve address size: 8, padding 4
> 216 intern-state     DYLDRendezvous::Resolve cursor = 0x2aaaaaccc160





> thanks,
> --steve




>  In the woods too, a man casts off his years, as the snake his slough,


> and at what period soever of life, is always a child.




> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


More information about the lldb-dev mailing list