[lldb-dev] Linux/Posix spawning, signal masking
Todd Fiala
tfiala at google.com
Wed Jan 22 15:04:36 PST 2014
(And I'll adjust the comment).
On Wed, Jan 22, 2014 at 3:04 PM, Todd Fiala <tfiala at google.com> wrote:
> Thanks. It tested fine on Ubuntu, moving over to OS X.
>
>
> On Wed, Jan 22, 2014 at 2:50 PM, Greg Clayton <gclayton at apple.com> wrote:
>
>> Looks good except the "// No more thread specific current working
>> directory" comment when about to call __pthread_chdir() should be changed
>> to:
>>
>> // Set the working directory on this thread only
>>
>>
>> On Jan 22, 2014, at 1:45 PM, Todd Fiala <tfiala at google.com> wrote:
>>
>> > So I just about have this done (the original 3 steps, not switching to
>> fork()/exec()).
>> >
>> > One thing I hit was an area I noticed yesterday: on Linux (x86_64,
>> Ubuntu 12.04), I'm finding that I need to *not* take defaults for all the
>> signal handlers in lldb-gdbserver; otherwise, the target process never
>> starts operating. Technically the target process is exiting immediately
>> with an exit_state of 127 with signal = 0.
>> >
>> > For the moment, I have that #ifdef'd in the Apple/Linux/FreeBSD
>> LaunchProcessPosixSpawn method so that only Linux doesn't reset the child
>> signal handlers to defaults. I'm thinking the FreeBSD/Apple BSD lineage
>> might behave more similarly here as my first stake in the sand for a guess,
>> but that would be changing behavior for FreeBSD. (Ed - any thoughts here?
>> I could just as well flip the #if defined(__linux__) to #if
>> !defined(__APPLE__).) Ultimately the root cause of this is not clear to me
>> yet.
>> >
>> > For the directory handling, I'm having that handled via __pthread*
>> methods on Apple and continuing to do what it did before on everything
>> else. I'll look into the fork()/exec() option as a follow up after I work
>> through some of the more obviously broken bits of lldb-gdbserver.
>> >
>> > See below for the current state of the patch. I'm about to run tests
>> on Ubuntu and try a build and tests on my MBP.
>> >
>> > -Todd
>> >
>> >
>> > On Wed, Jan 22, 2014 at 10:24 AM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> > You might need to add "#if defined(__APPLE__)" around the calls to
>> __pthread_chdir and __pthread_fchdir. Those might be darwin specific.
>> >
>> > There is no way to set your current working directory when using
>> posix_spawn, so to work around this, we change directory in a thread
>> specific way by calling __pthread_chdir(working_dir) and reverting it using
>> __pthread_fchdir(-1), so as to not hose over the process that has the lldb
>> shared library loaded by changing the global working directory.
>> >
>> > So in order to eventually fix being able to obey the launch flag
>> eLaunchFlagDebug and also to be able to set the working directory, non
>> darwin builds might need to change over to using fork()/exec() since after
>> the fork the child process can set its working directory and also call
>> ptrace() to start being debuggable and then calling exec(). When ptrace()
>> has been called to allow the process to be debugged, I believe it stops
>> right at the entry point upon exec. Again, this would be for non-darwin
>> builds since Apple has a posix_spawn flag of POSIX_SPAWN_START_SUSPENDED
>> that can be set to enforce stopping at the entry point and also has the
>> thread specific way to change directory.
>> >
>> > Greg
>> >
>> >
>> > On Jan 22, 2014, at 10:00 AM, Todd Fiala <tfiala at google.com> wrote:
>> >
>> > > Correct.
>> > >
>> > >
>> > > On Wed, Jan 22, 2014 at 9:55 AM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> > > So the fix will be:
>> > > 1 - Remove LaunchProcessPosixSpawn() from Host.mm
>> > > 2 - Copy it over into Host.cpp
>> > > 3 - Make sure LaunchProcessPosixSpawn is compiled for Apple builds as
>> well in Host.cpp
>> > >
>> > > Right?
>> > >
>> > >
>> > > On Jan 22, 2014, at 9:54 AM, Todd Fiala <tfiala at google.com> wrote:
>> > >
>> > > > > It is indeed the right fix.
>> > > >
>> > > > Ok - I'll take care of that part. Thanks!
>> > > >
>> > > >
>> > > > On Wed, Jan 22, 2014 at 9:52 AM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> > > > It is indeed the right fix. Host.mm has a mac specific version
>> which does this:
>> > > >
>> > > > sigset_t no_signals;
>> > > > sigset_t all_signals;
>> > > > sigemptyset (&no_signals);
>> > > > sigfillset (&all_signals);
>> > > > ::posix_spawnattr_setsigmask(&attr, &no_signals);
>> > > > ::posix_spawnattr_setsigdefault(&attr, &all_signals);
>> > > >
>> > > > Compared to the incorrect version you noticed in Host.cpp:
>> > > >
>> > > > sigset_t no_signals;
>> > > > sigset_t all_signals;
>> > > > sigemptyset (&no_signals);
>> > > > sigfillset (&all_signals);
>> > > > ::posix_spawnattr_setsigmask(&attr, &all_signals);
>> > > > ::posix_spawnattr_setsigdefault(&attr, &no_signals);
>> > > >
>> > > > (notice all_signals and no_signals are reversed in the last two
>> function calls in Host.cpp.
>> > > >
>> > > > You might compare the "LaunchProcessPosixSpawn" in Host.mm and try
>> and just copy it to Host.cpp and see if it works. I can't find anything
>> darwin specific in the code after a quick glance and we have used the
>> Host.mm LaunchProcessPosixSpawn() for a few years now, so it is definitely
>> tested...
>> > > >
>> > > > So the fix would seem to be:
>> > > > 1 - Remove LaunchProcessPosixSpawn() from Host.mm
>> > > > 2 - Copy it over into Host.cpp
>> > > > 3 - Make sure LaunchProcessPosixSpawn is compiled for Apple builds
>> as well in Host.cpp
>> > > >
>> > > >
>> > > > On Jan 21, 2014, at 10:49 PM, Todd Fiala <tfiala at google.com> wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > In source/Host/common/Host.cpp, there is a posix process spawning
>> method called LaunchProcessPosixSpawn. On my Linux system, it is used to
>> start the target process in lldb-gdbserver. It looks like FreeBSD uses it
>> as well. Most of the Platform launchers appear to funnel to it as well
>> (via Host::LaunchProcess ()).
>> > > > >
>> > > > > There is a section of code in LaunchProcessPosixSpawn that masks
>> all signals for the child process that is started up:
>> > > > >
>> > > > > ::posix_spawnattr_setsigmask(&attr, &all_signals)
>> > > > >
>> > > > > When that is set, it seems to be preventing the child from
>> receiving everything except the non-blockable signals. This has the effect
>> of (at the very least) blocking SIGINT (i.e. ^C from the keyboard) and
>> SIGTERM (standard unadorned "kill pid"). This appears to be the cause of a
>> few bugs as I try to get lldb-gdbserver working on Linux. For example:
>> > > > >
>> > > > > 1. hitting ^C on an lldb-gdbserver that spawned a debuggee target
>> process would kill the lldb-gdbserver, but not the debuggee target, which
>> continues to run.
>> > > > >
>> > > > > 2. sending a SIGTERM (kill {debuggee pid}) while it is getting
>> debugged and running is ignored.
>> > > > >
>> > > > > 3. sending a SIGTERM to the debuggee after issue #1 (where
>> lldb-gdbserver is no longer running) is ignored.
>> > > > >
>> > > > > 4. killing lldb-gdbserver from an attached lldb that shuts down
>> and kills the remote does indeed kill lldb-gebserver, but does not kill the
>> target process.
>> > > > >
>> > > > > (Of course sending a 'kill -9 {debuggee pid}' works fine to kill
>> the target process).
>> > > > >
>> > > > >
>> > > > > I've modified this method locally to not mask any signals:
>> > > > >
>> > > > > ::posix_spawnattr_setsigmask (&attr, &no_signals)
>> > > > >
>> > > > > This seems to address all the problems I had above for
>> lldb-gdbserver as signals propagate properly, and the target responds
>> correctly to signals sent when the parent dies, etc.
>> > > > >
>> > > > > I've also run all the existing tests (on my system, that amounts
>> to 275 tests that really run), and those are all passing.
>> > > > >
>> > > > > However, given that so much code flows through here (or at least
>> appears to) that is not directly related to the lldb-gdbserver --- i.e.
>> local linux/FreeBSD debugging --- I'm highly skeptical that this is the
>> right fix. I did try using local debugging with lldb with my change in
>> place, and that seemed to work fine. But I'm thinking that perhaps the
>> signal blocking was intended and that behavior is needed in some cases that
>> (perhaps) are not covered by tests that run on Linux.
>> > > > >
>> > > > > Any thoughts on why all the signals were getting masked on
>> process spawning? Does that change look okay as is?
>> > > > >
>> > > > > Thanks!
>> > > > >
>> > > > > Sincerely,
>> > > > > Todd Fiala
>> > > > > _______________________________________________
>> > > > > lldb-dev mailing list
>> > > > > lldb-dev at cs.uiuc.edu
>> > > > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Todd Fiala | Software Engineer | tfiala at google.com |
>> 650-943-3180
>> > > >
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Todd Fiala | Software Engineer | tfiala at google.com |
>> 650-943-3180
>> > >
>> >
>> >
>> >
>> >
>> > --
>> > Todd Fiala | Software Engineer | tfiala at google.com |
>> 650-943-3180
>> >
>> > <posix_spawn_2014-01-22_01.diff>
>>
>>
>
>
> --
> Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
>
--
Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140122/9f9f09b5/attachment.html>
More information about the lldb-dev
mailing list