[lldb-dev] Process attach and pgid of forked process under Linux
Andrew MacPherson
andrew.macp at gmail.com
Thu Mar 27 01:01:18 PDT 2014
Thanks Todd, and no problem on the unit test, I'll try to include one for
anything I come across in the future as well.
On Wed, Mar 26, 2014 at 10:39 PM, Todd Fiala <tfiala at google.com> wrote:
> Ah nice. I'll definitely run this again soon. Thanks for getting a test
> around it - we can't have enough of that around important expectations.
>
>
> On Wed, Mar 26, 2014 at 2:06 PM, Andrew MacPherson <andrew.macp at gmail.com>wrote:
>
>> Ok that makes sense. I've updated the patch I sent yesterday with a unit
>> test that exposes the problem as well as a couple of other process attach
>> issues that were fixed in the last few days. It was in fact this getpgid()
>> issue that was causing problems when trying to create a unit test earlier
>> as the problem scenario occurs when the unit test script forks a subprocess
>> with popen().
>>
>>
>> On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <gclayton at apple.com> wrote:
>>
>>>
>>> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson <andrew.macp at gmail.com>
>>> wrote:
>>>
>>> > Ok great, I'm attaching a patch here that just uses getpgid() to
>>> determine the pgid to use with waitpid(). I get the same unit test results
>>> before and after the patch (there are a few tests that fail consistently
>>> for me). I tried using a simple -1 with waitpid() but this results in a
>>> hang in the unit tests. This is probably something worth investigating but
>>> for now this patch does resolve the issues around waitpid() when attaching.
>>>
>>> We should never use -1 with waitpid() as you could end up reaping a
>>> process that someone else is already trying to reap. You must only try to
>>> reap the process that you launched using some pid for you process. Many
>>> things launch processes within LLDB like ProcessGDBRemote launching
>>> "debugserver" binaries. The ProcessGDBRemote must be notified when and if
>>> "debugserver" crashes, so if anyone else does waitpid(-1, ...)
>>> ProcessGDBRemote might never find out if debugserver crashes for some
>>> reason.
>>>
>>> > I mentioned a few issues I ran into with the unit tests in the email I
>>> just pinged with a Linux detach patch, basically there appears to be a bug
>>> or limitation somewhere in that doing a "continue" in asynchronous mode in
>>> a unit test breaks things and this is needed for most attach-related debug
>>> testing. I can probably look into this (along with the couple of failing
>>> tests I have) but likely won't be able to this week. I haven't filed a bug
>>> for the consistently-failing tests on my end since it doesn't appear to be
>>> happening for others and I haven't had a chance to investigate here, let me
>>> know if I should file a bug anyway.
>>> >
>>> >
>>> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <tfiala at google.com> wrote:
>>> > I'd suggest trying the change, running the tests, and see if anything
>>> new fails.
>>> >
>>> > And if this fixes currently broken behavior, add a test to make sure
>>> we don't regress it. I'm pretty sure if it breaks something we'll know
>>> pretty quickly (either via bugs or our own usage). At which point - we
>>> should add a test so we don't regress in the future and perhaps add a few
>>> comments as to the reason.
>>> >
>>> >
>>> > On Tue, Mar 25, 2014 at 9:54 AM, <jingham at apple.com> wrote:
>>> > Why are you waiting for process groups? That's not something we have
>>> to do on Mac OS X.
>>> >
>>> > Jim
>>> >
>>> >
>>> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson <andrew.macp at gmail.com>
>>> wrote:
>>> >
>>> > > Currently under Linux if you attach to a process whose process group
>>> id is not equal to its process id (such as the child process of a fork()
>>> call) the calls to waitpid() that pass -1*pid will return ECHILD since the
>>> pid argument refers to a process group that doesn't exist. These calls
>>> occur in Host::MonitorChildProcessThreadFunction() and the Linux
>>> ProcessMonitor.
>>> > >
>>> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the
>>> issue but it's not clear if this is the right fix as I'm unsure how other
>>> OSes deal with this scenario.
>>> > >
>>> > > Any thoughts?
>>> > >
>>> > > Thanks,
>>> > > Andrew
>>> > > _______________________________________________
>>> > > lldb-dev mailing list
>>> > > lldb-dev at cs.uiuc.edu
>>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>> >
>>> > _______________________________________________
>>> > lldb-dev mailing list
>>> > lldb-dev at cs.uiuc.edu
>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>> >
>>> >
>>> >
>>> > --
>>> > Todd Fiala | Software Engineer | tfiala at google.com |
>>> 650-943-3180
>>> >
>>> >
>>> > <waitpid-getpgid.patch>_______________________________________________
>>> > lldb-dev mailing list
>>> > lldb-dev at cs.uiuc.edu
>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>
>>>
>>
>
>
> --
> Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140327/4af72e1b/attachment.html>
More information about the lldb-dev
mailing list