[lldb-dev] Process attach and pgid of forked process under Linux
Todd Fiala
tfiala at google.com
Tue Apr 1 09:06:40 PDT 2014
Having a look at this now.
On Thu, Mar 27, 2014 at 1:01 AM, Andrew MacPherson <andrew.macp at gmail.com>wrote:
> Thanks Todd, and no problem on the unit test, I'll try to include one for
> anything I come across in the future as well.
>
>
> On Wed, Mar 26, 2014 at 10:39 PM, Todd Fiala <tfiala at google.com> wrote:
>
>> Ah nice. I'll definitely run this again soon. Thanks for getting a test
>> around it - we can't have enough of that around important expectations.
>>
>>
>> On Wed, Mar 26, 2014 at 2:06 PM, Andrew MacPherson <andrew.macp at gmail.com
>> > wrote:
>>
>>> Ok that makes sense. I've updated the patch I sent yesterday with a unit
>>> test that exposes the problem as well as a couple of other process attach
>>> issues that were fixed in the last few days. It was in fact this getpgid()
>>> issue that was causing problems when trying to create a unit test earlier
>>> as the problem scenario occurs when the unit test script forks a subprocess
>>> with popen().
>>>
>>>
>>> On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <gclayton at apple.com>wrote:
>>>
>>>>
>>>> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson <andrew.macp at gmail.com>
>>>> wrote:
>>>>
>>>> > Ok great, I'm attaching a patch here that just uses getpgid() to
>>>> determine the pgid to use with waitpid(). I get the same unit test results
>>>> before and after the patch (there are a few tests that fail consistently
>>>> for me). I tried using a simple -1 with waitpid() but this results in a
>>>> hang in the unit tests. This is probably something worth investigating but
>>>> for now this patch does resolve the issues around waitpid() when attaching.
>>>>
>>>> We should never use -1 with waitpid() as you could end up reaping a
>>>> process that someone else is already trying to reap. You must only try to
>>>> reap the process that you launched using some pid for you process. Many
>>>> things launch processes within LLDB like ProcessGDBRemote launching
>>>> "debugserver" binaries. The ProcessGDBRemote must be notified when and if
>>>> "debugserver" crashes, so if anyone else does waitpid(-1, ...)
>>>> ProcessGDBRemote might never find out if debugserver crashes for some
>>>> reason.
>>>>
>>>> > I mentioned a few issues I ran into with the unit tests in the email
>>>> I just pinged with a Linux detach patch, basically there appears to be a
>>>> bug or limitation somewhere in that doing a "continue" in asynchronous mode
>>>> in a unit test breaks things and this is needed for most attach-related
>>>> debug testing. I can probably look into this (along with the couple of
>>>> failing tests I have) but likely won't be able to this week. I haven't
>>>> filed a bug for the consistently-failing tests on my end since it doesn't
>>>> appear to be happening for others and I haven't had a chance to investigate
>>>> here, let me know if I should file a bug anyway.
>>>> >
>>>> >
>>>> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <tfiala at google.com>
>>>> wrote:
>>>> > I'd suggest trying the change, running the tests, and see if anything
>>>> new fails.
>>>> >
>>>> > And if this fixes currently broken behavior, add a test to make sure
>>>> we don't regress it. I'm pretty sure if it breaks something we'll know
>>>> pretty quickly (either via bugs or our own usage). At which point - we
>>>> should add a test so we don't regress in the future and perhaps add a few
>>>> comments as to the reason.
>>>> >
>>>> >
>>>> > On Tue, Mar 25, 2014 at 9:54 AM, <jingham at apple.com> wrote:
>>>> > Why are you waiting for process groups? That's not something we have
>>>> to do on Mac OS X.
>>>> >
>>>> > Jim
>>>> >
>>>> >
>>>> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson <andrew.macp at gmail.com>
>>>> wrote:
>>>> >
>>>> > > Currently under Linux if you attach to a process whose process
>>>> group id is not equal to its process id (such as the child process of a
>>>> fork() call) the calls to waitpid() that pass -1*pid will return ECHILD
>>>> since the pid argument refers to a process group that doesn't exist. These
>>>> calls occur in Host::MonitorChildProcessThreadFunction() and the Linux
>>>> ProcessMonitor.
>>>> > >
>>>> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the
>>>> issue but it's not clear if this is the right fix as I'm unsure how other
>>>> OSes deal with this scenario.
>>>> > >
>>>> > > Any thoughts?
>>>> > >
>>>> > > Thanks,
>>>> > > Andrew
>>>> > > _______________________________________________
>>>> > > lldb-dev mailing list
>>>> > > lldb-dev at cs.uiuc.edu
>>>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>> >
>>>> > _______________________________________________
>>>> > lldb-dev mailing list
>>>> > lldb-dev at cs.uiuc.edu
>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Todd Fiala | Software Engineer | tfiala at google.com |
>>>> 650-943-3180
>>>> >
>>>> >
>>>> > <waitpid-getpgid.patch>_______________________________________________
>>>> > lldb-dev mailing list
>>>> > lldb-dev at cs.uiuc.edu
>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>>
>>>>
>>>
>>
>>
>> --
>> Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
>>
>
>
--
Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140401/651dee71/attachment.html>
More information about the lldb-dev
mailing list