[lldb-dev] Process attach and pgid of forked process under Linux

Todd Fiala tfiala at google.com
Tue Apr 1 10:00:08 PDT 2014


I built clean and ran tests error free on r205315.

After testing with the patch applied, I'm getting the following failures (3
times in a row):

FAIL: LLDB (suite) :: TestHelloWorld.py (Linux
tfiala2.mtv.corp.google.com3.2.5-gg1336 #1 SMP Thu Aug 29 02:37:18 PDT
2013 x86_64 x86_64)
FAIL: LLDB (suite) :: TestAttachResume.py (Linux
tfiala2.mtv.corp.google.com3.2.5-gg1336 #1 SMP Thu Aug 29 02:37:18 PDT
2013 x86_64 x86_64)

Looking deeper, it looks like the main.c code in
test/functionalities/attach_resume/main.c is assuming c99 mode which isn't
the default when using gcc.  I'm clearing that locally (pulling for loop
variable declarations out) and will see if that fixes everything else.

Back in a bit...


On Tue, Apr 1, 2014 at 9:06 AM, Todd Fiala <tfiala at google.com> wrote:

> Having a look at this now.
>
>
> On Thu, Mar 27, 2014 at 1:01 AM, Andrew MacPherson <andrew.macp at gmail.com>wrote:
>
>> Thanks Todd, and no problem on the unit test, I'll try to include one for
>> anything I come across in the future as well.
>>
>>
>> On Wed, Mar 26, 2014 at 10:39 PM, Todd Fiala <tfiala at google.com> wrote:
>>
>>> Ah nice.  I'll definitely run this again soon.  Thanks for getting a
>>> test around it - we can't have enough of that around important expectations.
>>>
>>>
>>> On Wed, Mar 26, 2014 at 2:06 PM, Andrew MacPherson <
>>> andrew.macp at gmail.com> wrote:
>>>
>>>> Ok that makes sense. I've updated the patch I sent yesterday with a
>>>> unit test that exposes the problem as well as a couple of other process
>>>> attach issues that were fixed in the last few days. It was in fact this
>>>> getpgid() issue that was causing problems when trying to create a unit test
>>>> earlier as the problem scenario occurs when the unit test script forks a
>>>> subprocess with popen().
>>>>
>>>>
>>>> On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <gclayton at apple.com>wrote:
>>>>
>>>>>
>>>>> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson <andrew.macp at gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Ok great, I'm attaching a patch here that just uses getpgid() to
>>>>> determine the pgid to use with waitpid(). I get the same unit test results
>>>>> before and after the patch (there are a few tests that fail consistently
>>>>> for me). I tried using a simple -1 with waitpid() but this results in a
>>>>> hang in the unit tests. This is probably something worth investigating but
>>>>> for now this patch does resolve the issues around waitpid() when attaching.
>>>>>
>>>>> We should never use -1 with waitpid() as you could end up reaping a
>>>>> process that someone else is already trying to reap. You must only try to
>>>>> reap the process that you launched using some pid for you process. Many
>>>>> things launch processes within LLDB like ProcessGDBRemote launching
>>>>> "debugserver" binaries. The ProcessGDBRemote must be notified when and if
>>>>> "debugserver" crashes, so if anyone else does waitpid(-1, ...)
>>>>> ProcessGDBRemote might never find out if debugserver crashes for some
>>>>> reason.
>>>>>
>>>>> > I mentioned a few issues I ran into with the unit tests in the email
>>>>> I just pinged with a Linux detach patch, basically there appears to be a
>>>>> bug or limitation somewhere in that doing a "continue" in asynchronous mode
>>>>> in a unit test breaks things and this is needed for most attach-related
>>>>> debug testing. I can probably look into this (along with the couple of
>>>>> failing tests I have) but likely won't be able to this week. I haven't
>>>>> filed a bug for the consistently-failing tests on my end since it doesn't
>>>>> appear to be happening for others and I haven't had a chance to investigate
>>>>> here, let me know if I should file a bug anyway.
>>>>> >
>>>>> >
>>>>> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <tfiala at google.com>
>>>>> wrote:
>>>>> > I'd suggest trying the change, running the tests, and see if
>>>>> anything new fails.
>>>>> >
>>>>> > And if this fixes currently broken behavior, add a test to make sure
>>>>> we don't regress it.  I'm pretty sure if it breaks something we'll know
>>>>> pretty quickly (either via bugs or our own usage).  At which point - we
>>>>> should add a test so we don't regress in the future and perhaps add a few
>>>>> comments as to the reason.
>>>>> >
>>>>> >
>>>>> > On Tue, Mar 25, 2014 at 9:54 AM, <jingham at apple.com> wrote:
>>>>> > Why are you waiting for process groups?  That's not something we
>>>>> have to do on Mac OS X.
>>>>> >
>>>>> > Jim
>>>>> >
>>>>> >
>>>>> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson <
>>>>> andrew.macp at gmail.com> wrote:
>>>>> >
>>>>> > > Currently under Linux if you attach to a process whose process
>>>>> group id is not equal to its process id (such as the child process of a
>>>>> fork() call) the calls to waitpid() that pass -1*pid will return ECHILD
>>>>> since the pid argument refers to a process group that doesn't exist. These
>>>>> calls occur in Host::MonitorChildProcessThreadFunction() and the Linux
>>>>> ProcessMonitor.
>>>>> > >
>>>>> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the
>>>>> issue but it's not clear if this is the right fix as I'm unsure how other
>>>>> OSes deal with this scenario.
>>>>> > >
>>>>> > > Any thoughts?
>>>>> > >
>>>>> > > Thanks,
>>>>> > > Andrew
>>>>> > > _______________________________________________
>>>>> > > lldb-dev mailing list
>>>>> > > lldb-dev at cs.uiuc.edu
>>>>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>>> >
>>>>> > _______________________________________________
>>>>> > lldb-dev mailing list
>>>>> > lldb-dev at cs.uiuc.edu
>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>>>> 650-943-3180
>>>>> >
>>>>> >
>>>>> >
>>>>> <waitpid-getpgid.patch>_______________________________________________
>>>>> > lldb-dev mailing list
>>>>> > lldb-dev at cs.uiuc.edu
>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Todd Fiala | Software Engineer |  tfiala at google.com |  650-943-3180
>>>
>>
>>
>
>
> --
> Todd Fiala | Software Engineer |  tfiala at google.com |  650-943-3180
>



-- 
Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140401/2d7b247e/attachment.html>


More information about the lldb-dev mailing list