[Lldb-commits] [lldb] r180829 - Make "process handle -n true -s false" actually notifies of the signal.

jingham at apple.com jingham at apple.com
Thu May 2 14:53:35 PDT 2013


On May 2, 2013, at 2:06 PM, "Malea, Daniel" <daniel.malea at intel.com> wrote:

> Right; what's happening is the call to WaitForNextStateChangeEvent (on
> line 1240) never returns. After reading the other thread with Carlo, it's
> evident my fix is not correct, so I reverted it.
> 
> The GetRestartedFromEvent code I think is OK, as I'm able to see other
> Stop events that go through the check on line 1255
> 
> I tried reproducing the hang with "log enable lldb process" and generated
> the attached log file. The part that's curious is right before the hang,
> I'm seeing:
> 
> Process::WaitForEventsPrivate (timeout = (nil), event_sp)...
> Process::SetPublicState (stopped)
> Process::SetPublicState (stopped) -- unlocking run lock
> Process::WaitForStateChangedEvents (timeout = (nil), event_sp) => stopped
> Process::Resume -- locking run lock
> Process::Resume: -- WriteTryLock failed, not resuming.
> Process::WaitForStateChangedEvents (timeout = (nil), event_sp)...

That explains why the test is hanging, since we tried to resume, but failed, and then of course you're never going to get another event ever...

So the next thing to figure out is who is calling this failing Resume.  If you can debug this and break on point in Resume where we fail to get the write lock, we should be able to see who is trying this resume.    Then we should look at whether in this case we can do something more intelligent than just silently not resume in this case.  And also why we didn't get the run lock when we thought we should be able to.

This is sort of back to our discussion about resuming and the run lock from a couple of weeks back.  I've been trying to get my thoughts about this clearer.  We have a couple of types of resumes, really:  There's the "I want to be the one who starts a user level resume" type resumes.  That's like the SBProcess::Step type commands.  

Then there are places where we resume on somebody else's behalf.  The step plans when they resume for various reasons (for single stepping, or after hitting a step through or step out breakpoint), and the breakpoint condition resumes are examples of this.  

And to make things more confusing, there are "resumes when we are pretending that we aren't resuming". An example of the latter is when you go to run a function.  When you go to call a function, you want to say "I want to make sure the target stays stopped" then you call down into EvaluateExpression which is going to run the target, but again on your behalf.

Presently the two different types of resume are distinguished by whether we call Resume (a user level resume) or PrivateResume, which is done on somebody else's behalf.  I think the problem is that there are places where which type of resume you intend is situational, so you can't tell just by where in the code you happen to be.  There needs to be some explicit granting of authority, so we can track this better.  At least that's the place where my thoughts are at present...

Anyway, if you can see what is going on when this resume fails, that will help figure out what to do about it.

Jim


> 
> 
> 
> 
> Any thoughts?
> 
> Dan
> 
> 
> 
> On 2013-05-02 4:38 PM, "jingham at apple.com" <jingham at apple.com> wrote:
> 
>> I'm confused.  In Process::WaitForProcessToStop, (line 1255 seq.), I see:
>> 
>>       case eStateStopped:
>>           if 
>> (Process::ProcessEventData::GetRestartedFromEvent(event_sp.get()))
>>               continue;
>>           else
>>               return state;
>> 
>> Is that not working somehow?
>> 
>> Jim
>> 
>> 
>> On May 2, 2013, at 12:24 PM, "Malea, Daniel" <daniel.malea at intel.com>
>> wrote:
>> 
>>> I looked into the hanging case a little more today. It seems that (as
>>> per
>>> the stack trace below) the hang is taking place in
>>> Process::WaitForProcessToStop() when the process state (returned by
>>> GetState) is eStateStopped.
>>> 
>>> It seems that if the state is stopped, that function is guaranteed to
>>> wait
>>> forever for new events (which never come)Š I tried a quick fix that
>>> added
>>> such a check, but that caused other problems (unexpected process exit
>>> was
>>> reported).
>>> 
>>> Is my fix along the below lines correct? Or is the
>>> WaitForProcessToStop()
>>> indeed correct the way it's implemented? Maybe there's extra events that
>>> get generated on Mac OS X while a process is stopped; thereby preventing
>>> the wait-forever situation?
>>> 
>>> --- a/source/Target/Process.cpp
>>> +++ b/source/Target/Process.cpp
>>> @@ -1238,6 +1238,11 @@ Process::WaitForProcessToStop (const TimeValue
>>> *timeout, lldb::EventSP *event_sp
>>>    if (state == eStateDetached || state == eStateExited)
>>>        return state;
>>> 
>>> +    // If we are already stopped, return now otherwise we wait forever
>>> +    if (state == eStateStopped
>>> +        && ! 
>>> Process::ProcessEventData::GetRestartedFromEvent(event_sp_ptr->get()))
>>> +        return state;
>>> +
>>> 
>>> 
>>> 
>>> Any hints appreciated :)
>>> 
>>> Cheers,
>>> Dan
>>> 
>>> 
>>> 
>>> On 2013-05-01 10:41 PM, "jingham at apple.com" <jingham at apple.com> wrote:
>>> 
>>>> Daniel,
>>>> 
>>>> This test runs just fine on Mac OS X with the change I made, so it is a
>>>> little surprising that it causes problems on Linux.  After all, the
>>>> change is all in generic execution control code.
>>>> 
>>>> I looked at the effects of this change in a little more detail, and it
>>>> was uncovering one unrelated bug.  That was in how the "step over" plan
>>>> works when it is using the "run to next range breakpoint".  When it hit
>>>> the next range breakpoint, it would correctly understand that it had
>>>> explained the stop when asked in ShouldStop, but then it deleted the
>>>> breakpoint, so that when it got asked again in ShouldReportStop, it
>>>> thought it didn't explain the stop.  So the Base thread plan would get
>>>> asked ShouldReportStop instead.
>>>> 
>>>> The change in this patch makes it so that the Base thread plan now does
>>>> something when asked ShouldReportStop.  This was required because if we
>>>> get a stop which auto-resumes (for instance a Signal which you are not
>>>> going to stop at but do want to print OR a breakpoint whose condition
>>>> or
>>>> command auto-continues) when we're doing nothing in particular, or when
>>>> we stop for a reason not explained by any of the plans higher on the
>>>> stack, then it will cause a "Stopped and restarted" event to get
>>>> posted.
>>>> Before it was silently eating those events which was wrong.
>>>> 
>>>> So the result of the bug in how the "step over" plan works with the
>>>> patch
>>>> in place, meant that we were getting a few extraneous "stopped and
>>>> restarted" events.  This didn't cause any problems on Mac OS X (the
>>>> test
>>>> case you mention runs fine there.)  And indeed that's not surprising.
>>>> When doing a "step over" you might get all sorts of stopped & restarted
>>>> events, from signals or auto-continued breakpoints, for instance.
>>>> Whoever is running the event loop (e.g. the lldb driver) should notify
>>>> the user about them (or not - often we just ignore them to keep from
>>>> being overly chatty) and then go on its way.
>>>> 
>>>> So I can (and will) fix it so that you don't get these extraneous
>>>> "stopped and restarted" events.  They cause unnecessary noise.  And
>>>> doing
>>>> that will most likely make this test case stop hanging.  But that
>>>> doesn't
>>>> really make me feel any better, since the main lldb execution control
>>>> logic should have withstood them no problem (as it did on OS X), and in
>>>> other circumstances it will have to.
>>>> 
>>>> I don't have the time to look into why the Linux port horks up on
>>>> these,
>>>> but it is something interesting to investigate...
>>>> 
>>>> Jim
>>>> 
>>>> 
>>>> On May 1, 2013, at 9:03 AM, "Malea, Daniel" <daniel.malea at intel.com>
>>>> wrote:
>>>> 
>>>>> Hi Jim, this commit seems to cause some hangs in
>>>>> TestTypeCompletion.py,
>>>>> can you take a look (or revert) please. I can provide
>>>>> 
>>>>> http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/2754
>>>>> 
>>>>> #0  pthread_cond_wait@@GLIBC_2.3.2 () at
>>>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>>>> #1  0x00007f8801b2d8f7 in lldb_private::Condition::Wait
>>>>> (this=0x1b52b80,
>>>>> mutex=..., abstime=0x0, timed_out=0x7fff766e985e) at
>>>>> ../tools/lldb/source/Host/common/Condition.cpp:92
>>>>> #2  0x00007f8801b2c8a5 in
>>>>> lldb_private::Predicate<bool>::WaitForValueEqualTo (this=0x1b52b50,
>>>>> value=true, abstime=0x0, timed_out=0x7fff766e985e) at
>>>>> ../tools/lldb/include/lldb/Host/Predicate.h:338
>>>>> #3  0x00007f8801b66e82 in
>>>>> lldb_private::Listener::WaitForEventsInternal
>>>>> (this=0x1b52ab8, timeout=0x0, broadcaster=0x1cabd48,
>>>>> broadcaster_names=0x0, num_broadcaster_names=0, event_type_mask=3,
>>>>> event_sp=...) at ../tools/lldb/source/Core/Listener.cpp:433
>>>>> #4  0x00007f8801b66f74 in
>>>>> lldb_private::Listener::WaitForEventForBroadcasterWithType
>>>>> (this=0x1b52ab8, timeout=0x0, broadcaster=0x1cabd48,
>>>>> event_type_mask=3,
>>>>> event_sp=...) at ../tools/lldb/source/Core/Listener.cpp:464
>>>>> #5  0x00007f8801d152fd in
>>>>> lldb_private::Process::WaitForStateChangedEvents
>>>>> (this=0x1cabd10, timeout=0x0, event_sp=...) at
>>>>> ../tools/lldb/source/Target/Process.cpp:1319
>>>>> #6  0x00007f8801d1503b in lldb_private::Process::WaitForProcessToStop
>>>>> (this=0x1cabd10, timeout=0x0, event_sp_ptr=0x0) at
>>>>> ../tools/lldb/source/Target/Process.cpp:1224
>>>>> #7  0x00007f8801ecbce5 in
>>>>> CommandObjectThreadStepWithTypeAndScope::DoExecute (this=0x1b892c0,
>>>>> command=..., result=...) at
>>>>> ../tools/lldb/source/Commands/CommandObjectThread.cpp:542
>>>>> #8  0x00007f8801c57be9 in lldb_private::CommandObjectParsed::Execute
>>>>> (this=0x1b892c0, args_string=0x7f88003cb4d8 "", result=...) at
>>>>> ../tools/lldb/source/Interpreter/CommandObject.cpp:969
>>>>> #9  0x00007f8801c4bea2 in
>>>>> lldb_private::CommandInterpreter::HandleCommand
>>>>> (this=0x1b52cf0, command_line=0x7f8807d0fd14 "next",
>>>>> lazy_add_to_history=lldb_private::eLazyBoolNo, result=...,
>>>>> override_context=0x0, repeat_on_empty_command=true,
>>>>> no_context_switching=false)
>>>>>  at ../tools/lldb/source/Interpreter/CommandInterpreter.cpp:1813
>>>>> #10 0x00007f8801a75a7b in lldb::SBCommandInterpreter::HandleCommand
>>>>> (this=0x1c338b0, command_line=0x7f8807d0fd14 "next", result=...,
>>>>> add_to_history=false) at
>>>>> ../tools/lldb/source/API/SBCommandInterpreter.cpp:122
>>>>> #11 0x00007f88019ee5dd in
>>>>> _wrap_SBCommandInterpreter_HandleCommand__SWIG_1
>>>>> (args=(<SBCommandInterpreter(this=<SwigPyObject at remote 0x1c5d3c0>)
>>>>> at
>>>>> remote 0x1c53050>, 'next', <SBCommandReturnObject(this=<SwigPyObject
>>>>> at
>>>>> remote 0x1c5d3f0>) at remote 0x1c53090>))
>>>>>  at tools/lldb/scripts/LLDBWrapPython.cpp:9180
>>>>> #12 0x00007f88019ee7ce in _wrap_SBCommandInterpreter_HandleCommand
>>>>> (self=0x0, args=(<SBCommandInterpreter(this=<SwigPyObject at remote
>>>>> 0x1c5d3c0>) at remote 0x1c53050>, 'next',
>>>>> <SBCommandReturnObject(this=<SwigPyObject at remote 0x1c5d3f0>) at
>>>>> remote
>>>>> 0x1c53090>))
>>>>>  at tools/lldb/scripts/LLDBWrapPython.cpp:9215
>>>>> #13 0x0000000000463ea7 in ext_do_call (nk=29085856, na=<optimized
>>>>> out>,
>>>>> flags=<optimized out>, pp_stack=0x7fff766ea090, func=<built-in
>>>>> function
>>>>> SBCommandInterpreter_HandleCommand>) at ../Python/ceval.c:4331
>>>>> #14 PyEval_EvalFrameEx (f=f at entry=
>>>>>  Frame 0x1c92540, for file
>>>>> 
>>>>> 
>>>>> /home/daniel/dev/llvm/build/lib/python2.7/site-packages/lldb/__init__.p
>>>>> y,
>>>>> line 1633, in HandleCommand
>>>>> (self=<SBCommandInterpreter(this=<SwigPyObject
>>>>> at remote 0x1c5d3c0>) at remote 0x1c53050>, args=('next',
>>>>> <SBCommandReturnObject(this=<SwigPyObject at remote 0x1c5d3f0>) at
>>>>> remote
>>>>> 0x1c53090>)), throwflag=throwflag at entry=0) at ../Python/ceval.c:2705
>>>>> #15 0x0000000000467209 in PyEval_EvalCodeEx (co=0x1a5a930,
>>>>> globals=<optimized out>, locals=locals at entry=0x0, args=<optimized
>>>>> out>,
>>>>> argcount=argcount at entry=3, kws=0x1c924c8, kwcount=0, defs=0x0,
>>>>> defcount=0,
>>>>> closure=closure at entry=0x0) at ../Python/ceval.c:3253
>>>>> #16 0x000000000045ff77 in fast_function (nk=<optimized out>, na=3,
>>>>> n=<optimized out>, pp_stack=0x7fff766ea320, func=<function at remote
>>>>> 0x1973c80>) at ../Python/ceval.c:4117
>>>>> #17 call_function (oparg=<optimized out>, pp_stack=0x7fff766ea320) at
>>>>> ../Python/ceval.c:4042
>>>>> #18 PyEval_EvalFrameEx (f=f at entry=
>>>>>  Frame 0x1c922f0, for file
>>>>> /home/daniel/dev/llvm/tools/lldb/test/lldbtest.py, line 1354, in
>>>>> runCmd
>>>>> (self=<TypeCompletionTestCase(subprocesses=[],
>>>>> doTearDownCleanups=False,
>>>>> lldbOption='--no-lldbinit', hooks=[<function at remote 0x1c55b18>],
>>>>> __failed__=False, dbg=<SBDebugger(this=<SwigPyObject at remote
>>>>> 0x1b91fc0>)
>>>>> at remote 0x1af4a90>, _cleanups=[], session=<StringIO(softspace=0,
>>>>> buflist=['\nos command: [[\'/bin/sh\', \'-c\', \'make clean; make
>>>>> MAKE_DSYM=NO ARCH=x86_64 CC=clang\']]\nwith pid: 3076\nstdout: rm -rf
>>>>> "a.out" ""  main.o main.d  \nclang++ -g -O0 -m64     -c -o main.o
>>>>> main.cpp\nclang++  main.o -g -O0 -m64     -o "a.out"\n\nstderr:
>>>>> \nretcode:
>>>>> 0\n\n', '\n', "runCmd: file a.out\noutput: Current executable set to
>>>>> 'a.out' (x86_64).\n\n", '\n', 'runCmd: breakpoint set -f "main.cpp" -l
>>>>> 37\noutput: Breakpoint 1: 2 locations.\n\n', '\n', 'Expecting pattern:
>>>>> ^Breakpoint (?P<bpno>[0-9]+): (?P<num_locations>[0-9]+)
>>>>> locations\\.$\nMatched\n', '\n', "runCmd: run\noutput: Process 3102
>>>>> launched: '/home/daniel/dev/llv...(truncated),
>>>>> throwflag=throwflag at entry=0) at ../Python/ceval.c:2666
>>>>> #19 0x0000000000467209 in PyEval_EvalCodeEx (co=0x1b900b0,
>>>>> globals=<optimized out>, locals=locals at entry=0x0, args=<optimized
>>>>> out>,
>>>>> argcount=argcount at entry=2, kws=0x1c913c8, kwcount=0, defs=0x19aae78,
>>>>> defcount=3, closure=closure at entry=0x0) at ../Python/ceval.c:3253
>>>>> #20 0x000000000045ff77 in fast_function (nk=<optimized out>, na=2,
>>>>> n=<optimized out>, pp_stack=0x7fff766ea5b0, func=<function at remote
>>>>> 0x1b94aa0>) at ../Python/ceval.c:4117
>>>>> #21 call_function (oparg=<optimized out>, pp_stack=0x7fff766ea5b0) at
>>>>> ../Python/ceval.c:4042
>>>>> #22 PyEval_EvalFrameEx (f=f at entry=
>>>>>  Frame 0x1c91200, for file
>>>>> 
>>>>> 
>>>>> /home/daniel/dev/llvm/tools/lldb/test/functionalities/type_completion/T
>>>>> es
>>>>> tT
>>>>> ypeCompletion.py, line 65, in type_completion_commands
>>>>> (self=<TypeCompletionTestCase(subprocesses=[],
>>>>> doTearDownCleanups=False,
>>>>> lldbOption='--no-lldbinit', hooks=[<function at remote 0x1c55b18>],
>>>>> __failed__=False, dbg=<SBDebugger(this=<SwigPyObject at remote
>>>>> 0x1b91fc0>)
>>>>> at remote 0x1af4a90>, _cleanups=[], session=<StringIO(softspace=0,
>>>>> buflist=['\nos command: [[\'/bin/sh\', \'-c\', \'make clean; make
>>>>> MAKE_DSYM=NO ARCH=x86_64 CC=clang\']]\nwith pid: 3076\nstdout: rm -rf
>>>>> "a.out" ""  main.o main.d  \nclang++ -g -O0 -m64     -c -o main.o
>>>>> main.cpp\nclang++  main.o -g -O0 -m64     -o "a.out"\n\nstderr:
>>>>> \nretcode:
>>>>> 0\n\n', '\n', "runCmd: file a.out\noutput: Current executable set to
>>>>> 'a.out' (x86_64).\n\n", '\n', 'runCmd: breakpoint set -f "main.cpp" -l
>>>>> 37\noutput: Breakpoint 1: 2 locations.\n\n', '\n', 'Expecting pattern:
>>>>> ^Breakpoint (?P<bpno>[0-9]+): (?P<num_locations>[0-9]+)
>>>>> locations\\.$\nMatched\n', '\n', "runCmd:...(truncated),
>>>>> throwflag=throwflag at entry=0) at ../Python/ceval.c:2666
>>>>> #23 0x0000000000467209 in PyEval_EvalCodeEx (co=0x1af2e30,
>>>>> globals=<optimized out>, locals=locals at entry=0x0, args=<optimized
>>>>> out>,
>>>>> argcount=argcount at entry=1, kws=0x0, kwcount=0, defs=0x0, defcount=0,
>>>>> closure=closure at entry=0x0) at ../Python/ceval.c:3253
>>>>> #24 0x000000000045ff77 in fast_function (nk=<optimized out>, na=1,
>>>>> n=<optimized out>, pp_stack=0x7fff766ea840, func=<function at remote
>>>>> 0x1ba1b18>) at ../Python/ceval.c:4117
>>>>> #25 call_function (oparg=<optimized out>, pp_stack=0x7fff766ea840) at
>>>>> ../Python/ceval.c:4042
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Dan
>>>>> 
>>>>> On 2013-04-30 7:42 PM, "Jim Ingham" <jingham at apple.com> wrote:
>>>>> 
>>>>>> Author: jingham
>>>>>> Date: Tue Apr 30 18:42:24 2013
>>>>>> New Revision: 180829
>>>>>> 
>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=180829&view=rev
>>>>>> Log:
>>>>>> Make "process handle -n true -s false" actually notifies of the
>>>>>> signal.
>>>>>> 
>>>>>> rdar://problem/12020085
>>>>>> 
>>>>>> Modified:
>>>>>> lldb/trunk/include/lldb/Target/ThreadPlanBase.h
>>>>>> lldb/trunk/source/Target/ThreadPlanBase.cpp
>>>>>> 
>>>>>> Modified: lldb/trunk/include/lldb/Target/ThreadPlanBase.h
>>>>>> URL: 
>>>>>> 
>>>>>> 
>>>>>> http://llvm.org/viewvc/llvm-project/lldb/trunk/include/lldb/Target/Thr
>>>>>> ea
>>>>>> dP
>>>>>> lanBase.h?rev=180829&r1=180828&r2=180829&view=diff
>>>>>> 
>>>>>> 
>>>>>> ======================================================================
>>>>>> ==
>>>>>> ==
>>>>>> ====
>>>>>> --- lldb/trunk/include/lldb/Target/ThreadPlanBase.h (original)
>>>>>> +++ lldb/trunk/include/lldb/Target/ThreadPlanBase.h Tue Apr 30
>>>>>> 18:42:24
>>>>>> 2013
>>>>>> @@ -37,6 +37,7 @@ public:
>>>>>>  virtual bool ValidatePlan (Stream *error);
>>>>>>  virtual bool PlanExplainsStop (Event *event_ptr);
>>>>>>  virtual bool ShouldStop (Event *event_ptr);
>>>>>> +    virtual Vote ShouldReportStop (Event *event_ptr);
>>>>>>  virtual bool StopOthers ();
>>>>>>  virtual lldb::StateType GetPlanRunState ();
>>>>>>  virtual bool WillStop ();
>>>>>> 
>>>>>> Modified: lldb/trunk/source/Target/ThreadPlanBase.cpp
>>>>>> URL: 
>>>>>> 
>>>>>> 
>>>>>> http://llvm.org/viewvc/llvm-project/lldb/trunk/source/Target/ThreadPla
>>>>>> nB
>>>>>> as
>>>>>> e.cpp?rev=180829&r1=180828&r2=180829&view=diff
>>>>>> 
>>>>>> 
>>>>>> ======================================================================
>>>>>> ==
>>>>>> ==
>>>>>> ====
>>>>>> --- lldb/trunk/source/Target/ThreadPlanBase.cpp (original)
>>>>>> +++ lldb/trunk/source/Target/ThreadPlanBase.cpp Tue Apr 30 18:42:24
>>>>>> 2013
>>>>>> @@ -78,6 +78,22 @@ ThreadPlanBase::PlanExplainsStop (Event
>>>>>>      return true;
>>>>>> }
>>>>>> 
>>>>>> +Vote
>>>>>> +ThreadPlanBase::ShouldReportStop(Event *event_ptr)
>>>>>> +{
>>>>>> +    StopInfoSP stop_info_sp = GetPrivateStopReason();
>>>>>> +    if (stop_info_sp)
>>>>>> +    {
>>>>>> +        bool should_notify = stop_info_sp->ShouldNotify(event_ptr);
>>>>>> +        if (should_notify)
>>>>>> +            return eVoteYes;
>>>>>> +        else
>>>>>> +            return eVoteNoOpinion;
>>>>>> +    }
>>>>>> +    else
>>>>>> +        return eVoteNoOpinion;
>>>>>> +}
>>>>>> +
>>>>>> bool
>>>>>> ThreadPlanBase::ShouldStop (Event *event_ptr)
>>>>>> {
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> lldb-commits mailing list
>>>>>> lldb-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
>> 
> 
> <180829_process_lock_hang.txt>





More information about the lldb-commits mailing list