[lldb-dev] "step" threading issues
    jingham at apple.com 
       
    Fri May  3 08:26:34 PDT 2013
    
    
  
On May 3, 2013, at 1:53 AM, Carlo Kok <ck at remobjects.com> wrote:
> On 2-5-2013 19:47, jingham at apple.com wrote:
>> 
>> On May 2, 2013, at 10:20 AM, Carlo Kok <ck at remobjects.com> wrote:
>> 
>>> On 2-5-2013 18:47, jingham at apple.com wrote:
>>>> 
>>>> On May 2, 2013, at 3:41 AM, Carlo Kok <ck at remobjects.com> wrote:
>>>> 
>>>>> On 2-5-2013 12:03, Carlo Kok wrote:
>>>>>> On 29-4-2013 22:15, Carlo Kok wrote:
>>>>>>> On 29-4-2013 18:41, Carlo Kok wrote:
>>>>>>>> On 29-4-2013 18:23, Greg Clayton wrote:
>>>>>>>>> You should only get one stopped event unless you are
>>>>>>>>> hitting a breakpoint that continues your target. In
>>>>>>>>> this case the eStateStopped event would be a
>>>>>>>>> "restarted" event which can be found out by:
>>>>>>>>> 
>>>>>>>>> static bool SBProcess::GetRestartedFromEvent (const
>>>>>>>>> lldb::SBEvent &event);
>>>>>>>>> 
>>>>>>>>> This means the program stopped but restarted
>>>>>>>>> automatically. You should never see two eStateStopped
>>>>>>>>> events in a row; if you do, please try to reproduce it
>>>>>>>>> on a Mac target and file a bug.
>>>>>>>> 
>>>>>>>> Indeed, that's my problem. I get several of those with
>>>>>>>> reason "stop" for StepInto. I'm up to date on last week's
>>>>>>>> trunk update to Windows, but I'll try to compile lldb on
>>>>>>>> OS X to see if I can reproduce it there.
>>>>>>> 
>>>>>>> 
>>>>>>> I get this from the log: http://pastebin.com/msnqdi6P
>>>>>>> 
>>>>>>> At line 1656 it resumes the process, yet it still broadcasts a
>>>>>>> "stop", which makes no sense to me, and I cannot find any way
>>>>>>> this could happen. I do know it doesn't happen if I step
>>>>>>> through it slowly.
>>>>>> 
>>>>>> I've narrowed it down to this (line 1622):
>>>>>> 
>>>>>> ThreadList::ShouldReportStop 3 threads
>>>>>> Thread::ShouldReportStop() tid = 0x1a03: returning vote  for
>>>>>> complete stack's back plan ... ThreadList::ShouldReportStop
>>>>>> returning yes
>>>>>> 
>>>>>> ShouldReportStop returns "Yes" because
>>>>>> m_completed_plan_stack.size() > 0, in which case it returns:
>>>>>>
>>>>>>     return m_completed_plan_stack.back()->ShouldReportStop (event_ptr);
>>>>>> 
>>>>>> Now the last thing in that is a:
>>>>>>
>>>>>>     Vote
>>>>>>     ThreadPlanCallFunction::ShouldReportStop(Event *event_ptr)
>>>>>>     {
>>>>>>         if (m_takedown_done || IsPlanComplete())
>>>>>>             return eVoteYes;    << goes here.
>>>>>>         else
>>>>>>             return ThreadPlan::ShouldReportStop(event_ptr);
>>>>>>     }
>>>>>> 
>>>>>> Which is the call to
>>>>>> g_lookup_implementation_no_stret_function_code.
>>>>>> 
>>>>>> Does anyone have an idea what else I can check to solve this
>>>>>> step into issue?
>>>>>> 
>>>>>> I've not been able to reproduce this on OS X or Linux.
>>>>> 
>>>>> If I change:
>>>>>
>>>>>     if (m_completed_plan_stack.size() > 0)
>>>>>
>>>>> to:
>>>>>
>>>>>     if (m_completed_plan_stack.size() > 0 && m_plan_stack.size() == 0)
>>>>> 
>>>>> in Thread::ShouldReportStop, it works perfectly.
>>>> 
>>>> Yes, but that's because then this branch will never get called
>>>> (m_plan_stack.size() is never 0; there's always a base plan).
>>>> 
>>>> So this isn't a correct fix.
>>> 
>>> I figured it wouldn't be that simple. However, it cannot be right
>>> that it (a) resumes the process and (b) returns "yes, let the public
>>> API know we stopped" at the same time.
>> 
>> I disagree.  You need that for instance to implement "process handle
>> SOMESIG --stop false --print true".  You are auto-continuing, yet you
>> want to tell the event-loop runner that this happened so that it can
>> notify about it however is appropriate.  For instance in the case of
>> the lldb driver we listen to this event and print some bit to the
>> console.  But a GUI might want to do this in some different way, so I
>> don't want to just dump something to stdout and hope somebody
>> notices...
>> 
>> Also, we send an event if a breakpoint condition or command is hit but
>> the process continues, so that a UI would know to update hit counts in
>> its breakpoint display.
>> 
>> In these cases the stopped event always says it restarted (you can
>> query this with the Process::ProcessEventData::GetRestartedFromEvent
>> API).  You just have to make sure you check that any time you get a
>> stopped event.
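
To make that check concrete, here is a minimal sketch of how an event loop might sort its events. It assumes nothing about the real SB classes: State, Event, Action, ClassifyEvent and CountRealStops are hypothetical stand-ins invented for this example, and the restarted field only models what GetRestartedFromEvent would report for a given event.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT lldb types.
enum class State { Running, Stopped, Exited };

struct Event {
    State state;
    bool restarted;  // models the answer GetRestartedFromEvent would give
};

enum class Action { Ignore, NoteAutoContinue, ReportStop, ReportExit };

// Decide what a client event loop should do with a single process event.
Action ClassifyEvent(const Event &event) {
    // In this model the restarted flag is only meaningful on stopped events.
    assert(!event.restarted || event.state == State::Stopped);
    switch (event.state) {
    case State::Stopped:
        // Stopped + restarted: the process stopped and was resumed
        // automatically, so the user must not be told it is stopped.
        return event.restarted ? Action::NoteAutoContinue : Action::ReportStop;
    case State::Exited:
        return Action::ReportExit;
    case State::Running:
        return Action::Ignore;
    }
    return Action::Ignore;
}

// Count the stops a user should actually see in a sequence of events.
int CountRealStops(const std::vector<Event> &events) {
    int stops = 0;
    for (const Event &event : events) {
        if (ClassifyEvent(event) == Action::ReportStop)
            ++stops;
    }
    return stops;
}
```

With this shape, several stopped events in a row during a StepInto collapse to one user-visible stop as long as all but the last carry the restarted flag.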
> 
> 
> Ah, I was unaware of that call. It does seem to fix (at least part of) it.
>> 
>> I have to fix the ThreadPlan.h docs to be more clear about how
>> ShouldReportStop works, however (and I should change its name to be a
>> little more explicit.)  ShouldReportStop only gets called if the
>> process is going to auto-continue after the stop.  That makes sense,
>> I can't see why you would want to have the process really stop and
>> NOT tell the agent running the event loop about it.  But it isn't
>> clear from the name.  Whoever did ThreadPlanCallFunction probably
>> didn't realize this, since it shouldn't be returning true from
>> ShouldReportStop.  After all, if some thread plan ran a function and
>> decided on the basis of the results of that function call to
>> auto-continue, then there's no reason to tell the outside world about
>> that.  I'd have to think a little more carefully to be 100% sure that
>> there aren't any cases where this would be useful, but I can't think
>> of any right now.
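
The rule described above (a real stop is always reported, and the ShouldReportStop vote only matters when auto-continuing) can be sketched as a small decision function. This is a simplified model, not the lldb implementation: Vote, StopOutcome and DecideStop are hypothetical names, and treating anything other than an explicit Yes as "do not report" is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical model for illustration; not the lldb implementation.
enum class Vote { No, Yes, NoOpinion };

struct StopOutcome {
    bool resumes;     // the process is continued automatically
    bool broadcasts;  // a stopped event reaches the event loop
    bool restarted;   // that event is flagged as "restarted"
};

// should_stop: the thread plans decided the process should really stop.
// report_vote: the ShouldReportStop vote, consulted only when auto-continuing.
StopOutcome DecideStop(bool should_stop, Vote report_vote) {
    StopOutcome outcome{};
    if (should_stop) {
        // A real stop is always reported; the vote is never consulted.
        outcome = {false, true, false};
    } else {
        // Assumption: only an explicit Yes makes the auto-continue visible.
        const bool report = (report_vote == Vote::Yes);
        outcome = {true, report, report};
    }
    // A "restarted" flag only makes sense on an event that is actually sent.
    assert(!outcome.restarted || outcome.broadcasts);
    return outcome;
}
```

In this model, a function-call plan that votes Yes while the process auto-continues produces exactly the extra "restarted" stopped event discussed in this thread.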
>> 
>> OTOH, it looks like the Linux port of LLDB is for some reason not
>> resilient to these "auto-continue" events.  That puzzles me, since
>> this should all be handled in generic execution control logic, and
>> this sort of thing causes no problems on OS X.
> 
> I "solved" the issue with GetRestartedFromEvent, but then it crashed in
> the slim multiple-reader/single-writer lock code, which on Windows uses
> an internal Windows API. It appears that Process::SetPublicState
> (StateType new_state) did a write unlock twice without a lock in
> between (and Windows then crashes on the next lock operation). This
> might cause issues on other OSes too, since I doubt pthread guarantees
> that this works.

Mac OS X seems not to care about this, but it is still not something we should allow to happen.  I'm currently going through and cleaning up all the cases where this happens in the current testsuite.  If my brain doesn't fail me, I should be done early next week.
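
As an illustration of the failure mode only (this is not the cleanup itself), here is a minimal sketch of a wrapper that refuses to release a write lock it does not hold. RunLock and its methods are hypothetical names for this example. std::mutex stands in for the reader/writer lock to keep the sketch short, and the wrapper assumes every call comes from the same thread.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical sketch; not the lldb lock class. Assumes all calls are
// made from one thread, since the ownership flag itself is unsynchronized.
class RunLock {
public:
    // Take the write side when the process starts running.
    // Returns false if it is already held.
    bool SetRunning() {
        if (m_write_lock.owns_lock())
            return false;
        m_write_lock.lock();
        return true;
    }

    // Release the write side when the process stops.
    // Returns false (and does nothing) if it is not held, so a second
    // "unlock" without a lock in between becomes a harmless no-op.
    bool SetStopped() {
        if (!m_write_lock.owns_lock())
            return false;
        m_write_lock.unlock();
        assert(!m_write_lock.owns_lock());
        return true;
    }

    bool IsRunning() const { return m_write_lock.owns_lock(); }

private:
    std::mutex m_mutex;
    // std::unique_lock tracks ownership; defer_lock leaves it unlocked.
    std::unique_lock<std::mutex> m_write_lock{m_mutex, std::defer_lock};
};
```

A guard like this hides the symptom rather than the cause, so a false return from SetStopped is best treated as a bug report about the caller.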
Jim
    
    