[lldb-dev] "step" threading issues

Fri May 3 01:53:19 PDT 2013

On 2-5-2013 19:47, jingham at apple.com wrote:
>
> On May 2, 2013, at 10:20 AM, Carlo Kok <ck at remobjects.com> wrote:
>
>> On 2-5-2013 18:47, jingham at apple.com wrote:
>>>
>>> On May 2, 2013, at 3:41 AM, Carlo Kok <ck at remobjects.com> wrote:
>>>
>>>> On 2-5-2013 12:03, Carlo Kok wrote:
>>>>> On 29-4-2013 22:15, Carlo Kok wrote:
>>>>>> On 29-4-2013 18:41, Carlo Kok wrote:
>>>>>>> On 29-4-2013 18:23, Greg Clayton wrote:
>>>>>>>> You should only get one stopped event unless you are
>>>>>>>> hitting a breakpoint that continues your target. In
>>>>>>>> this case the eStateStopped event would be a
>>>>>>>> "restarted" event which can be found out by:
>>>>>>>>
>>>>>>>> static bool SBProcess::GetRestartedFromEvent (const
>>>>>>>> lldb::SBEvent &event);
>>>>>>>>
>>>>>>>> This means the program stopped but restarted
>>>>>>>> automatically. You should never see two eStateStopped
>>>>>>>> events in a row; if you do, please try to reproduce it
>>>>>>>> on a Mac target and file a bug.
>>>>>>>
>>>>>>> Indeed, that's my problem. I get several of those with
>>>>>>> reason "stop" for StepInto. I'm up to date on last week's
>>>>>>> trunk update to Windows, but I'll try to compile lldb on
>>>>>>> OS X to see if I can reproduce it there.
>>>>>>
>>>>>>
>>>>>> I get this from the log: http://pastebin.com/msnqdi6P
>>>>>>
>>>>>> At line 1656 it resumes it, yet still broadcasts a "stop",
>>>>>> which makes no sense to me, and I can't find any way this
>>>>>> could happen. I do know it doesn't happen if I step
>>>>>> through it slowly.
>>>>>
>>>>> I've narrowed it down to this (line 1622):
>>>>>
>>>>> ThreadList::ShouldReportStop 3 threads
>>>>> Thread::ShouldReportStop() tid = 0x1a03: returning vote  for complete stack's back plan
>>>>> ...
>>>>> ThreadList::ShouldReportStop returning yes
>>>>>
>>>>> ShouldReportStop returns "Yes" because
>>>>> m_completed_plan_stack.count > 0, in which case it returns:
>>>>>
>>>>>     return m_completed_plan_stack.back()->ShouldReportStop (event_ptr);
>>>>>
>>>>> Now the last thing in that is a:
>>>>>
>>>>>     Vote
>>>>>     ThreadPlanCallFunction::ShouldReportStop(Event *event_ptr)
>>>>>     {
>>>>>         if (m_takedown_done || IsPlanComplete())
>>>>>             return eVoteYes; // << goes here.
>>>>>         else
>>>>>             return ThreadPlan::ShouldReportStop(event_ptr);
>>>>>     }
>>>>>
>>>>> Which is the call to
>>>>> g_lookup_implementation_no_stret_function_code.
>>>>>
>>>>> Does anyone have an idea what else I can check to solve this
>>>>> step into issue?
>>>>>
>>>>> I've not been able to reproduce this on OS X or Linux.
>>>>
>>>> If I change:
>>>>
>>>>     if (m_completed_plan_stack.size() > 0)
>>>>
>>>> to:
>>>>
>>>>     if (m_completed_plan_stack.size() > 0 && m_plan_stack.size() == 0)
>>>>
>>>> in Thread::ShouldReportStop, it works perfectly.
>>>
>>> Yes, but that's because then this branch will never get called
>>> (m_plan_stack.size() is never 0; there's always a base plan).
>>>
>>> So this isn't a correct fix.
>>
>> I figured it wouldn't be that simple. However, it cannot be right
>> that it (a) resumes the process and (b) returns "yes, let the public
>> API know we stopped" at the same time.
>
> I disagree.  You need that for instance to implement "process handle
> SOMESIG --stop false --print true".  You are auto-continuing, yet you
> want to tell the event-loop runner that this happened so that it can
> notify about it however is appropriate.  For instance in the case of
> the lldb driver we listen to this event and print some bit to the
> console.  But a GUI might want to do this in some different way, so I
> don't want to just dump something to stdout and hope somebody
> notices...
>
> Also we send an event if a breakpoint condition or command is hit but
> continues the process so that a UI would know to update hit counts in
> its breakpoint display.
>
> The stopped event always says it restarted (you can query this with
> the Process::ProcessEventData::GetRestartedFromEvent API.)  You just
> have to make sure you check that any time you get a stopped event.

Ah, I was unaware of that call. It does seem to fix at least part of
the problem.
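
To illustrate the check, here is a minimal sketch of an event handler that
ignores auto-continued stops. State and Event are hypothetical stand-ins so
that it compiles without the LLDB headers; in a real client the restarted
flag would presumably come from SBProcess::GetRestartedFromEvent(event)
rather than from a struct field.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for lldb::StateType and lldb::SBEvent.
enum class State { Running, Stopped, Exited };

struct Event {
    State state;
    bool restarted;  // stand-in for SBProcess::GetRestartedFromEvent(event)
};

// True only for a stop that was not followed by an automatic resume.
bool IsRealStop(const Event &event) {
    return event.state == State::Stopped && !event.restarted;
}

// Count the stops a UI should present to the user as actual stops.
std::size_t CountRealStops(const std::vector<Event> &events) {
    std::size_t count = 0;
    for (const Event &event : events) {
        if (IsRealStop(event))
            ++count;
    }
    return count;
}
```

With this filter, a StepInto that produces several "restarted" stops followed
by one final stop is reported to the user only once.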
>
> I have to fix the ThreadPlan.h docs to be more clear about how
> ShouldReportStop works, however (and I should change its name to be a
> little more explicit.)  ShouldReportStop only gets called if the
> process is going to auto-continue after the stop.  That makes sense,
> I can't see why you would want to have the process really stop and
> NOT tell the agent running the event loop about it.  But it isn't
> clear from the name.  Whoever did ThreadPlanCallFunction probably
> didn't realize this, since it shouldn't be returning true from
> ShouldReportStop.  After all, if some thread plan ran a function and
> decided on the basis of the results of that function call to
> auto-continue, then there's no reason to tell the outside world about
> that.  I'd have to think a little more carefully to be 100% sure that
> there aren't any cases where this would be useful, but I can't think
> of any right now.
>
> OTOH, it looks like the Linux port of LLDB is for some reason not
> resilient to these "auto-continue" events.  That puzzles me, since
> this should all be handled in generic execution control logic, and
> this sort of thing causes no problems on OS X.

I "solved" the issue with GetRestartedFromEvent, but then it crashed in
the slim multi-read/single-write lock code, which on Windows uses an
internal Windows API. It appears that Process::SetPublicState (StateType
new_state) did a write unlock twice without a lock in between (and
Windows then crashes on the next lock operation). This might cause
issues on other OSes too, since I doubt pthread guarantees that this
works.
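
For what it's worth, POSIX leaves pthread_rwlock_unlock undefined when the
calling thread does not hold the lock, so a double unlock is not safe there
either. Below is a minimal sketch of one way to make such a bug visible,
assuming the write lock is held through a std::unique_lock, which tracks
ownership and throws on a second unlock. GuardedState and
SetStateAndDetectDoubleUnlock are hypothetical names for illustration, not
LLDB code.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <system_error>

// Hypothetical stand-in for a value protected by a read/write lock.
// std::shared_timed_mutex keeps the sketch buildable as C++14;
// std::shared_mutex can be used the same way in C++17.
struct GuardedState {
    std::shared_timed_mutex lock;
    int value = 0;
};

// Writes under the exclusive lock, then deliberately unlocks twice.
// Returns true if the second unlock was detected and rejected.
bool SetStateAndDetectDoubleUnlock(GuardedState &s, int new_value) {
    std::unique_lock<std::shared_timed_mutex> writer(s.lock);
    s.value = new_value;
    writer.unlock();          // first unlock: releases the write lock
    try {
        writer.unlock();      // second unlock: the guard no longer owns it
    } catch (const std::system_error &) {
        return true;          // reported instead of corrupting the lock
    }
    return false;
}
```

Because the guard refuses the second unlock, the underlying lock is never
released twice and stays usable for the next lock operation.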