[lldb-dev] More linux process control and IOHandler races

Matthew Gardiner mg11 at csr.com
Thu Aug 7 06:29:17 PDT 2014


Hi Shawn,

I spent some time today looking at how to arrange for ShouldBroadcast to 
return false for this first stop. I managed to produce a quick hack for 
this (i.e. just counted the number of stops), but due to other 
distractions (from the rest of my job) I didn't get that far into 
discovering a nice way of achieving this...
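
A minimal sketch of what such a counting hack amounts to, written as a standalone toy model rather than lldb code. LaunchStopFilter, its members, and the existing_decision parameter are hypothetical names, and the sketch assumes that only the first stop after a launch needs suppressing:

```cpp
#include <cassert>

// Toy model only: LaunchStopFilter and its members are hypothetical
// names, not part of lldb.
enum class State { Launching, Stopped, Running };

class LaunchStopFilter {
public:
    // existing_decision stands in for whatever the normal broadcast
    // logic would have returned for this event (an assumption).
    bool ShouldBroadcast(State state, bool existing_decision) {
        if (state != State::Stopped)
            return existing_decision;
        ++m_stop_count;
        // Swallow only the first stop seen after launch.
        if (m_stop_count == 1)
            return false;
        return existing_decision;
    }

    unsigned StopCount() const { return m_stop_count; }

private:
    unsigned m_stop_count = 0;
};
```

Counting stops like this is fragile, since it hard-codes which stop is the launch stop, which is why a cleaner mechanism is still wanted.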

What I did discover is that, with my build, just doing "process launch"
results in 3 stops (and 3 private resumes). That in itself I find
surprising, since I was under the impression that I should see the
inferior stop just once, when the exec is trapped following PTRACE_TRACEME.

I then discovered that for each of these 3 stops ShouldBroadcast calls
Thread::ShouldStop, which returns true for the first stop and false for
the other 2. Looking into these behaviour differences, I found that from
within Thread::ShouldStop we call into the following:

     StopInfoSP private_stop_info (GetPrivateStopInfo());
     if (private_stop_info && private_stop_info->ShouldStopSynchronous(event_ptr) == false)
     {

and also

bool over_ride_stop = current_plan->ShouldAutoContinue(event_ptr);

The results of these two calls seem to account for the different
true/false returns. I'll return to spend a bit more time on this
tomorrow. Let me know if you get any further in a similar vein!
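
A minimal sketch of how these two checks might combine, as a standalone toy model rather than the real Thread::ShouldStop. StopInfoModel, PlanModel and ShouldStopModel are hypothetical names, and the way the two results are combined here is an assumption, not something confirmed from the lldb source:

```cpp
#include <cassert>

// Toy stand-ins, not lldb types.
struct StopInfoModel {
    bool should_stop_synchronous;  // models ShouldStopSynchronous(event_ptr)
};

struct PlanModel {
    bool wants_stop;     // what the current plan would otherwise decide
    bool auto_continue;  // models ShouldAutoContinue(event_ptr)
};

// Assumed combination: a stop info that declines a synchronous stop
// short-circuits to "don't stop"; otherwise the plan's auto-continue
// request overrides the plan's own wish to stop.
bool ShouldStopModel(const StopInfoModel *stop_info, const PlanModel &plan) {
    if (stop_info && !stop_info->should_stop_synchronous)
        return false;
    const bool over_ride_stop = plan.auto_continue;
    return plan.wants_stop && !over_ride_stop;
}
```

Under this model, either check alone is enough to turn a stop into a "false" return, which would be consistent with one stop returning true and the other two returning false.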

thanks
Matt



Shawn Best wrote:
> Matt,
>
> I think you are probably right, although there are other places where 
> it directly calls SetPublicState().   I was wondering about the 
> possibility there could be other listeners waiting for a broadcast 
> public Stop event.  Is that a possibility?
>
> Some others here were investigating some unit tests that were failing 
> intermittently (StopHook).  Their description of the problem sounds 
> unrelated to the launch code, but this patch also magically fixes that.
>
> Shawn.
>
> On 8/6/2014 6:26 AM, Matthew Gardiner wrote:
>> Shawn,
>>
>> Like I said earlier your patch worked. However I think the right fix 
>> is to arrange that ShouldBroadcast returns false for this first stop. 
>> I believe this, because firstly no stops should be reported here 
>> since the user is only interested in launching a program, and 
>> additionally because it enables us to fix lldb without removing the 
>> call to HandlePrivateEvent. This, I think, is important to preserve 
>> as the central point for process state change handling.
>>
>> Matt
>>
>>
>>
>> Shawn Best wrote:
>>> Hi Matthew,
>>>
>>> I have also been tracking this bug.  I believe there are other bugs 
>>> in the unit tests failing indirectly because of this.  I also have a 
>>> patch that will fix it, but was sitting on it until the other one 
>>> landed.  These bugs do not show up on OSX since the inferiors are 
>>> launched separately then attached to.
>>>
>>> The first odd thing the launching code does is push an IOHandler 
>>> when it sees the state transition to 'launching'. This is odd 
>>> because I believe the launching program will always come up in a 
>>> stopped state which will immediately pop the IOHandler.
>>>
>>> At launch, the process comes up in the stopped state.  The launch 
>>> code manually calls HandlePrivateEvent() with the stop event, which 
>>> then broadcasts the Event.  When HandleProcessEvent gets the public 
>>> stop, it dumps out the current thread state just as if an executing 
>>> inferior hit a breakpoint and stopped.
>>>
>>> One way to fix this would be:
>>>
>>> 1. Don't push io handler when state is 'launching'
>>> 2. Instead of manually calling HandlePrivateEvent, call 
>>> SetPublicState().
>>>
>>> Alternately, we could try and debug why ShouldBroadcast() returns 
>>> true, but that appears to be by design since it is expecting the 
>>> public stop event to pop the IOHandler that had been pushed when 
>>> launching.
>>>
>>> I have attached a patch demonstrating this.  In conjunction with the 
>>> other patch for IOHandler race condition, it will fix a bunch of 
>>> this kind of behaviour.
>>>
>>> Shawn.
>>>
>>> On 8/5/2014 6:59 AM, Matthew Gardiner wrote:
>>>> Jim,
>>>>
>>>> I've been trying to debug an issue (I see it on 64-bit linux)
>>>> where I do "target create" and "process launch" and, despite not
>>>> requesting *stop at entry*, the first stop (which I believe is just
>>>> the initial ptrace attach stop) is reported to the lldb command
>>>> line. I added some fprintf calls to Process::HandlePrivateEvent,
>>>> which count the number of eStateStopped events seen and report
>>>> whether ShouldBroadcastEvent returns true for each event. Here's the
>>>> output from my program with the diagnostics:
>>>>
>>>> (lldb) target create ~/me/i64-hello.elf
>>>> Current executable set to '~/me/i64-hello.elf' (x86_64).
>>>> (lldb) process launch
>>>> MG Process::HandlePrivateEvent launching stopped_count 0 should_broadcast 1
>>>> Process 31393 launching
>>>> MG Process::HandlePrivateEvent stopped stopped_count 1 should_broadcast 1
>>>> MG Process::HandlePrivateEvent running stopped_count 1 should_broadcast 1
>>>> Process 31393 launched: 'i64-hello.elf' (x86_64)
>>>> Process 31393 stopped
>>>> * thread #1: tid = 31393, 0x0000003675a011f0, name = 'i64-hello.elf', stop reason = trace
>>>>     frame #0: 0x0000003675a011f0
>>>> -> 0x3675a011f0:  movq   %rsp, %rdi
>>>>    0x3675a011f3:  callq  0x3675a046e0
>>>>    0x3675a011f8:  movq   %rax, %r12
>>>>    0x3675a011fb:  movl   0x21eb97(%rip), %eax
>>>> (lldb) MG Process::HandlePrivateEvent stopped stopped_count 2 should_broadcast 0
>>>> MG Process::HandlePrivateEvent running stopped_count 2 should_broadcast 0
>>>> MG Process::HandlePrivateEvent stopped stopped_count 3 should_broadcast 0
>>>> MG Process::HandlePrivateEvent running stopped_count 3 should_broadcast 0
>>>>
>>>> In summary, lldb reports the inferior to be stopped (even though 
>>>> /proc/pid/status and lldb "target list" say it is running). Clearly 
>>>> this is wrong (hence my earlier post).
>>>>
>>>> Am I correct in assuming that when ShouldBroadcastEvent returns
>>>> true, this means that lldb should show this event to the debug user
>>>> (and thus hide other events where ShouldBroadcastEvent returns false)?
>>>>
>>>> What puzzled me was why ShouldBroadcastEvent returns true for this
>>>> very first stop. Is this a bug?
>>>>
>>>> I also spent some time looking at ShouldBroadcastEvent and saw that this:
>>>>
>>>>         case eStateStopped:
>>>>         case eStateCrashed:
>>>>         case eStateSuspended:
>>>>         {
>>>>          ....
>>>>                 if (was_restarted || should_resume || m_resume_requested)
>>>>                 {
>>>>
>>>> evaluates as false, and hence the PrivateResume code is not 
>>>> called... does this seem buggy to you for this very first stop?
>>>>
>>>> I thought I'd try asking you, since in a previous mail from Greg, 
>>>> he cited you as being a thread-plan expert. (Hope that's ok!). I'd 
>>>> really appreciate your help in clarifying the above questions for 
>>>> me, and if you have time, giving me some ideas as to how to trace 
>>>> this one further e.g. how m_thread_list.ShouldStop and 
>>>> m_thread_list.ShouldReportStop should behave, etc.
>>>>
>>>> thanks for your help
>>>> Matt
>>>>
>>>> Matthew Gardiner wrote:
>>>>> Folks,
>>>>>
>>>>> In addition to the overlapping prompt race Shawn Best and I
>>>>> are looking at, I'm seeing another issue where, if I launch a
>>>>> process, I get a stop (presumably the initial stop) being reported
>>>>> to the UI.
>>>>>
>>>>> (lldb) target create ~/mydir/i64-hello.elf
>>>>> Current executable set to '~/mydir/i64-hello.elf' (x86_64).
>>>>> (lldb) process launch
>>>>> Process 27238 launching
>>>>> Process 27238 launched: '64-hello.elf' (x86_64)
>>>>> Process 27238 stopped
>>>>> * thread #1: tid = 27238, 0x0000003675a011f0, name = 'i64-hello.elf'
>>>>>     frame #0:
>>>>> (lldb) target list
>>>>> Current targets:
>>>>> * target #0: i64-hello.elf ( arch=x86_64-unknown-linux, platform=host, pid=27238, state=running )
>>>>> (lldb)
>>>>>
>>>>> As you can see, the "target list" output reflects that the process
>>>>> is running, which I confirmed by looking at /proc/27238/status.
>>>>>
>>>>> Anyone else seeing this?
>>>>>
>>>>> thanks
>>>>> Matt
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> lldb-dev mailing list
>>>>> lldb-dev at cs.uiuc.edu <mailto:lldb-dev at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>>>
>>>>>
>>>>
>>>
>>
>



