[Lldb-commits] [PATCH] race condition calling PushProcessIOHandler

Shawn Best sbest at blueshiftinc.com
Wed Aug 6 18:12:59 PDT 2014


Matt,

You asked earlier about the significance of the 2000us timeout. When I 
was testing an earlier patch:

do
{
     usleep(0);        // force a context switch

     if (state == stopped)
         break;

} while (curTime < timeout);

I put some timers and counters in the loop to see how it was behaving.
Most of the time, a single context switch was enough to avoid the race
condition.  Occasionally it would take 3 context switches (in the
neighborhood of a few hundred microseconds) for the flag to be set.

The 2000us timeout was my rough guess at a number that would give an
order of magnitude more time than needed to cover the race, while
minimizing the downside to lldb in case some condition I had not
anticipated actually hit the timeout.  I never saw the timeout hit, so
it could theoretically be much larger.

Shawn.
On 8/4/2014 10:01 PM, Matthew Gardiner wrote:
> Thanks for this update, Jim!
>
> Hopefully, we'll have it all fixed for when he returns ;-)
>
> Matt
>
>
> jingham at apple.com wrote:
>> Note, Greg is on vacation this week. Dunno if he'll chime in from the 
>> beach, but he's sure to have opinions.  Just wanted to make sure 
>> folks didn't think his silence indicated he didn't...
>>
>> Jim
>>
>>> On Aug 4, 2014, at 5:41 AM, Matthew Gardiner <mg11 at csr.com> wrote:
>>>
>>> Folks,
>>>
>>> I have been testing Shawn's patch some more today.
>>>
>>> (lldb) target create 
>>> ~/src/main/devtools/main/heracles/csrgdbserver/test/examples/simple/i64-hello.elf
>>> ...
>>> (lldb) process launch -s
>>> ...
>>> (lldb) process continue
>>> ... lots of output ...
>>>
>>> Control-c pressed
>>> (lldb)
>>>
>>> Above I found that if I launch the process with "stop at entry" and 
>>> then continue, the prompt returns correctly when I press Ctrl-c to 
>>> interrupt. In that session I could "process continue" and Ctrl-c 
>>> repeatedly and the prompt would return as expected. Using "next" to 
>>> step the inferior also resulted in the debugger's prompt returning.
>>>
>>> However, if I launched without -s
>>>
>>> (lldb) target create 
>>> ~/src/main/devtools/main/heracles/csrgdbserver/test/examples/simple/i64-hello.elf
>>> ...
>>> (lldb) process launch
>>> ... lots of output ...
>>>
>>> Control-c pressed
>>>
>>> then the prompt was absent, and hence broken.
>>>
>>> Also if I create my target, set a breakpoint, then launch:
>>>
>>> (lldb) target create 
>>> ~/src/main/devtools/main/heracles/csrgdbserver/test/examples/simple/i64-hello.elf
>>> ...
>>> (lldb) br s -f forever.c -l 12
>>> ...
>>> (lldb) process launch
>>> Process 10292 launching
>>> Process 10292 launched: 
>>> '/home/mg11/src/main/devtools/main/heracles/csrgdbserver/test/examples/simple/i64-hello.elf' 
>>> (x86_64)
>>> Process 10292 stopped
>>> * thread #1: tid = 10292, 0x0000003675a011f0, name = 'i64-hello.elf'
>>>     frame #0:
>>> Process 10292 stopped
>>> * thread #1: tid = 10292, 0x0000000000400548 i64-hello.elf`f2(x=8, 
>>> y=0) + 24 at forever.c:12, name = 'i64-hello.elf', stop reason = 
>>> breakpoint 1.1
>>>     frame #0: 0x0000000000400548 i64-hello.elf`f2(x=8, y=0) + 24 at 
>>> forever.c:12
>>>    9
>>>    10          x += 8;
>>>    11          y *= x;
>>> -> 12          buff[0] = y;
>>>    13
>>>    14          x = buff[0] + buff[1];
>>>    15          return x;
>>>
>>> Again, no prompt.
>>>
>>> Attaching to a stopped process using gdb-remote, also results in me 
>>> having no prompt:
>>>
>>> (lldb) target create 
>>> ~/src/main/devtools/main/heracles/csrgdbserver/test/examples/simple/i64-hello.elf
>>> ...
>>> (lldb) gdb-remote 7777
>>> Process 10327 stopped
>>> * thread #1: tid = 10327, 0x0000003675a011f0, name = 
>>> 'i64-hello.elf', stop reason = signal SIGSTOP
>>>     frame #0: 0x0000003675a011f0
>>> -> 0x3675a011f0:  movq   %rsp, %rdi
>>>    0x3675a011f3:  callq  0x3675a046e0
>>>    0x3675a011f8:  movq   %rax, %r12
>>>    0x3675a011fb:  movl   0x21eb97(%rip), %eax
>>>
>>> So, as I alluded to previously, there are a lot of scenarios where 
>>> the IOHandler prompt bug manifests. Now that I have read the code, 
>>> I'm surprised that it does not happen more often and on more 
>>> architectures. Out of interest, my test inferior basically spins 
>>> producing lots of output. That *may* be one reason why I can provoke 
>>> these bugs so easily. I'm attaching my test code in case it helps 
>>> others witness all of the issues which I'm seeing. I'll see if there 
>>> are any simple tweaks I can make (in the spirit of Shawn's patch) to 
>>> fix the above bugs.
>>>
>>> Matt
>>>
>>>
>>> Matthew Gardiner wrote:
>>>> "I think the last suggestion of moving the sync point to 
>>>> Debugger::HandleProcessEvent() would defeat the purpose of the 
>>>> patch since it is called from the EventHandler thread.  The race 
>>>> condition is the main thread returning up the call stack to start 
>>>> another round of commandIO handling before PushProcessIOHandler() 
>>>> gets called."
>>>>
>>>> I agree with this remark of Shawn's. The whole point of this issue 
>>>> is to block the main thread i.e. the one which calls 
>>>> IOHandlerEditline::GetLine, which is where the (lldb) prompt is 
>>>> printed, i.e.
>>>>
>>>> #0  lldb_private::IOHandlerEditline::GetLine
>>>> #1  0x00007ffff770aceb in lldb_private::IOHandlerEditline::Run
>>>> #2  0x00007ffff76f546a in lldb_private::Debugger::ExecuteIOHandlers
>>>> #3  0x00007ffff77d78bb in 
>>>> lldb_private::CommandInterpreter::RunCommandInterpreter
>>>> #4  0x00007ffff7a47955 in lldb::SBDebugger::RunCommandInterpreter
>>>> #5  0x0000000000409234 in Driver::MainLoop
>>>> #6  0x0000000000409572 in main
>>>>
>>>> I spent quite a while looking at this problem and the surrounding 
>>>> architecture this morning. The observed problem occurs when the 
>>>> Editline IOHandler object does not have its m_active flag 
>>>> cleared before its Run method is called again (i.e. after the 
>>>> last user command is processed). So we have to block this main 
>>>> thread until another IOHandler can be pushed (i.e. the process 
>>>> one). It is the act of pushing an IOHandler which has the effect of 
>>>> clearing the m_active flag in the previous topmost handler.
>>>>
>>>> Regarding Greg's comment:
>>>>
>>>> "not having to have every command add a call to 
>>>> process->WaitForPushedIOHandlerSync(...)."
>>>>
>>>> The problem is that only the individual "Command" object 
>>>> implementations know whether asynchronous processing will 
>>>> occur, and hence whether this "block main thread until IOHandler 
>>>> pushed" action is necessary. We may therefore need to add this 
>>>> kind of logic piecemeal. It's not great, but with my limited view 
>>>> of lldb's architecture, I can't currently think of a better way.
>>>>
>>>> Matt
>>>>
>>>>
>>>>
>>>> Shawn Best wrote:
>>>>> Thanks for the feedback Greg.  I have attached a revised patch 
>>>>> based on your suggestions.
>>>>>
>>>>> - renamed m_pushed_IOHandler to m_iohandler_sync
>>>>>
>>>>> - got rid of method ClearPushedIOHandlerSync() and make calls 
>>>>> directly to m_iohandler_sync.SetValue(false..)
>>>>>
>>>>> - renamed WaitForPushedIOHandlerSync() to SyncIOHandler()
>>>>>
>>>>> - only wait in case where m_process_input_reader != NULL
>>>>>
>>>>> I put the calls to reset the sync flag in both PrivateEventThread 
>>>>> (after it has seen a public stop) and after the call to 
>>>>> SyncIOHandler() completes.  As far as I can tell it should work 
>>>>> fine, but my preference is normally to reset the flag immediately 
>>>>> before using it instead of relying on it being set to false.  Having 
>>>>> it internal does clean up the code though.
>>>>>
>>>>> I think the last suggestion of moving the sync point to 
>>>>> Debugger::HandleProcessEvent() would defeat the purpose of the 
>>>>> patch since it is called from the EventHandler thread.  The race 
>>>>> condition is the main thread returning up the call stack to start 
>>>>> another round of commandIO handling before PushProcessIOHandler() 
>>>>> gets called.
>>>>>
>>>>>
>>>>> On Fri, Aug 1, 2014 at 10:39 AM, Greg Clayton <gclayton at apple.com 
>>>>> <mailto:gclayton at apple.com>> wrote:
>>>>>
>>>>>     Comments:
>>>>>
>>>>>     Why should anyone outside of lldb_private::Process have to call
>>>>>     ClearPushedIOHandlerSync() manually? Can we get rid of this
>>>>>     function and just have Process.cpp do it at the right times by
>>>>>     directly calling m_pushed_IOHandler.SetValue(false, 
>>>>> eBroadcastNever);?
>>>>>
>>>>>     If so then the following fixes apply:
>>>>>     1 - remove Process::ClearPushedIOHandlerSync() since it will be
>>>>>     done internally within process.
>>>>>     2 - rename m_pushed_IOHandler to m_iohandler_sync
>>>>>     3 - rename WaitForPushedIOHandlerSync() to SyncIOHandler
>>>>>
>>>>>     Other general fixes:
>>>>>     1 - the WaitForPushedIOHandlerSync should do nothing if there is
>>>>>     no process IOHandler (no stdio, or we attached to a process). This
>>>>>     can be done by testing m_process_input_reader with "if
>>>>>     (m_process_input_reader) { ... }"
>>>>>
>>>>>     I would also like to fix this sync issue without having to have
>>>>>     every command add a call to
>>>>>     process->WaitForPushedIOHandlerSync(...). Can't we sync this in
>>>>>     Debugger::HandleProcessEvent()?
>>>>>
>>>>>     > On Jul 31, 2014, at 2:57 PM, Shawn Best <sbest at blueshiftinc.com
>>>>>     <mailto:sbest at blueshiftinc.com>> wrote:
>>>>>     >
>>>>>     > oops, with the attachment this time.
>>>>>     >
>>>>>     >
>>>>>     > On Thu, Jul 31, 2014 at 2:56 PM, Shawn Best
>>>>>     <sbest at blueshiftinc.com <mailto:sbest at blueshiftinc.com>> wrote:
>>>>>     > Thanks everyone for the feedback.  I have reworked the patch to
>>>>>     use Predicate <bool>, it reads much cleaner now.
>>>>>     >
>>>>>     > Shawn.
>>>>>     >
>>>>>     > On Wed, Jul 30, 2014 at 6:34 PM, Greg Clayton
>>>>>     <gclayton at apple.com <mailto:gclayton at apple.com>> wrote:
>>>>>     > You will want to use a Predicate<bool> here instead of what you
>>>>>     have, since it is exactly what we use a predicate for. The
>>>>>     following:
>>>>>     >
>>>>>     > +    bool  m_process_running_sync;         // used with
>>>>>     WaitForProcessRunning() synchronization
>>>>>     > +    std::condition_variable m_condition_process_running;    //
>>>>>     used with WaitForProcessRunning() synchronization
>>>>>     > +    std::mutex  m_mutex_process_running; // used with
>>>>>     WaitForProcessRunning() synchronization
>>>>>     >
>>>>>     > Is exactly what the Predicate class does: protect a value 
>>>>> with a
>>>>>     mutex and condition.
>>>>>     >
>>>>>     > The above code should be replaced with:
>>>>>     >
>>>>>     >      Predicate<bool> m_process_running_sync;
>>>>>     >
>>>>>     > The API on Predicate should do what you want. See the header
>>>>>     file at "lldb/Host/Predicate.h" and also look for other places
>>>>>     that use this class to wait for a value to be equal to another
>>>>>     value, or wait for a value to not be equal to something.
>>>>>     >
>>>>>     > Let me know when you have a patch that uses Predicate and we
>>>>>     will look at that.
>>>>>     >
>>>>>     > Greg
>>>>>     >
>>>>>     > > On Jul 30, 2014, at 4:03 PM, Shawn Best
>>>>>     <sbest at blueshiftinc.com <mailto:sbest at blueshiftinc.com>> wrote:
>>>>>     > >
>>>>>     > > I have reworked the patch to use std::condition_variable.
>>>>>     This particular sync mechanism was new to me; I hope I used it
>>>>>     correctly.  Is it portable across all target platforms/compilers?
>>>>>     I tested on Linux and OSX.
>>>>>     > >
>>>>>     > > The timeout is pretty small (1ms) but seems ample based on 
>>>>> the
>>>>>     measurements I made.
>>>>>     > >
>>>>>     > >
>>>>>     > > On Tue, Jul 29, 2014 at 9:58 PM, Matthew Gardiner
>>>>>     <mg11 at csr.com <mailto:mg11 at csr.com>> wrote:
>>>>>     > > Cool, let us know how you get on!
>>>>>     > > Matt
>>>>>     > >
>>>>>     > > Shawn Best wrote:
>>>>>     > > Thanks for the feedback guys.
>>>>>     > >
>>>>>     > > Studying the code, I had figured going with a straight int
>>>>>     would in practice be most efficient and not run into
>>>>>     multi-threaded problems, even if initially appearing a bit risky.
>>>>>      I will rework it to use a std::condition_variable. That will be
>>>>>     more robust and readable.
>>>>>     > >
>>>>>     > > Shawn.
>>>>>     > >
>>>>>     > > On 7/29/2014 10:53 AM, Zachary Turner wrote:
>>>>>     > > Even better would be an std::condition_variable
>>>>>     > >
>>>>>     > >
>>>>>     > > On Mon, Jul 28, 2014 at 10:30 PM, Matthew Gardiner
>>>>>     <mg11 at csr.com <mailto:mg11 at csr.com> <mailto:mg11 at csr.com
>>>>>     <mailto:mg11 at csr.com>>> wrote:
>>>>>     > >
>>>>>     > >     Hi Shawn,
>>>>>     > >
>>>>>     > >     I use 64-bit linux and I see this issue a lot. It usually
>>>>>     > >     manifests itself as the prompt just not being printed (or
>>>>>     perhaps
>>>>>     > >     it just gets overwritten) - regardless - I invoke a
>>>>>     command, and
>>>>>     > >     I don't see an (lldb) prompt when I should. So I'm well
>>>>>     pleased
>>>>>     > >     that you are looking at this!
>>>>>     > >
>>>>>     > >     Would it not be more robust to use a semaphore than 
>>>>> usleep to
>>>>>     > >     synchronise the problematic threads?
>>>>>     > >
>>>>>     > >     Although I've not looked too deeply into this particular
>>>>>     issue,
>>>>>     > >     whenever I've seen similar races, I found that it's 
>>>>> almost
>>>>>     > >     impossible to pick the right value when using a sleep
>>>>>     command. A
>>>>>     > >     semaphore, though, should always ensure the waiting 
>>>>> thread
>>>>>     will
>>>>>     > >     wake precisely.
>>>>>     > >
>>>>>     > >     I'd be happy to help to test such a fix.
>>>>>     > >
>>>>>     > >     Matt
>>>>>     > >
>>>>>     > >
>>>>>     > >     Shawn Best wrote:
>>>>>     > >
>>>>>     > >         Hi,
>>>>>     > >
>>>>>     > >         I have attached a patch which addresses 3 related race
>>>>>     > >         conditions that cause the command line (lldb) prompt
>>>>>     > >         to get displayed inappropriately and make it appear it
>>>>>     > >         is not working correctly.  This issue can be seen on
>>>>>     > >         Linux and FreeBSD.  I can also artificially induce the
>>>>>     > >         problem on OSX.
>>>>>     > >
>>>>>     > >         The issue happens when the command handler (in the 
>>>>> main
>>>>>     > >         thread) issues a command such as run, step or 
>>>>> continue.
>>>>>     > >          After the command finishes initiating its action, it
>>>>>     returns
>>>>>     > >         up the call stack and goes back into the main 
>>>>> command loop
>>>>>     > >         waiting for user input. Simultaneously, as the 
>>>>> inferior
>>>>>     > >         process starts up, the MonitorChildProcess thread 
>>>>> picks up
>>>>>     > >         the change and posts to the PrivateEvent thread.
>>>>>     > >          HandlePrivateEvent() then calls 
>>>>> PushProcessIOHandler()
>>>>>     which
>>>>>     > >         will disable the command IO handler and give the 
>>>>> inferior
>>>>>     > >         control of the TTY.  To observe this on OSX, put a
>>>>>     > >
>>>>>     > >         usleep(100);
>>>>>     > >
>>>>>     > >         immediately prior the PushProcessIOHandler() in
>>>>>     > >         HandlePrivateEvent.
>>>>>     > >
>>>>>     > >
>>>>>     > >         My proposed solution is that after a 'run', 
>>>>> 'step', or
>>>>>     > >         'continue' command, insert a synchronization point 
>>>>> and
>>>>>     wait
>>>>>     > >         until HandlePrivateEvent knows the inferior 
>>>>> process is
>>>>>     > >         running and has pushed the IO handler. One context 
>>>>> switch
>>>>>     > >         (<100us) is usually all the time it takes on my
>>>>>     machine.  As
>>>>>     > >         an additional safety, I have a timeout (currently 
>>>>> 1ms)
>>>>>     so it
>>>>>     > >         will never hang the main thread.
>>>>>     > >
>>>>>     > >         Any thoughts, or suggestions would be appreciated.
>>>>>     > >
>>>>>     > >         Regards,
>>>>>     > >         Shawn.
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > > _______________________________________________
>>>>>     > >         lldb-commits mailing list
>>>>>     > > lldb-commits at cs.uiuc.edu <mailto:lldb-commits at cs.uiuc.edu>
>>>>>     <mailto:lldb-commits at cs.uiuc.edu 
>>>>> <mailto:lldb-commits at cs.uiuc.edu>>
>>>>>     > >
>>>>>     > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>>     > >
>>>>> <sbest_iohandler_race_rev_04.diff>
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > <sbest_iohandler_race_rev_05.diff>
>>>>>
>>>>>
>>> <forever.c>
>



