[Lldb-commits] [lldb] r254403 - Fix race during process interruption

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Tue Dec 1 03:28:48 PST 2015


Author: labath
Date: Tue Dec  1 05:28:47 2015
New Revision: 254403

URL: http://llvm.org/viewvc/llvm-project?rev=254403&view=rev
Log:
Fix race during process interruption

Summary:
The following situation was occuring in TestAttachResume:
- we did a "continue" from a breakpoint (which involves a private start-stop to step over the
  breakpoint)
- after receiving the stop-reply from the step-over, we issue a "detach" (which requires a
  process interrupt)
- at this moment, the public state is "running", private state is "about-to-be-stopped" (the
  stopped event was broadcast, but it was not received yet)
- StopForDestroyOrDetach (public thread) notes the public state is running, sends an interrupt
  request to the private thread
- private thread gets the eBroadcastBitInterrupt (before the eStateStopped message), and asks the
  process plugin to stop (via Halt())
- process plugin says it has nothing to do as the process is already stopped
- private thread shrugs and carries on. receives the stop event, restores the breakpoint and
  resumes the process.
- after a while, the public thread times out and says it failed to stop the process

This patch does the following:
- splits Halt() into two functions, private and public, their usage depends on the context
  - public Halt(): sends eBroadcastBitInterrupt to the private thread and waits for the Stop
    event
  - HaltPrivate(): asks the plugin to stop and makes a note that the halt was requested. When the
    next stop event comes it sets the interrupt flag on it.
- removes HijackPrivateProcessEvents(), as the only user (old Halt()) has gone away
- removes the m_currently_handling_event hack, as the new Halt() does not need it
- adds a use_run_lock parameter to public Halt() and WaitForProcessToStop(). This was needed
  because RunThreadPlan uses Halt() while holding the run lock and we don't want Halt() to take
  it away from him.

Reviewers: clayborg, jingham

Subscribers: lldb-commits

Differential Revision: http://reviews.llvm.org/D14989

Modified:
    lldb/trunk/include/lldb/Target/Process.h
    lldb/trunk/packages/Python/lldbsuite/test/expression_command/timeout/TestCallWithTimeout.py
    lldb/trunk/packages/Python/lldbsuite/test/functionalities/attach_resume/TestAttachResume.py
    lldb/trunk/source/Target/Process.cpp

Modified: lldb/trunk/include/lldb/Target/Process.h
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/include/lldb/Target/Process.h?rev=254403&r1=254402&r2=254403&view=diff
==============================================================================
--- lldb/trunk/include/lldb/Target/Process.h (original)
+++ lldb/trunk/include/lldb/Target/Process.h Tue Dec  1 05:28:47 2015
@@ -1355,6 +1355,7 @@ public:
 
     Error
     ResumeSynchronous (Stream *stream);
+
     //------------------------------------------------------------------
     /// Halts a running process.
     ///
@@ -1367,12 +1368,15 @@ public:
     /// @param[in] clear_thread_plans
     ///     If true, when the process stops, clear all thread plans.
     ///
+    /// @param[in] use_run_lock
+    ///     Whether to release the run lock after the stop.
+    ///
     /// @return
     ///     Returns an error object.  If the error is empty, the process is halted.
     ///     otherwise the halt has failed.
     //------------------------------------------------------------------
     Error
-    Halt (bool clear_thread_plans = false);
+    Halt (bool clear_thread_plans = false, bool use_run_lock = true);
 
     //------------------------------------------------------------------
     /// Detaches from a running or stopped process.
@@ -1683,9 +1687,8 @@ public:
     /// DoHalt must produce one and only one stop StateChanged event if it actually
     /// stops the process.  If the stop happens through some natural event (for
     /// instance a SIGSTOP), then forwarding that event will do.  Otherwise, you must 
-    /// generate the event manually.  Note also, the private event thread is stopped when 
-    /// DoHalt is run to prevent the events generated while halting to trigger
-    /// other state changes before the halt is complete.
+    /// generate the event manually. This function is called from the context of the
+    /// private state thread.
     ///
     /// @param[out] caused_stop
     ///     If true, then this Halt caused the stop, otherwise, the 
@@ -2861,12 +2864,16 @@ public:
     // Returns the process state when it is stopped. If specified, event_sp_ptr
     // is set to the event which triggered the stop. If wait_always = false,
     // and the process is already stopped, this function returns immediately.
+    // If the process is hijacked and use_run_lock is true (the default), then this
+    // function releases the run lock after the stop. Setting use_run_lock to false
+    // will avoid this behavior.
     lldb::StateType
     WaitForProcessToStop(const TimeValue *timeout,
                          lldb::EventSP *event_sp_ptr = nullptr,
                          bool wait_always = true,
                          Listener *hijack_listener = nullptr,
-                         Stream *stream = nullptr);
+                         Stream *stream = nullptr,
+                         bool use_run_lock = true);
 
     uint32_t
     GetIOHandlerID () const
@@ -3281,12 +3288,6 @@ protected:
         std::string m_exit_string;
     };
 
-    bool 
-    HijackPrivateProcessEvents (Listener *listener);
-    
-    void 
-    RestorePrivateProcessEvents ();
-    
     bool
     PrivateStateThreadIsValid () const
     {
@@ -3372,7 +3373,6 @@ protected:
     std::vector<PreResumeCallbackAndBaton> m_pre_resume_actions;
     ProcessRunLock              m_public_run_lock;
     ProcessRunLock              m_private_run_lock;
-    Predicate<bool>             m_currently_handling_event; // This predicate is set in HandlePrivateEvent while all its business is being done.
     ArchSpec::StopInfoOverrideCallbackType m_stop_info_override_callback;
     bool                        m_currently_handling_do_on_removals;
     bool                        m_resume_requested;         // If m_currently_handling_event or m_currently_handling_do_on_removals are true, Resume will only request a resume, using this flag to check.
@@ -3436,6 +3436,9 @@ protected:
     void
     HandlePrivateEvent (lldb::EventSP &event_sp);
 
+    Error
+    HaltPrivate();
+
     lldb::StateType
     WaitForProcessStopPrivate (const TimeValue *timeout, lldb::EventSP &event_sp);
 

Modified: lldb/trunk/packages/Python/lldbsuite/test/expression_command/timeout/TestCallWithTimeout.py
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/packages/Python/lldbsuite/test/expression_command/timeout/TestCallWithTimeout.py?rev=254403&r1=254402&r2=254403&view=diff
==============================================================================
--- lldb/trunk/packages/Python/lldbsuite/test/expression_command/timeout/TestCallWithTimeout.py (original)
+++ lldb/trunk/packages/Python/lldbsuite/test/expression_command/timeout/TestCallWithTimeout.py Tue Dec  1 05:28:47 2015
@@ -57,7 +57,7 @@ class ExprCommandWithTimeoutsTestCase(Te
 
         frame = thread.GetFrameAtIndex(0)
         
-        value = frame.EvaluateExpression ("wait_a_while (50000)", options)
+        value = frame.EvaluateExpression ("wait_a_while (200000)", options)
         self.assertTrue (value.IsValid())
         self.assertFalse (value.GetError().Success())
 
@@ -65,7 +65,7 @@ class ExprCommandWithTimeoutsTestCase(Te
         interp = self.dbg.GetCommandInterpreter()
 
         result = lldb.SBCommandReturnObject()
-        return_value = interp.HandleCommand ("expr -t 100 -u true -- wait_a_while(50000)", result)
+        return_value = interp.HandleCommand ("expr -t 100 -u true -- wait_a_while(200000)", result)
         self.assertTrue (return_value == lldb.eReturnStatusFailed)
 
         # Okay, now do it again with long enough time outs:

Modified: lldb/trunk/packages/Python/lldbsuite/test/functionalities/attach_resume/TestAttachResume.py
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/packages/Python/lldbsuite/test/functionalities/attach_resume/TestAttachResume.py?rev=254403&r1=254402&r2=254403&view=diff
==============================================================================
--- lldb/trunk/packages/Python/lldbsuite/test/functionalities/attach_resume/TestAttachResume.py (original)
+++ lldb/trunk/packages/Python/lldbsuite/test/functionalities/attach_resume/TestAttachResume.py Tue Dec  1 05:28:47 2015
@@ -20,7 +20,6 @@ class AttachResumeTestCase(TestBase):
     @skipIfRemote
     @expectedFailureFreeBSD('llvm.org/pr19310')
     @expectedFailureWindows("llvm.org/pr24778")
-    @expectedFlakeyLinux('llvm.org/pr19310')
     def test_attach_continue_interrupt_detach(self):
         """Test attach/continue/interrupt/detach"""
         self.build()
@@ -52,6 +51,9 @@ class AttachResumeTestCase(TestBase):
         self.runCmd("process interrupt")
         lldbutil.expect_state_changes(self, listener, [lldb.eStateStopped])
 
+        # Second interrupt should have no effect.
+        self.expect("process interrupt", patterns=["Process is not running"], error=True)
+
         # check that this breakpoint is auto-cleared on detach (r204752)
         self.runCmd("br set -f main.cpp -l %u" % (line_number('main.cpp', '// Set breakpoint here')))
 

Modified: lldb/trunk/source/Target/Process.cpp
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/source/Target/Process.cpp?rev=254403&r1=254402&r2=254403&view=diff
==============================================================================
--- lldb/trunk/source/Target/Process.cpp (original)
+++ lldb/trunk/source/Target/Process.cpp Tue Dec  1 05:28:47 2015
@@ -759,7 +759,6 @@ Process::Process(lldb::TargetSP target_s
     m_next_event_action_ap(),
     m_public_run_lock (),
     m_private_run_lock (),
-    m_currently_handling_event(false),
     m_stop_info_override_callback (NULL),
     m_finalizing (false),
     m_finalize_called (false),
@@ -992,7 +991,8 @@ Process::WaitForProcessToStop (const Tim
                                EventSP *event_sp_ptr,
                                bool wait_always,
                                Listener *hijack_listener,
-                               Stream *stream)
+                               Stream *stream,
+                               bool use_run_lock)
 {
     // We can't just wait for a "stopped" event, because the stopped event may have restarted the target.
     // We have to actually check each event, and in the case of a stopped event check the restarted flag
@@ -1019,7 +1019,7 @@ Process::WaitForProcessToStop (const Tim
                         __FUNCTION__);
         // We need to toggle the run lock as this won't get done in
         // SetPublicState() if the process is hijacked.
-        if (hijack_listener)
+        if (hijack_listener && use_run_lock)
             m_public_run_lock.SetStopped();
         return state;
     }
@@ -1042,7 +1042,7 @@ Process::WaitForProcessToStop (const Tim
         case eStateUnloaded:
             // We need to toggle the run lock as this won't get done in
             // SetPublicState() if the process is hijacked.
-            if (hijack_listener)
+            if (hijack_listener && use_run_lock)
                 m_public_run_lock.SetStopped();
             return state;
         case eStateStopped:
@@ -1052,7 +1052,7 @@ Process::WaitForProcessToStop (const Tim
             {
                 // We need to toggle the run lock as this won't get done in
                 // SetPublicState() if the process is hijacked.
-                if (hijack_listener)
+                if (hijack_listener && use_run_lock)
                     m_public_run_lock.SetStopped();
                 return state;
             }
@@ -1308,23 +1308,6 @@ Process::RestoreProcessEvents ()
     RestoreBroadcaster();
 }
 
-bool
-Process::HijackPrivateProcessEvents (Listener *listener)
-{
-    if (listener != NULL)
-    {
-        return m_private_state_broadcaster.HijackBroadcaster(listener, eBroadcastBitStateChanged | eBroadcastBitInterrupt);
-    }
-    else
-        return false;
-}
-
-void
-Process::RestorePrivateProcessEvents ()
-{
-    m_private_state_broadcaster.RestoreBroadcaster();
-}
-
 StateType
 Process::WaitForStateChangedEvents (const TimeValue *timeout, EventSP &event_sp, Listener *hijack_listener)
 {
@@ -3831,101 +3814,50 @@ Process::PrivateResume ()
 }
 
 Error
-Process::Halt (bool clear_thread_plans)
+Process::Halt (bool clear_thread_plans, bool use_run_lock)
 {
+    if (! StateIsRunningState(m_public_state.GetValue()))
+        return Error("Process is not running.");
+
     // Don't clear the m_clear_thread_plans_on_stop, only set it to true if
     // in case it was already set and some thread plan logic calls halt on its
     // own.
     m_clear_thread_plans_on_stop |= clear_thread_plans;
     
-    // First make sure we aren't in the middle of handling an event, or we might restart.  This is pretty weak, since
-    // we could just straightaway get another event.  It just narrows the window...
-    m_currently_handling_event.WaitForValueEqualTo(false);
-
-    
-    // Pause our private state thread so we can ensure no one else eats
-    // the stop event out from under us.
     Listener halt_listener ("lldb.process.halt_listener");
-    HijackPrivateProcessEvents(&halt_listener);
+    HijackProcessEvents(&halt_listener);
 
     EventSP event_sp;
-    Error error (WillHalt());
     
-    bool restored_process_events = false;
-    if (error.Success())
+    SendAsyncInterrupt();
+
+    if (m_public_state.GetValue() == eStateAttaching)
     {
-        
-        bool caused_stop = false;
-        
-        // Ask the process subclass to actually halt our process
-        error = DoHalt(caused_stop);
-        if (error.Success())
-        {
-            if (m_public_state.GetValue() == eStateAttaching)
-            {
-                // Don't hijack and eat the eStateExited as the code that was doing
-                // the attach will be waiting for this event...
-                RestorePrivateProcessEvents();
-                restored_process_events = true;
-                SetExitStatus(SIGKILL, "Cancelled async attach.");
-                Destroy (false);
-            }
-            else
-            {
-                // If "caused_stop" is true, then DoHalt stopped the process. If
-                // "caused_stop" is false, the process was already stopped.
-                // If the DoHalt caused the process to stop, then we want to catch
-                // this event and set the interrupted bool to true before we pass
-                // this along so clients know that the process was interrupted by
-                // a halt command.
-                if (caused_stop)
-                {
-                    // Wait for 1 second for the process to stop.
-                    TimeValue timeout_time;
-                    timeout_time = TimeValue::Now();
-                    timeout_time.OffsetWithSeconds(10);
-                    bool got_event = halt_listener.WaitForEvent (&timeout_time, event_sp);
-                    StateType state = ProcessEventData::GetStateFromEvent(event_sp.get());
-                    
-                    if (!got_event || state == eStateInvalid)
-                    {
-                        // We timeout out and didn't get a stop event...
-                        error.SetErrorStringWithFormat ("Halt timed out. State = %s", StateAsCString(GetState()));
-                    }
-                    else
-                    {
-                        if (StateIsStoppedState (state, false))
-                        {
-                            // We caused the process to interrupt itself, so mark this
-                            // as such in the stop event so clients can tell an interrupted
-                            // process from a natural stop
-                            ProcessEventData::SetInterruptedInEvent (event_sp.get(), true);
-                        }
-                        else
-                        {
-                            Log *log(lldb_private::GetLogIfAllCategoriesSet (LIBLLDB_LOG_PROCESS));
-                            if (log)
-                                log->Printf("Process::Halt() failed to stop, state is: %s", StateAsCString(state));
-                            error.SetErrorString ("Did not get stopped event after halt.");
-                        }
-                    }
-                }
-                DidHalt();
-            }
-        }
+        // Don't hijack and eat the eStateExited as the code that was doing
+        // the attach will be waiting for this event...
+        RestoreProcessEvents();
+        SetExitStatus(SIGKILL, "Cancelled async attach.");
+        Destroy (false);
+        return Error();
     }
-    // Resume our private state thread before we post the event (if any)
-    if (!restored_process_events)
-        RestorePrivateProcessEvents();
 
-    // Post any event we might have consumed. If all goes well, we will have
-    // stopped the process, intercepted the event and set the interrupted
-    // bool in the event.  Post it to the private event queue and that will end up
-    // correctly setting the state.
-    if (event_sp)
-        m_private_state_broadcaster.BroadcastEvent(event_sp);
+    // Wait for 10 second for the process to stop.
+    TimeValue timeout_time;
+    timeout_time = TimeValue::Now();
+    timeout_time.OffsetWithSeconds(10);
+    StateType state = WaitForProcessToStop(&timeout_time, &event_sp, true, &halt_listener,
+                                           nullptr, use_run_lock);
+    RestoreProcessEvents();
 
-    return error;
+    if (state == eStateInvalid || ! event_sp)
+    {
+        // We timed out and didn't get a stop event...
+        return Error("Halt timed out. State = %s", StateAsCString(GetState()));
+    }
+
+    BroadcastEvent(event_sp);
+
+    return Error();
 }
 
 Error
@@ -4461,8 +4393,6 @@ Process::HandlePrivateEvent (EventSP &ev
     Log *log(lldb_private::GetLogIfAllCategoriesSet (LIBLLDB_LOG_PROCESS));
     m_resume_requested = false;
     
-    m_currently_handling_event.SetValue(true, eBroadcastNever);
-    
     const StateType new_state = Process::ProcessEventData::GetStateFromEvent(event_sp.get());
     
     // First check to see if anybody wants a shot at this event:
@@ -4489,7 +4419,6 @@ Process::HandlePrivateEvent (EventSP &ev
                 {
                     // FIXME: should cons up an exited event, and discard this one.
                     SetExitStatus(0, m_next_event_action_ap->GetExitString());
-                    m_currently_handling_event.SetValue(false, eBroadcastAlways);
                     SetNextEventAction(NULL);
                     return;
                 }
@@ -4579,7 +4508,22 @@ Process::HandlePrivateEvent (EventSP &ev
                          StateAsCString (GetState ()));
         }
     }
-    m_currently_handling_event.SetValue(false, eBroadcastAlways);
+}
+
+Error
+Process::HaltPrivate()
+{
+    EventSP event_sp;
+    Error error (WillHalt());
+    if (error.Fail())
+        return error;
+
+    // Ask the process subclass to actually halt our process
+    bool caused_stop;
+    error = DoHalt(caused_stop);
+
+    DidHalt();
+    return error;
 }
 
 thread_result_t
@@ -4602,6 +4546,7 @@ Process::RunPrivateStateThread (bool is_
                      __FUNCTION__, static_cast<void*>(this), GetID());
 
     bool exit_now = false;
+    bool interrupt_requested = false;
     while (!exit_now)
     {
         EventSP event_sp;
@@ -4647,7 +4592,16 @@ Process::RunPrivateStateThread (bool is_
                     log->Printf ("Process::%s (arg = %p, pid = %" PRIu64 ") woke up with an interrupt - Halting.",
                                  __FUNCTION__, static_cast<void*>(this),
                                  GetID());
-                Halt();
+                Error error = HaltPrivate();
+                if (error.Fail() && log)
+                    log->Printf ("Process::%s (arg = %p, pid = %" PRIu64 ") failed to halt the process: %s",
+                                 __FUNCTION__, static_cast<void*>(this),
+                                 GetID(), error.AsCString());
+                // Halt should generate a stopped event. Make a note of the fact that we were
+                // doing the interrupt, so we can set the interrupted flag after we receive the
+                // event. We deliberately set this to true even if HaltPrivate failed, so that we
+                // can interrupt on the next natural stop.
+                interrupt_requested = true;
             }
             continue;
         }
@@ -4662,6 +4616,23 @@ Process::RunPrivateStateThread (bool is_
                 m_clear_thread_plans_on_stop = false;
                 m_thread_list.DiscardThreadPlans();
             }
+
+            if (interrupt_requested)
+            {
+                if (StateIsStoppedState (internal_state, true))
+                {
+                    // We requested the interrupt, so mark this as such in the stop event so
+                    // clients can tell an interrupted process from a natural stop
+                    ProcessEventData::SetInterruptedInEvent (event_sp.get(), true);
+                    interrupt_requested = false;
+                }
+                else if (log)
+                {
+                    log->Printf("Process::%s interrupt_requested, but a non-stopped state '%s' received.",
+                            __FUNCTION__, StateAsCString(internal_state));
+                }
+            }
+
             HandlePrivateEvent (event_sp);
         }
 
@@ -5744,7 +5715,9 @@ Process::RunThreadPlan (ExecutionContext
                     {
                         // This is probably an overabundance of caution, I don't think I should ever get a stopped & restarted
                         // event here.  But if I do, the best thing is to Halt and then get out of here.
-                        Halt();
+                        const bool clear_thread_plans = false;
+                        const bool use_run_lock = false;
+                        Halt(clear_thread_plans, use_run_lock);
                     }
 
                     errors.Printf("Didn't get running event after initial resume, got %s instead.",
@@ -5838,7 +5811,9 @@ Process::RunThreadPlan (ExecutionContext
                     bool keep_going = false;
                     if (event_sp->GetType() == eBroadcastBitInterrupt)
                     {
-                        Halt();
+                        const bool clear_thread_plans = false;
+                        const bool use_run_lock = false;
+                        Halt(clear_thread_plans, use_run_lock);
                         return_value = eExpressionInterrupted;
                         errors.Printf ("Execution halted by user interrupt.");
                         if (log)
@@ -6015,7 +5990,9 @@ Process::RunThreadPlan (ExecutionContext
                     {
                         if (log)
                             log->Printf ("Process::RunThreadPlan(): Running Halt.");
-                        halt_error = Halt();
+                        const bool clear_thread_plans = false;
+                        const bool use_run_lock = false;
+                        Halt(clear_thread_plans, use_run_lock);
                     }
                     if (halt_error.Success())
                     {




More information about the lldb-commits mailing list