[Lldb-commits] [lldb] 9febd1e - Fix race condition when launching and attaching.

Thu Feb 17 13:18:57 PST 2022

Author: Greg Clayton
Date: 2022-02-17T13:18:49-08:00
New Revision: 9febd1e573fb8b3d1de5844b7bfd33eb998f0106

URL: https://github.com/llvm/llvm-project/commit/9febd1e573fb8b3d1de5844b7bfd33eb998f0106
DIFF: https://github.com/llvm/llvm-project/commit/9febd1e573fb8b3d1de5844b7bfd33eb998f0106.diff

LOG: Fix race condition when launching and attaching.

We discovered that when using "launchCommands" or "attachCommands" that there was an issue where these commands were not being run synchronously. There were further problems in this case where we would get thread events for the process that was just launched or attached before the IDE was ready, which is after "configurationDone" was sent to lldb-vscode.

This fix introduces the ability to wait for the process to stop after the run or attach to ensure that we have a stopped process at the entry point that is ready for the debug session to proceed. This also allows us to run the normal launch or attach without needing to play with the async flag the debugger. We spin up the thread that listens for process events before we start the launch or attach, but we stop the first eStateStopped (with stop ID of zero) event from being delivered through the DAP protocol because the "configurationDone" request handler will deliver it manually as the IDE expects a stop after configuration done. The request_configurationDone will also only deliver the stop packet if the "stopOnEntry" is False in the launch configuration.

Also added a new "timeout" to the launch and attach launch configuration arguments that can be set and defaults to 30 seconds. Since we now poll to detect when the process is stopped, we need a timeout that can be changed in case certain workflows take longer that 30 seconds to attach. If the process is not stopped by the timeout, an error will be retured for the launch or attach.

Added a flag to the vscode.py protocol classes that detects and ensures that no "stopped" events are sent prior to "configurationDone" has been sent and will raise an error if it does happen.

This should make our launching and attaching more reliable and avoid some deadlocks that were being seen (https://reviews.llvm.org/D119548).

Differential Revision: https://reviews.llvm.org/D119797

Added: 
    

Modified: 
    lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
    lldb/test/API/tools/lldb-vscode/launch/TestVSCode_launch.py
    lldb/tools/lldb-vscode/VSCode.cpp
    lldb/tools/lldb-vscode/VSCode.h
    lldb/tools/lldb-vscode/lldb-vscode.cpp
    lldb/tools/lldb-vscode/package.json

Removed: 
    


################################################################################
diff  --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
index 603b1545cd714..ae919fc2ed0c6 100644

--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
@@ -228,6 +228,9 @@ def handle_recv_packet(self, packet):
                 # 'stopped' event. We need to remember the thread stop
                 # reasons since the 'threads' command doesn't return
                 # that information.
+                if not self.configuration_done_sent:
+                    raise ValueError("'stopped' event received before "
+                                     "configuationDone packet was sent")
                 self._process_stopped()
                 tid = body['threadId']
                 self.thread_stop_reasons[tid] = body

diff  --git a/lldb/test/API/tools/lldb-vscode/launch/TestVSCode_launch.py b/lldb/test/API/tools/lldb-vscode/launch/TestVSCode_launch.py
index ff798364c9573..8c0000bdb1546 100644
--- a/lldb/test/API/tools/lldb-vscode/launch/TestVSCode_launch.py
+++ b/lldb/test/API/tools/lldb-vscode/launch/TestVSCode_launch.py
@@ -374,7 +374,7 @@ def test_commands(self):
     @skipIfRemote
     def test_extra_launch_commands(self):
         '''
-            Tests the "luanchCommands" with extra launching settings
+            Tests the "launchCommands" with extra launching settings
         '''
         self.build_and_create_debug_adaptor()
         program = self.getBuildArtifact("a.out")

diff  --git a/lldb/tools/lldb-vscode/VSCode.cpp b/lldb/tools/lldb-vscode/VSCode.cpp
index 3209eea4a897f..a6fe7f840a566 100644
--- a/lldb/tools/lldb-vscode/VSCode.cpp
+++ b/lldb/tools/lldb-vscode/VSCode.cpp
@@ -528,6 +528,58 @@ void VSCode::RegisterRequestCallback(std::string request,
   request_handlers[request] = callback;
 }
 
+lldb::SBError VSCode::WaitForProcessToStop(uint32_t seconds) {
+  // Wait for the process hit a stopped state. When running a launch (with or
+  // without "launchCommands") or attach (with or without)= "attachCommands"),
+  // the calls might take some time to stop at the entry point since the command
+  // is asynchronous. So we need to sync up with the process and make sure it is
+  // stopped before we proceed to do anything else as we will soon be asked to
+  // set breakpoints and other things that require the process to be stopped.
+  // We must use polling because attach doesn't send a process state change
+  // event for the first stop, while launching does. Since both "attachCommands"
+  // and "launchCommands" could end up using any combination of LLDB commands,
+  // we must ensure we can also catch when the process stops, so we must poll
+  // the process to make sure we handle all cases.
+
+  lldb::SBError error;
+  lldb::SBProcess process = target.GetProcess();
+  if (!process.IsValid()) {
+    error.SetErrorString("invalid process");
+    return error;
+  }
+  auto timeout_time =
+      std::chrono::high_resolution_clock::now() + std::chrono::seconds(seconds);
+  while (std::chrono::high_resolution_clock::now() < timeout_time) {
+    const auto state = process.GetState();
+    switch (state) {
+      case lldb::eStateAttaching:
+      case lldb::eStateConnected:
+      case lldb::eStateInvalid:
+      case lldb::eStateLaunching:
+      case lldb::eStateRunning:
+      case lldb::eStateStepping:
+      case lldb::eStateSuspended:
+        break;
+      case lldb::eStateDetached:
+        error.SetErrorString("process detached during launch or attach");
+        return error;
+      case lldb::eStateExited:
+        error.SetErrorString("process exited during launch or attach");
+        return error;
+      case lldb::eStateUnloaded:
+        error.SetErrorString("process unloaded during launch or attach");
+        return error;
+      case lldb::eStateCrashed:
+      case lldb::eStateStopped:
+        return lldb::SBError(); // Success!
+    }
+    std::this_thread::sleep_for(std::chrono::microseconds(250));
+  }
+  error.SetErrorStringWithFormat("process failed to stop within %u seconds",
+                                 seconds);
+  return error;
+}
+
 void Variables::Clear() {
   locals.Clear();
   globals.Clear();

diff  --git a/lldb/tools/lldb-vscode/VSCode.h b/lldb/tools/lldb-vscode/VSCode.h
index 602cf758a9a17..bc868760eb830 100644
--- a/lldb/tools/lldb-vscode/VSCode.h
+++ b/lldb/tools/lldb-vscode/VSCode.h
@@ -243,6 +243,19 @@ struct VSCode {
   /// Debuggee will continue from stopped state.
   void WillContinue() { variables.Clear(); }
 
+  /// Poll the process to wait for it to reach the eStateStopped state.
+  ///
+  /// We need to ensure the process is stopped and ready to resume before we
+  /// continue with the launch or attach. This is needed since we no longer play
+  /// with the synchronous mode in the debugger for launching (with or without
+  /// "launchCommands") or attaching (with or without "attachCommands").
+  ///
+  /// \param[in] seconds
+  ///   The number of seconds to poll the process to wait until it is stopped.
+  ///
+  /// \return Error if waiting for the process fails, no error if succeeds.
+  lldb::SBError WaitForProcessToStop(uint32_t seconds);
+
 private:
   // Send the JSON in "json_str" to the "out" stream. Correctly send the
   // "Content-Length:" field followed by the length, followed by the raw

diff  --git a/lldb/tools/lldb-vscode/lldb-vscode.cpp b/lldb/tools/lldb-vscode/lldb-vscode.cpp
index 97ec4b578cf7c..734b23afc9b28 100644
--- a/lldb/tools/lldb-vscode/lldb-vscode.cpp
+++ b/lldb/tools/lldb-vscode/lldb-vscode.cpp
@@ -449,10 +449,18 @@ void EventThreadFunction() {
           case lldb::eStateSuspended:
             break;
           case lldb::eStateStopped:
-            // Only report a stopped event if the process was not restarted.
-            if (!lldb::SBProcess::GetRestartedFromEvent(event)) {
-              SendStdOutStdErr(process);
-              SendThreadStoppedEvent();
+            // Now that we don't mess with the async setting in the debugger
+            // when launching or attaching we will get the first process stop
+            // event which we do not want to send an event for. This is because
+            // we either manually deliver the event in by calling the
+            // SendThreadStoppedEvent() from request_configuarationDone() if we
+            // want to stop on entry, or we resume from that function.
+            if (process.GetStopID() > 1) {
+              // Only report a stopped event if the process was not restarted.
+              if (!lldb::SBProcess::GetRestartedFromEvent(event)) {
+                SendStdOutStdErr(process);
+                SendThreadStoppedEvent();
+              }
             }
             break;
           case lldb::eStateRunning:
@@ -600,6 +608,7 @@ void request_attach(const llvm::json::Object &request) {
   g_vsc.terminate_commands = GetStrings(arguments, "terminateCommands");
   auto attachCommands = GetStrings(arguments, "attachCommands");
   llvm::StringRef core_file = GetString(arguments, "coreFile");
+  const uint64_t timeout_seconds = GetUnsigned(arguments, "timeout", 30);
   g_vsc.stop_at_entry =
       core_file.empty() ? GetBoolean(arguments, "stopOnEntry", false) : true;
   std::vector<std::string> postRunCommands =
@@ -640,15 +649,10 @@ void request_attach(const llvm::json::Object &request) {
   }
   if (attachCommands.empty()) {
     // No "attachCommands", just attach normally.
-    // Disable async events so the attach will be successful when we return from
-    // the launch call and the launch will happen synchronously
-    g_vsc.debugger.SetAsync(false);
     if (core_file.empty())
       g_vsc.target.Attach(attach_info, error);
     else
       g_vsc.target.LoadCore(core_file.data(), error);
-    // Reenable async events
-    g_vsc.debugger.SetAsync(true);
   } else {
     // We have "attachCommands" that are a set of commands that are expected
     // to execute the commands after which a process should be created. If there
@@ -658,6 +662,9 @@ void request_attach(const llvm::json::Object &request) {
     // selected target after these commands are run.
     g_vsc.target = g_vsc.debugger.GetSelectedTarget();
   }
+  // Make sure the process is attached and stopped before proceeding.
+  if (error.Success())
+    error = g_vsc.WaitForProcessToStop(timeout_seconds);
 
   if (error.Success() && core_file.empty()) {
     auto attached_pid = g_vsc.target.GetProcess().GetProcessID();
@@ -1652,6 +1659,7 @@ void request_launch(const llvm::json::Object &request) {
       GetStrings(arguments, "postRunCommands");
   g_vsc.stop_at_entry = GetBoolean(arguments, "stopOnEntry", false);
   const llvm::StringRef debuggerRoot = GetString(arguments, "debuggerRoot");
+  const uint64_t timeout_seconds = GetUnsigned(arguments, "timeout", 30);
 
   // This is a hack for loading DWARF in .o files on Mac where the .o files
   // in the debug map of the main executable have relative paths which require
@@ -1716,17 +1724,17 @@ void request_launch(const llvm::json::Object &request) {
     if (llvm::Error err = request_runInTerminal(request))
       error.SetErrorString(llvm::toString(std::move(err)).c_str());
   } else if (launchCommands.empty()) {
-    // Disable async events so the launch will be successful when we return from
-    // the launch call and the launch will happen synchronously
-    g_vsc.debugger.SetAsync(false);
     g_vsc.target.Launch(launch_info, error);
-    g_vsc.debugger.SetAsync(true);
   } else {
     g_vsc.RunLLDBCommands("Running launchCommands:", launchCommands);
     // The custom commands might have created a new target so we should use the
     // selected target after these commands are run.
     g_vsc.target = g_vsc.debugger.GetSelectedTarget();
   }
+  // Make sure the process is launched and stopped at the entry point before
+  // proceeding.
+  if (error.Success())
+    error = g_vsc.WaitForProcessToStop(timeout_seconds);
 
   if (error.Fail()) {
     response["success"] = llvm::json::Value(false);

diff  --git a/lldb/tools/lldb-vscode/package.json b/lldb/tools/lldb-vscode/package.json
index a5c79911f6e9f..bedc8f16ea26e 100644
--- a/lldb/tools/lldb-vscode/package.json
+++ b/lldb/tools/lldb-vscode/package.json
@@ -215,7 +215,7 @@
 							},
 							"launchCommands": {
 								"type": "array",
-								"description": "Custom commands that are executed instead of launching a process. A target will be created with the launch arguments prior to executing these commands. The commands may optionally create a new target and must perform a launch. A valid process must exist after these commands complete or the \"launch\" will fail.",
+								"description": "Custom commands that are executed instead of launching a process. A target will be created with the launch arguments prior to executing these commands. The commands may optionally create a new target and must perform a launch. A valid process must exist after these commands complete or the \"launch\" will fail. Launch the process with \"process launch -s\" to make the process to at the entry point since lldb-vscode will auto resume if necessary.",
 								"default": []
 							},
 							"stopCommands": {
@@ -232,6 +232,10 @@
 								"type": "boolean",
 								"description": "Launch the program inside an integrated terminal in the IDE. Useful for debugging interactive command line programs",
 								"default": false
+							},
+							"timeout": {
+								"type": "string",
+								"description": "The time in seconds to wait for a program to stop at entry point when launching. Defaults to 30 seconds."
 							}
 						}
 					},
@@ -307,6 +311,10 @@
 							"coreFile": {
 								"type": "string",
 								"description": "Path to the core file to debug."
+							},
+							"timeout": {
+								"type": "string",
+								"description": "The time in seconds to wait for a program to stop when attaching. Defaults to 30 seconds."
 							}
 						}
 					}