[Lldb-commits] [lldb] 769d704 - Recommit "[lldb/Core] Fix a race in the Communication class"

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Thu Apr 9 04:39:09 PDT 2020


Author: Pavel Labath
Date: 2020-04-09T13:39:00+02:00
New Revision: 769d7041cc1457dc2c5bcd040f24d530af48b76b

URL: https://github.com/llvm/llvm-project/commit/769d7041cc1457dc2c5bcd040f24d530af48b76b
DIFF: https://github.com/llvm/llvm-project/commit/769d7041cc1457dc2c5bcd040f24d530af48b76b.diff

LOG: Recommit "[lldb/Core] Fix a race in the Communication class"

The synchronization logic in the previous had a subtle bug. Moving of
the "m_read_thread_did_exit = true" into the critical section made it
possible for some threads calling SynchronizeWithReadThread call to get
stuck. This could happen if there were already past the point where they
checked this variable. In that case, they would block on waiting for the
eBroadcastBitNoMorePendingInput event, which would never come as the
read thread was blocked on getting the synchronization mutex.

The new version moves that line out of the critical section and before
the sending of the eBroadcastBitNoMorePendingInput event, and also adds
some comments to explain why the things need to be in this sequence:
- m_read_thread_did_exit = true: prevents new threads for waiting on
  events
- eBroadcastBitNoMorePendingInput: unblock any current thread waiting
  for the event
- Disconnect(): close the connection. This is the only bit that needs to
  be in the critical section, and this is to ensure that we don't close
  the connection while the synchronizing thread is mucking with it.

Original commit message follows:

Communication::SynchronizeWithReadThread is called whenever a process
stops to ensure that we process all of its stdout before we report the
stop. If the process exits, we first call this method, and then close
the connection.

However, when the child process exits, the thread reading its stdout
will usually (but not always) read an EOF because the other end of the
pty has been closed. In response to an EOF, the Communication read
thread closes it's end of the connection too.

This can result in a race where the read thread is closing the
connection while the synchronizing thread is attempting to get its
attention via Connection::InterruptRead.

The fix is to hold the synchronization mutex while closing the
connection.

I've found this issue while tracking down a rare flake in some of the
vscode tests. I am not sure this is the cause of those failures (as I
would have expected this issue to manifest itself differently), but it
is an issue nonetheless.

The attached test demonstrates the steps needed to reproduce the race.
It will fail under tsan without this patch.

Reviewers: clayborg, JDevlieghere

Subscribers: mgorny, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D77295

Added: 
    lldb/unittests/Core/CommunicationTest.cpp

Modified: 
    lldb/source/Core/Communication.cpp
    lldb/unittests/Core/CMakeLists.txt

Removed: 
    


################################################################################
diff  --git a/lldb/source/Core/Communication.cpp b/lldb/source/Core/Communication.cpp
index f4163847e4bb..c3e421191b01 100644
--- a/lldb/source/Core/Communication.cpp
+++ b/lldb/source/Core/Communication.cpp
@@ -315,16 +315,12 @@ lldb::thread_result_t Communication::ReadThread(lldb::thread_arg_t p) {
   Status error;
   ConnectionStatus status = eConnectionStatusSuccess;
   bool done = false;
+  bool disconnect = false;
   while (!done && comm->m_read_thread_enabled) {
     size_t bytes_read = comm->ReadFromConnection(
         buf, sizeof(buf), std::chrono::seconds(5), status, &error);
-    if (bytes_read > 0)
+    if (bytes_read > 0 || status == eConnectionStatusEndOfFile)
       comm->AppendBytesToCache(buf, bytes_read, true, status);
-    else if ((bytes_read == 0) && status == eConnectionStatusEndOfFile) {
-      if (comm->GetCloseOnEOF())
-        comm->Disconnect();
-      comm->AppendBytesToCache(buf, bytes_read, true, status);
-    }
 
     switch (status) {
     case eConnectionStatusSuccess:
@@ -332,11 +328,12 @@ lldb::thread_result_t Communication::ReadThread(lldb::thread_arg_t p) {
 
     case eConnectionStatusEndOfFile:
       done = true;
+      disconnect = comm->GetCloseOnEOF();
       break;
     case eConnectionStatusError: // Check GetError() for details
       if (error.GetType() == eErrorTypePOSIX && error.GetError() == EIO) {
         // EIO on a pipe is usually caused by remote shutdown
-        comm->Disconnect();
+        disconnect = comm->GetCloseOnEOF();
         done = true;
       }
       if (error.Fail())
@@ -365,9 +362,15 @@ lldb::thread_result_t Communication::ReadThread(lldb::thread_arg_t p) {
   if (log)
     LLDB_LOGF(log, "%p Communication::ReadThread () thread exiting...", p);
 
-  comm->m_read_thread_did_exit = true;
-  // Let clients know that this thread is exiting
   comm->BroadcastEvent(eBroadcastBitNoMorePendingInput);
+  {
+    std::lock_guard<std::mutex> guard(comm->m_synchronize_mutex);
+    comm->m_read_thread_did_exit = true;
+    if (disconnect)
+      comm->Disconnect();
+  }
+
+  // Let clients know that this thread is exiting
   comm->BroadcastEvent(eBroadcastBitReadThreadDidExit);
   return {};
 }

diff  --git a/lldb/unittests/Core/CMakeLists.txt b/lldb/unittests/Core/CMakeLists.txt
index 688a39d9a10f..6393c6fe38c2 100644
--- a/lldb/unittests/Core/CMakeLists.txt
+++ b/lldb/unittests/Core/CMakeLists.txt
@@ -1,4 +1,5 @@
 add_lldb_unittest(LLDBCoreTests
+  CommunicationTest.cpp
   MangledTest.cpp
   RichManglingContextTest.cpp
   StreamCallbackTest.cpp

diff  --git a/lldb/unittests/Core/CommunicationTest.cpp b/lldb/unittests/Core/CommunicationTest.cpp
new file mode 100644
index 000000000000..6ad0bc720d93
--- /dev/null
+++ b/lldb/unittests/Core/CommunicationTest.cpp
@@ -0,0 +1,35 @@
+//===-- CommunicationTest.cpp ---------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "lldb/Core/Communication.h"
+#include "lldb/Host/ConnectionFileDescriptor.h"
+#include "lldb/Host/Pipe.h"
+#include "llvm/Testing/Support/Error.h"
+#include "gtest/gtest.h"
+
+using namespace lldb_private;
+
+TEST(CommunicationTest, SynchronizeWhileClosing) {
+  // Set up a communication object reading from a pipe.
+  Pipe pipe;
+  ASSERT_THAT_ERROR(pipe.CreateNew(/*child_process_inherit=*/false).ToError(),
+                    llvm::Succeeded());
+
+  Communication comm("test");
+  comm.SetConnection(std::make_unique<ConnectionFileDescriptor>(
+      pipe.ReleaseReadFileDescriptor(), /*owns_fd=*/true));
+  comm.SetCloseOnEOF(true);
+  ASSERT_TRUE(comm.StartReadThread());
+
+  // Ensure that we can safely synchronize with the read thread while it is
+  // closing the read end (in response to us closing the write end).
+  pipe.CloseWriteFileDescriptor();
+  comm.SynchronizeWithReadThread();
+
+  ASSERT_TRUE(comm.StopReadThread());
+}


        


More information about the lldb-commits mailing list