[compiler-rt] [fuzzer][Fuchsia] Prevent deadlock from suspending threads (PR #154854)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 21 16:00:35 PDT 2025
https://github.com/PiJoules created https://github.com/llvm/llvm-project/pull/154854
Every once in a couple hundred runs of a downstream fuzzer test, we see a fuzzing test freeze while waiting for a thread to be suspended. The main thread is frozen because it's waiting to suspend either the alarm or rss thread which is stuck waiting for an exception they sent out to be handled. Specifically, both threads send out a synthetic `ZX_EXCP_THREAD_STARTING` exception to be handled by the crash handling thread which sets up an exception channel on the whole process with `ZX_EXCEPTION_CHANNEL_DEBUGGER`. This is the only channel type that listens to thread stop/start exceptions. Normally, the exception would be ignored and the alarm or rss thread would continue normally once the crash handling thread closes the read exception. However, the memory snapshot machinery can suspend this thread while its in the process of waiting for or handling a `ZX_EXCP_THREAD_STARTING` sent by either the rss or alarm thread. If this is suspended first, then we attempt to suspend either the alarm or rss thread while they're still waiting for the crash handling thread to handle its exception, we will freeze waiting for those threads to give the suspend signal, which they won't because they're blocked on waiting for the exception handler. This is the deadlock.
Until there's a way for the memory snapshot machinery to suspend the thread while it's stuck on an exception, then we can work around this in the meantime by just ensuring the alarm and rss threads start normally via signals on the initial startup path. I can assert locally the freezing doesn't occur after 6000 runs where prior we would see it every couple hundred runs. Note this type of issue can arise again if the fuzzing test launches any dangling threads that happen to not start yet. One of the recommendations for writing a fuzz test is that the test may launch threads, but they should be joined by the end of the test (https://llvm.org/docs/LibFuzzer.html#fuzz-target), so hopefully we won't see this type of bug rise frequently from fuzz tests. More broadly, this can also arise if any process launches its own debugger via `ZX_EXCEPTION_CHANNEL_DEBUGGER`, but I would think in practice this isn't very likely to happen.
More context in https://fxbug.dev/436923423.
>From bc090b99ad3216862e6c7935a372c7750fba70d2 Mon Sep 17 00:00:00 2001
From: Leonard Chan <leonardchan at google.com>
Date: Thu, 21 Aug 2025 22:31:55 +0000
Subject: [PATCH] [fuzzer][Fuchsia] Prevent deadlock from suspending threads
Every once in a couple hundred runs of a downstream fuzzer test, we see
a fuzzing test freeze while waiting for a thread to be suspended. The
main thread is frozen because it's waiting to suspend either the alarm
or rss thread which is stuck waiting for an exception they sent out to
be handled. Specifically, both threads send out a synthetic
`ZX_EXCP_THREAD_STARTING` exception to be handled by the crash handling
thread which sets up an exception channel on the whole process with
`ZX_EXCEPTION_CHANNEL_DEBUGGER`. This is the only channel type that
listens to thread stop/start exceptions. Normally, the exception would
be ignored and the alarm or rss thread would continue normally once the
crash handling thread closes the read exception. However, the memory
snapshot machinery can suspend this thread while its in the process of
waiting for or handling a `ZX_EXCP_THREAD_STARTING` sent by either the
rss or alarm thread. If this is suspended first, then we attempt to
suspend either the alarm or rss thread while they're still waiting for
the crash handling thread to handle its exception, we will freeze
waiting for those threads to give the suspend signal, which they won't
because they're blocked on waiting for the exception handler. This is
the deadlock.
Until there's a way for the memory snapshot machinery to suspend the
thread while it's stuck on an exception, then we can work around this in
the meantime by just ensuring the alarm and rss threads start normally
via signals on the initial startup path. I can assert locally the
freezing doesn't occur after 6000 runs where prior we would see it every
couple hundred runs. Note this type of issue can arise again if the
fuzzing test launches any dangling threads that happen to not start yet.
One of the recommendations for writing a fuzz test is that the test may
launch threads, but they should be joined by the end of the test
(https://llvm.org/docs/LibFuzzer.html#fuzz-target), so hopefully we
won't see this type of bug rise frequently from fuzz tests. More
broadly, this can also arise if any process launches its own debugger via
`ZX_EXCEPTION_CHANNEL_DEBUGGER`, but I would think in practice this
isn't very likely to happen.
More context in https://fxbug.dev/436923423.
---
compiler-rt/lib/fuzzer/FuzzerDriver.cpp | 4 ++
compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp | 60 +++++++++++++++++++-
2 files changed, 61 insertions(+), 3 deletions(-)
diff --git a/compiler-rt/lib/fuzzer/FuzzerDriver.cpp b/compiler-rt/lib/fuzzer/FuzzerDriver.cpp
index ad3a65aff80e2..b1dd908ec3663 100644
--- a/compiler-rt/lib/fuzzer/FuzzerDriver.cpp
+++ b/compiler-rt/lib/fuzzer/FuzzerDriver.cpp
@@ -306,6 +306,9 @@ static int RunInMultipleProcesses(const std::vector<std::string> &Args,
return HasErrors ? 1 : 0;
}
+// Fuchsia needs to do some book checking before starting the RssThread,
+// so it has its own implementation.
+#if !LIBFUZZER_FUCHSIA
static void RssThread(Fuzzer *F, size_t RssLimitMb) {
while (true) {
SleepSeconds(1);
@@ -321,6 +324,7 @@ static void StartRssThread(Fuzzer *F, size_t RssLimitMb) {
std::thread T(RssThread, F, RssLimitMb);
T.detach();
}
+#endif
int RunOneTest(Fuzzer *F, const char *InputFilePath, size_t MaxLen) {
Unit U = FileToVector(InputFilePath);
diff --git a/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp b/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
index 7f065c79e717c..3c3807b7bd328 100644
--- a/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
+++ b/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
@@ -68,6 +68,9 @@ void ExitOnErr(zx_status_t Status, const char *Syscall) {
}
void AlarmHandler(int Seconds) {
+ // Signal the alarm thread started.
+ ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
+ "_zx_object_signal AlarmHandler");
while (true) {
SleepSeconds(Seconds);
Fuzzer::StaticAlarmCallback();
@@ -282,6 +285,7 @@ void CrashHandler() {
Self, ZX_EXCEPTION_CHANNEL_DEBUGGER, &Channel.Handle),
"_zx_task_create_exception_channel");
+ // Signal the crash thread started.
ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
"_zx_object_signal");
@@ -385,10 +389,49 @@ void StopSignalHandler() {
_zx_handle_close(SignalHandlerEvent);
}
+void RssThread(Fuzzer *F, size_t RssLimitMb) {
+ // Signal the rss thread started.
+ //
+ // We must wait for this thread to start because we could accidentally suspend
+ // it while the crash handler is attempting to handle the
+ // ZX_EXCP_THREAD_STARTING exception. If the crash handler is suspended by the
+ // lsan machinery, then there's no way for this thread to indicate it's
+ // suspended because it's blocked on waiting for the exception to be handled.
+ ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
+ "_zx_object_signal rss");
+ while (true) {
+ SleepSeconds(1);
+ size_t Peak = GetPeakRSSMb();
+ if (Peak > RssLimitMb)
+ F->RssLimitCallback();
+ }
+}
+
} // namespace
+void StartRssThread(Fuzzer *F, size_t RssLimitMb) {
+ // Set up the crash handler and wait until it is ready before proceeding.
+ assert(SignalHandlerEvent == ZX_HANDLE_INVALID);
+ ExitOnErr(_zx_event_create(0, &SignalHandlerEvent), "_zx_event_create");
+
+ if (!RssLimitMb)
+ return;
+ std::thread T(RssThread, F, RssLimitMb);
+ T.detach();
+
+ // Wait for the rss thread to start.
+ ExitOnErr(_zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
+ ZX_TIME_INFINITE, nullptr),
+ "_zx_object_wait_one rss");
+ ExitOnErr(_zx_object_signal(SignalHandlerEvent, ZX_USER_SIGNAL_0, 0),
+ "_zx_object_signal rss clear");
+}
+
// Platform specific functions.
void SetSignalHandler(const FuzzingOptions &Options) {
+ assert(SignalHandlerEvent != ZX_HANDLE_INVALID &&
+ "This should've been setup by StartRssThread.");
+
// Make sure information from libFuzzer and the sanitizers are easy to
// reassemble. `__sanitizer_log_write` has the added benefit of ensuring the
// DSO map is always available for the symbolizer.
@@ -404,6 +447,20 @@ void SetSignalHandler(const FuzzingOptions &Options) {
if (Options.HandleAlrm && Options.UnitTimeoutSec > 0) {
std::thread T(AlarmHandler, Options.UnitTimeoutSec / 2 + 1);
T.detach();
+
+ // Wait for the alarm thread to start.
+ //
+ // We must wait for this thread to start because we could accidentally
+ // suspend it while the crash handler is attempting to handle the
+ // ZX_EXCP_THREAD_STARTING exception. If the crash handler is suspended by
+ // the lsan machinery, then there's no way for this thread to indicate it's
+ // suspended because it's blocked on waiting for the exception to be
+ // handled.
+ ExitOnErr(_zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
+ ZX_TIME_INFINITE, nullptr),
+ "_zx_object_wait_one alarm");
+ ExitOnErr(_zx_object_signal(SignalHandlerEvent, ZX_USER_SIGNAL_0, 0),
+ "_zx_object_signal alarm clear");
}
// Options.HandleInt and Options.HandleTerm are not supported on Fuchsia
@@ -413,9 +470,6 @@ void SetSignalHandler(const FuzzingOptions &Options) {
!Options.HandleFpe && !Options.HandleAbrt && !Options.HandleTrap)
return;
- // Set up the crash handler and wait until it is ready before proceeding.
- ExitOnErr(_zx_event_create(0, &SignalHandlerEvent), "_zx_event_create");
-
SignalHandler = std::thread(CrashHandler);
zx_status_t Status = _zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
ZX_TIME_INFINITE, nullptr);
More information about the llvm-commits
mailing list