[clang] [compiler-rt] [llvm] [tsan] Add simulation to TSAN (PR #183200)

Chris Cotter via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 15 08:21:01 PDT 2026


https://github.com/ccotter updated https://github.com/llvm/llvm-project/pull/183200

>From 9b9f5a3d85c7fa7866af8bbfa87a9dd122974349 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Tue, 24 Feb 2026 17:13:26 +0000
Subject: [PATCH 01/13] [tsan] Add simulation to TSAN

This change adds simulation support to TSAN, allowing more complete
exploration of thread interleavings. This feature follows the same idea
as https://github.com/dvyukov/relacy, except that it is implemented on
top of TSAN using real threads and TSAN's race detection logic.

TSAN exposes simulation via a new public API, __tsan_simulate. If
simulate_scheduler=random is specified in TSAN_OPTIONS, calls to
__tsan_simulate will execute the provided callback many times. The
simulation only allows one thread to actually execute at a time. At each
"scheduler point" (atomic op, mutex/cv op, thread lifecycle op), the
scheduler will check if another thread should be run. If so, then the
running thread parks itself and wakes the newly selected thread.
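
As a rough illustration of the parking scheme described above, here is
a toy model in plain C with pthreads. It is not the code in
tsan_simulate.cpp. The names schedule_point, pick_next, wait_for_turn,
and run_simulation are hypothetical, and the small LCG is an assumed
stand-in for whatever random source the real scheduler uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 3
#define STEPS 5

static pthread_mutex_t sched_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sched_cv = PTHREAD_COND_INITIALIZER;
static int current;            /* index of the only thread allowed to run */
static int runnable[NTHREADS]; /* 1 until the thread finishes */
static unsigned rng_state;     /* seeded per iteration */
static int running_now;        /* threads currently executing user code */
static int max_running;        /* high-water mark of running_now */
int total_steps;               /* plain int, serialized only by the scheduler */

static unsigned next_random(void) {
  rng_state = rng_state * 1103515245u + 12345u;
  return (rng_state >> 16) & 0x7fffu;
}

/* Choose a random runnable thread, or -1 if none. Caller holds sched_mu
 * (or runs before any worker thread exists). */
static int pick_next(void) {
  int candidates[NTHREADS], n = 0;
  for (int i = 0; i < NTHREADS; i++)
    if (runnable[i])
      candidates[n++] = i;
  return n ? candidates[next_random() % (unsigned)n] : -1;
}

/* Park until this thread is selected. Caller holds sched_mu. */
static void wait_for_turn(int self) {
  while (current != self)
    pthread_cond_wait(&sched_cv, &sched_mu);
  if (++running_now > max_running)
    max_running = running_now;
}

/* Scheduler point: select a thread (possibly self), wake it, then park. */
static void schedule_point(int self) {
  pthread_mutex_lock(&sched_mu);
  running_now--;
  current = pick_next();
  pthread_cond_broadcast(&sched_cv);
  wait_for_turn(self);
  pthread_mutex_unlock(&sched_mu);
}

static void *worker(void *arg) {
  int self = (int)(intptr_t)arg;
  pthread_mutex_lock(&sched_mu);
  wait_for_turn(self);
  pthread_mutex_unlock(&sched_mu);
  for (int i = 0; i < STEPS; i++) {
    total_steps++; /* safe: only the selected thread is running */
    schedule_point(self);
  }
  pthread_mutex_lock(&sched_mu);
  running_now--;
  runnable[self] = 0; /* thread exit hands control to someone else */
  current = pick_next();
  pthread_cond_broadcast(&sched_cv);
  pthread_mutex_unlock(&sched_mu);
  return NULL;
}

/* Run one iteration; returns the most threads ever running at once. */
int run_simulation(unsigned seed) {
  pthread_t t[NTHREADS];
  rng_state = seed;
  running_now = max_running = total_steps = 0;
  for (int i = 0; i < NTHREADS; i++)
    runnable[i] = 1;
  current = pick_next();
  for (int i = 0; i < NTHREADS; i++) {
    int rc = pthread_create(&t[i], NULL, worker, (void *)(intptr_t)i);
    assert(rc == 0);
    (void)rc;
  }
  for (int i = 0; i < NTHREADS; i++)
    pthread_join(t[i], NULL);
  return max_running;
}
```

In this sketch only the selected thread ever calls pick_next, so a
given seed yields the same sequence of choices. The patch documents a
similar property for simulate_start_iteration, which is described as
reproducing a specific interleaving.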

The intended use case for this feature is small unit tests for
atomic or thread-safe data structures. In the future, __tsan_simulate can
be integrated into test frameworks, allowing automatic simulation
support when compiled and linked with TSAN.
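
One way such an integration might look is sketched below. The helper
run_under_simulation and the counter test are hypothetical and not part
of this patch. The sketch assumes the __tsan_simulate signature shown
in the documentation added here, assumes that a return value of 0 means
success, and relies on the GCC/Clang weak attribute so that the same
test still links when the runtime does not provide the symbol.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Signature as documented in this patch. Declared weak (a GCC/Clang
 * extension) so the symbol may be absent at link time. */
extern int __tsan_simulate(void (*callback)(void *), void *arg)
    __attribute__((weak));

/* Hypothetical helper: simulate if the runtime offers it, else run once. */
int run_under_simulation(void (*callback)(void *), void *arg) {
  if (__tsan_simulate)
    return __tsan_simulate(callback, arg);
  callback(arg);
  return 0; /* assumption: 0 is the success value */
}

atomic_int counter;

static void *increment(void *arg) {
  (void)arg;
  atomic_fetch_add(&counter, 1);
  return NULL;
}

/* The callback may be executed many times, so it resets its own state. */
void counter_test(void *arg) {
  (void)arg;
  pthread_t t1, t2;
  atomic_store(&counter, 0);
  int rc1 = pthread_create(&t1, NULL, increment, NULL);
  int rc2 = pthread_create(&t2, NULL, increment, NULL);
  assert(rc1 == 0 && rc2 == 0);
  (void)rc1;
  (void)rc2;
  pthread_join(t1, NULL);
  pthread_join(t2, NULL);
  assert(atomic_load(&counter) == 2);
}
```

A test's main could then return run_under_simulation(counter_test, NULL)
and behave as an ordinary single-run test when simulation is
unavailable.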

Simulation mode explicitly rejects APIs, like sleep, that do not make
sense under simulation. In the future, more APIs may be added to the
supported list.

This change adds an initial minimum viable product. It does not support:
 - read/write mutexes
 - alternate scheduling algorithms, like full search, or other
   randomized distributions
 - std::atomic::wait/notify_*, which generally rely on OS-specific
   syscalls like futex that are not visible to TSAN
 - pthread timed APIs like pthread_mutex_timedlock
---
 clang/docs/ThreadSanitizer.rst                | 275 +++++++
 clang/include/clang/Driver/SanitizerArgs.h    |   2 +
 clang/include/clang/Options/Options.td        |   4 +
 clang/lib/Driver/SanitizerArgs.cpp            |  27 +
 clang/lib/Driver/ToolChains/Gnu.cpp           |   9 +
 compiler-rt/lib/tsan/rtl/CMakeLists.txt       |   2 +
 compiler-rt/lib/tsan/rtl/tsan.syms.extra      |   1 +
 compiler-rt/lib/tsan/rtl/tsan_flags.inc       |  23 +
 compiler-rt/lib/tsan/rtl/tsan_interceptors.h  |   9 +
 .../lib/tsan/rtl/tsan_interceptors_posix.cpp  |  73 +-
 compiler-rt/lib/tsan/rtl/tsan_interface.cpp   |  45 +-
 compiler-rt/lib/tsan/rtl/tsan_interface.h     |  21 +
 .../lib/tsan/rtl/tsan_interface_atomic.cpp    |   4 +
 compiler-rt/lib/tsan/rtl/tsan_rtl.h           |   3 +
 compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp  |  13 +
 compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp  |   4 +
 compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp  |   6 +-
 compiler-rt/lib/tsan/rtl/tsan_simulate.cpp    | 741 ++++++++++++++++++
 compiler-rt/lib/tsan/rtl/tsan_simulate.h      | 154 ++++
 .../test/tsan/simulate_cond_signal.cpp        |  62 ++
 .../test/tsan/simulate_deadlock_condvar.cpp   |  47 ++
 .../simulate_deadlock_missing_broadcast.cpp   |  69 ++
 .../test/tsan/simulate_deadlock_simple.cpp    |  54 ++
 .../test/tsan/simulate_double_join.cpp        |  27 +
 compiler-rt/test/tsan/simulate_empty_test.cpp |  24 +
 .../test/tsan/simulate_immediate_exit.cpp     |  39 +
 .../test/tsan/simulate_invalid_iterations.cpp |  16 +
 .../tsan/simulate_invalid_start_iteration.cpp |  16 +
 compiler-rt/test/tsan/simulate_iterations.cpp |  55 ++
 .../test/tsan/simulate_join_many_threads.cpp  |  52 ++
 .../test/tsan/simulate_max_depth_hit.cpp      |  41 +
 .../test/tsan/simulate_multiple_mutexes.cpp   |  51 ++
 .../test/tsan/simulate_mutex_contention.cpp   |  41 +
 .../test/tsan/simulate_nested_create.cpp      |  65 ++
 ...ulate_non_atomic_interleaved_rare_race.cpp |  38 +
 .../test/tsan/simulate_probability.cpp        |  47 ++
 compiler-rt/test/tsan/simulate_race_basic.cpp |  35 +
 compiler-rt/test/tsan/simulate_rare_race.cpp  |  77 ++
 .../tsan/simulate_schedule_between_joins.cpp  |  28 +
 .../simulate_shared_mutex_unsupported.cpp     |  31 +
 compiler-rt/test/tsan/simulate_sleep.cpp      |  24 +
 .../test/tsan/simulate_sleep_unsupported.cpp  |  24 +
 compiler-rt/test/tsan/simulate_spinlock.cpp   |  34 +
 .../test/tsan/simulate_start_iteration.cpp    |  42 +
 .../test/tsan/simulate_stress_condvar.cpp     |  56 ++
 .../test/tsan/simulate_stress_mutex.cpp       |  44 ++
 .../test/tsan/simulate_thread_detection.cpp   |  43 +
 .../test/tsan/simulate_thread_local_dtor.cpp  |  47 ++
 .../tsan/simulate_timed_mutex_unsupported.cpp |  40 +
 .../tsan/simulate_unsupported_interceptor.cpp |  32 +
 50 files changed, 2704 insertions(+), 13 deletions(-)
 create mode 100644 compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
 create mode 100644 compiler-rt/lib/tsan/rtl/tsan_simulate.h
 create mode 100644 compiler-rt/test/tsan/simulate_cond_signal.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_deadlock_simple.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_double_join.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_empty_test.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_immediate_exit.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_invalid_iterations.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_iterations.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_join_many_threads.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_max_depth_hit.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_mutex_contention.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_nested_create.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_probability.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_race_basic.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_rare_race.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_sleep.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_spinlock.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_start_iteration.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_stress_condvar.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_stress_mutex.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_thread_detection.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
 create mode 100644 compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp

diff --git a/clang/docs/ThreadSanitizer.rst b/clang/docs/ThreadSanitizer.rst
index ecbfbb6f170fa..987afffa2a622 100644
--- a/clang/docs/ThreadSanitizer.rst
+++ b/clang/docs/ThreadSanitizer.rst
@@ -327,6 +327,281 @@ Increase sampling frequency for mutex operations:
 
   $ TSAN_OPTIONS=enable_adaptive_delay=1:adaptive_delay_mutex_sample_rate=5 ./myapp
 
+Simulation Scheduler
+--------------------
+
+Overview
+~~~~~~~~
+
+The Simulation Scheduler is an optional ThreadSanitizer feature that enables
+systematic exploration of thread interleavings to expose data races that may be
+difficult to trigger in normal execution. Unlike standard ThreadSanitizer, which
+detects races as they occur naturally during program execution, the simulation
+scheduler takes control of thread scheduling to deliberately explore different
+execution orderings.
+
+Simulation is particularly useful for:
+
+* Testing concurrent data structures or algorithms during development to ensure
+  correctness (for example, a lock-free queue).
+* Finding races in rarely-executed interleavings that standard TSAN may miss.
+* Reproducing specific race conditions deterministically.
+
+Simulation is not useful for running full applications, and will likely not
+work in these scenarios. The code run in simulation should almost always be a
+small unit test exercising very specific functionality.
+
+When enabled via the ``__tsan_simulate()`` API, the simulation scheduler runs
+the program's concurrent code multiple times (iterations), choosing different
+thread interleavings in each iteration. The scheduler injects context switches
+at synchronization points (atomic operations, mutex operations, thread lifecycle
+events) to maximize coverage of possible execution orderings. If a data race is
+detected, the simulation stops and reports which iteration exposed the race,
+allowing that specific interleaving to be reproduced.
+
+Usage
+~~~~~
+
+To use simulation, wrap the concurrent code you want to test in a callback
+function and invoke it through the ``__tsan_simulate()`` API:
+
+.. code-block:: c
+
+    extern int __tsan_simulate(void (*callback)(void *), void *arg);
+
+    void test_concurrent_code(void *arg) {
+      // Create threads, run concurrent operations
+      pthread_t t1, t2;
+      pthread_create(&t1, NULL, thread_func, NULL);
+      pthread_create(&t2, NULL, thread_func, NULL);
+      pthread_join(t1, NULL);
+      pthread_join(t2, NULL);
+    }
+
+    int main() {
+      return __tsan_simulate(test_concurrent_code, NULL);
+    }
+
+Then compile with ThreadSanitizer and enable the simulation scheduler:
+
+.. code-block:: console
+
+  $ clang -fsanitize=thread -g -O1 mytest.c
+  $ TSAN_OPTIONS=simulate_scheduler=random ./a.out
+  ThreadSanitizer: simulation starting (iterations 0..999, max_depth=10000, scheduler=random)
+
+Automatic Main Wrapping
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+For convenience, the ``-fsanitize-thread-simulate-main`` compiler flag
+automatically wraps ``main()`` to call ``__tsan_simulate()``, eliminating the
+need to manually modify code:
+
+.. code-block:: c
+
+    // No need to call __tsan_simulate() manually
+    void *thread_func(void *arg) { /* ... */ }
+
+    int main() {
+      // This entire main() runs under simulation automatically
+      pthread_t t1, t2;
+      pthread_create(&t1, NULL, thread_func, NULL);
+      pthread_create(&t2, NULL, thread_func, NULL);
+      pthread_join(t1, NULL);
+      pthread_join(t2, NULL);
+      return 0;
+    }
+
+Compile and run:
+
+.. code-block:: console
+
+  $ clang -fsanitize=thread -fsanitize-thread-simulate-main -g -O1 mytest.c
+  $ TSAN_OPTIONS=simulate_scheduler=random ./a.out
+  ThreadSanitizer: simulation starting (iterations 0..999, max_depth=10000, scheduler=random)
+
+**Platform Support**: This flag requires GNU ld linker support for ``--wrap=main``
+and is currently only supported on Linux. Do not manually specify ``-Wl,--wrap=main``
+when using this flag, as the compiler handles the wrapping automatically.
+
+Configuration Options
+~~~~~~~~~~~~~~~~~~~~~
+
+.. list-table:: Simulation Scheduler Options
+   :name: simulation-scheduler-options-table
+   :header-rows: 1
+   :widths: 35 10 15 40
+
+   * - Flag
+     - Type
+     - Default
+     - Description
+   * - ``simulate_scheduler``
+     - string
+     - ""
+     - Scheduler algorithm for simulation. Supported values: ``"random"`` for
+       random scheduling decisions. Empty string (default) means simulation is
+       disabled. Must be set to enable simulation.
+   * - ``simulate_iterations``
+     - int
+     - 1000
+     - Number of iterations to run. Each iteration explores a different thread
+       interleaving. More iterations increase the likelihood of finding races but
+       take longer to complete.
+   * - ``simulate_start_iteration``
+     - int
+     - 0
+     - Starting iteration number. Useful for reproducing specific iteration
+       failures. Set this to the iteration number reported when a race was found
+       to reproduce that exact interleaving.
+   * - ``simulate_max_depth``
+     - int
+     - 10000
+     - Maximum number of scheduling decisions per iteration. If exceeded, the
+       iteration is aborted and simulation returns an error. Prevents infinite
+       loops or excessive scheduling overhead.
+   * - ``simulate_schedule_probability``
+     - int
+     - 100
+     - Probability (0-100%) of performing a context switch at each scheduling
+       point. Lower values let threads run longer between switches; 0 disables
+       context switching entirely. Useful for comparing simulation results
+       against sequential execution.
+   * - ``simulate_schedule_on_memory_access``
+     - bool
+     - false
+     - Insert scheduling points at every memory read/write during simulation for
+       maximum interleaving exploration. This can significantly increase overhead
+       but may expose additional races.
+   * - ``simulate_print_schedule_stacks``
+     - bool
+     - false
+     - Print stack trace at each scheduling point. Useful for debugging and
+       understanding exact interleavings, but generates significant output.
+
+Examples
+~~~~~~~~
+
+Basic race detection that standard TSAN rarely finds:
+
+.. code-block:: c
+
+    // Compile: clang -fsanitize=thread -g -O1 test.c
+    #include <pthread.h>
+    #include <stdatomic.h>
+
+    extern int __tsan_simulate(void (*callback)(void *), void *arg);
+
+    atomic_int d = 0;
+    int a = 0;  // Non-atomic - race target
+
+    void *thread_func(void *arg) {
+      atomic_fetch_add(&d, 1);
+      ++a;  // Data race!
+      atomic_fetch_add(&d, 1);
+      return NULL;
+    }
+
+    void test_callback(void *arg) {
+      pthread_t t1, t2;
+      pthread_create(&t1, NULL, thread_func, NULL);
+      pthread_create(&t2, NULL, thread_func, NULL);
+      pthread_join(t1, NULL);
+      pthread_join(t2, NULL);
+    }
+
+    int main() { return __tsan_simulate(test_callback, NULL); }
+
+Standard TSAN execution rarely detects this race. Running 100 times produces no
+output most of the time:
+
+.. code-block:: console
+
+  $ clang -fsanitize=thread -g -O1 test.c
+  $ for i in {1..100}; do ./a.out; done
+  (no output - race not detected)
+
+Run with simulation enabled:
+
+.. code-block:: console
+
+  $ TSAN_OPTIONS=simulate_scheduler=random:simulate_iterations=50 ./a.out
+  ThreadSanitizer: simulation starting (iterations 0..49, max_depth=10000, scheduler=random)
+  ==================
+  WARNING: ThreadSanitizer: data race
+    Write of size 4 at 0x... by thread T1:
+      #0 thread_func test.c:12
+
+    Previous write of size 4 at 0x... by thread T2:
+      #0 thread_func test.c:12
+  ==================
+  ThreadSanitizer: data race detected at iteration 4
+  ThreadSanitizer: to reproduce, set TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=4
+  ThreadSanitizer: simulation stopped due to race detection after 5 iterations
+
+To reproduce the specific iteration that found the race:
+
+.. code-block:: console
+
+  $ TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=4:simulate_iterations=1 ./a.out
+  ThreadSanitizer: simulation starting (iterations 4..4, max_depth=10000, scheduler=random)
+  ==================
+  WARNING: ThreadSanitizer: data race
+  ...
+
+Compare simulation results with sequential execution (no context switching):
+
+.. code-block:: console
+
+  $ TSAN_OPTIONS=simulate_scheduler=random:simulate_schedule_probability=0:simulate_iterations=100 ./a.out
+
+Deadlock detection
+~~~~~~~~~~~~~~~~~~
+
+Simulation detects actual deadlocks, i.e., states in which no thread is runnable
+and the program would remain blocked forever. For example:
+
+.. code-block:: c
+
+    // Compile: clang -fsanitize=thread -g -O1 deadlock.c -o deadlock
+    #include <pthread.h>
+
+    extern int __tsan_simulate(void (*callback)(void *), void *arg);
+
+    pthread_mutex_t mutex;
+    pthread_cond_t condvar;
+
+    void *thread_func(void *arg) {
+      pthread_mutex_lock(&mutex);
+      // Wait on condition variable that will never be signaled
+      pthread_cond_wait(&condvar, &mutex);
+      pthread_mutex_unlock(&mutex);
+      return NULL;
+    }
+
+    void test_callback(void *arg) {
+      pthread_mutex_init(&mutex, NULL);
+      pthread_cond_init(&condvar, NULL);
+
+      pthread_t t1;
+      pthread_create(&t1, NULL, thread_func, NULL);
+      pthread_join(t1, NULL);
+
+      pthread_cond_destroy(&condvar);
+      pthread_mutex_destroy(&mutex);
+    }
+
+    int main() { return __tsan_simulate(test_callback, NULL); }
+
+Run with simulation:
+
+.. code-block:: console
+
+  $ TSAN_OPTIONS=simulate_scheduler=random:simulate_iterations=2 ./deadlock
+  ThreadSanitizer: simulation starting (iterations 0..1, max_depth=10000, scheduler=random)
+  ThreadSanitizer: deadlock detected at iteration 0 - all threads are blocked
+  ThreadSanitizer: to reproduce, set TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=0
+
 More Information
 ----------------
 `<https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual>`_
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index ed2eb6852b124..2b9089db3c5fd 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -67,6 +67,7 @@ class SanitizerArgs {
   bool TsanMemoryAccess = true;
   bool TsanFuncEntryExit = true;
   bool TsanAtomics = true;
+  bool TsanSimulateMain = false;
   bool MinimalRuntime = false;
   bool TrapLoop = false;
   bool TysanOutlineInstrumentation = true;
@@ -100,6 +101,7 @@ class SanitizerArgs {
   }
   bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
+  bool needsTsanSimulateMain() const { return TsanSimulateMain; }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
   bool needsLsanRt() const {
diff --git a/clang/include/clang/Options/Options.td b/clang/include/clang/Options/Options.td
index 4ac812e92e2cb..b7e4a1491608d 100644
--- a/clang/include/clang/Options/Options.td
+++ b/clang/include/clang/Options/Options.td
@@ -2822,6 +2822,10 @@ def fno_sanitize_thread_atomics : Flag<["-"], "fno-sanitize-thread-atomics">,
                                   Group<f_clang_Group>,
                                   Visibility<[ClangOption, CLOption]>,
                                   HelpText<"Disable atomic operations instrumentation in ThreadSanitizer">;
+def fsanitize_thread_simulate_main
+    : Flag<["-"], "fsanitize-thread-simulate-main">,
+      Group<f_clang_Group>,
+      HelpText<"Wrap main() to run under ThreadSanitizer simulation mode">;
 def fsanitize_undefined_strip_path_components_EQ : Joined<["-"], "fsanitize-undefined-strip-path-components=">,
   Group<f_clang_Group>, MetaVarName<"<number>">,
   HelpText<"Strip (or keep only, if negative) a given number of path components "
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index 294c9ad2705dc..c2f38c952c4ee 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -850,6 +850,33 @@ SanitizerArgs::SanitizerArgs(const ToolChain &TC,
     TsanAtomics =
         Args.hasFlag(options::OPT_fsanitize_thread_atomics,
                      options::OPT_fno_sanitize_thread_atomics, TsanAtomics);
+    TsanSimulateMain = Args.hasArg(options::OPT_fsanitize_thread_simulate_main);
+
+    // -fsanitize-thread-simulate-main requires --wrap=main linker support,
+    // which is only available on Linux with GNU ld.
+    if (TsanSimulateMain && DiagnoseErrors && !TC.getTriple().isOSLinux()) {
+      D.Diag(diag::err_drv_unsupported_opt_for_target)
+          << "-fsanitize-thread-simulate-main" << TC.getTriple().str();
+      TsanSimulateMain = false;
+    }
+
+    // Check for conflicting -Wl,--wrap=main when using
+    // -fsanitize-thread-simulate-main
+    if (TsanSimulateMain && DiagnoseErrors) {
+      for (const Arg *A :
+           Args.filtered(options::OPT_Wl_COMMA, options::OPT_Xlinker)) {
+        for (StringRef Val : A->getValues()) {
+          if (Val == "--wrap=main" || Val == "-wrap=main") {
+            D.Diag(diag::err_drv_argument_not_allowed_with)
+                << "-fsanitize-thread-simulate-main"
+                << (A->getOption().matches(options::OPT_Wl_COMMA)
+                        ? "-Wl,--wrap=main"
+                        : "-Xlinker --wrap=main");
+            break;
+          }
+        }
+      }
+    }
   }
 
   if (AllAddedKinds & SanitizerKind::CFI) {
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index d8d537ec14b89..45d3ffa64ef90 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -20,6 +20,7 @@
 #include "clang/Driver/Compilation.h"
 #include "clang/Driver/Driver.h"
 #include "clang/Driver/MultilibBuilder.h"
+#include "clang/Driver/SanitizerArgs.h"
 #include "clang/Driver/Tool.h"
 #include "clang/Driver/ToolChain.h"
 #include "clang/Options/Options.h"
@@ -448,6 +449,14 @@ void tools::gnutools::Linker::ConstructJob(Compilation &C, const JobAction &JA,
 
   bool NeedsSanitizerDeps = addSanitizerRuntimes(ToolChain, Args, CmdArgs);
   bool NeedsXRayDeps = addXRayRuntime(ToolChain, Args, CmdArgs);
+
+  // Add --wrap=main for ThreadSanitizer simulation mode
+  if (NeedsSanitizerDeps) {
+    const SanitizerArgs &SanArgs = ToolChain.getSanitizerArgs(Args);
+    if (SanArgs.needsTsanRt() && SanArgs.needsTsanSimulateMain())
+      CmdArgs.push_back("--wrap=main");
+  }
+
   addLinkerCompressDebugSectionsOption(ToolChain, Args, CmdArgs);
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
 
diff --git a/compiler-rt/lib/tsan/rtl/CMakeLists.txt b/compiler-rt/lib/tsan/rtl/CMakeLists.txt
index 6f093500c8f61..b6227d02079a7 100644
--- a/compiler-rt/lib/tsan/rtl/CMakeLists.txt
+++ b/compiler-rt/lib/tsan/rtl/CMakeLists.txt
@@ -45,6 +45,7 @@ set(TSAN_SOURCES
   tsan_rtl_proc.cpp
   tsan_rtl_report.cpp
   tsan_rtl_thread.cpp
+  tsan_simulate.cpp
   tsan_stack_trace.cpp
   tsan_suppressions.cpp
   tsan_symbolize.cpp
@@ -101,6 +102,7 @@ set(TSAN_HEADERS
   tsan_report.h
   tsan_rtl.h
   tsan_shadow.h
+  tsan_simulate.h
   tsan_stack_trace.h
   tsan_suppressions.h
   tsan_symbolize.h
diff --git a/compiler-rt/lib/tsan/rtl/tsan.syms.extra b/compiler-rt/lib/tsan/rtl/tsan.syms.extra
index 03d17d21e74e8..bdb9a2ae07973 100644
--- a/compiler-rt/lib/tsan/rtl/tsan.syms.extra
+++ b/compiler-rt/lib/tsan/rtl/tsan.syms.extra
@@ -31,6 +31,7 @@ __tsan_create_fiber
 __tsan_destroy_fiber
 __tsan_switch_to_fiber
 __tsan_set_fiber_name
+__tsan_simulate
 __ubsan_*
 Annotate*
 WTFAnnotate*
diff --git a/compiler-rt/lib/tsan/rtl/tsan_flags.inc b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
index 68d4ba660debb..a250ca1bf21ff 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_flags.inc
+++ b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
@@ -119,3 +119,26 @@ TSAN_FLAG(const char*, adaptive_delay_max_atomic, "sleep_us=50",
 TSAN_FLAG(const char*, adaptive_delay_max_sync, "sleep_us=500",
           "Delay for sync operations: 'spin=N' (max N spins), 'yield', or "
           "'sleep_us=N' (max N>0 us sleep)")
+
+TSAN_FLAG(const char*, simulate_scheduler, "",
+          "Scheduler algorithm for __tsan_simulate. "
+          "Supported values: 'random'. Empty means simulation is disabled.")
+TSAN_FLAG(int, simulate_iterations, 1000,
+          "Number of iterations for __tsan_simulate.")
+TSAN_FLAG(int, simulate_start_iteration, 0,
+          "Starting iteration number for __tsan_simulate. Useful for "
+          "reproducing specific iteration failures.")
+TSAN_FLAG(int, simulate_max_depth, 10000,
+          "Maximum scheduling depth per iteration. If exceeded, the "
+          "simulation returns an error after the iteration completes")
+TSAN_FLAG(bool, simulate_schedule_on_memory_access, false,
+          "Insert scheduling points at every memory read/write during "
+          "simulation for maximum interleaving exploration.")
+TSAN_FLAG(int, simulate_schedule_probability, 100,
+          "Probability (0-100%) of actually performing a context switch at "
+          "each scheduling point. Lower values allow threads to complete more "
+          "operations before switching.")
+TSAN_FLAG(bool, simulate_print_schedule_stacks, false,
+          "Print stack trace at each simulation scheduling point. Useful for "
+          "understanding the exact interleavings, but it generates significant "
+          "output.")
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors.h b/compiler-rt/lib/tsan/rtl/tsan_interceptors.h
index f8cc8ff3b406f..9845a9a023a5b 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors.h
@@ -84,6 +84,15 @@ inline bool MustIgnoreInterceptor(ThreadState *thr) {
   if (MustIgnoreInterceptor(thr))            \
     return REAL(func)(__VA_ARGS__);
 
+// Mark an interceptor as unsupported during simulation. If simulation is
+// active, reports an error but continues with normal TSAN instrumentation.
+// The simulation will return an error at the end of the current iteration.
+#define SIMULATE_CHECK_UNSUPPORTED(func) \
+  if (UNLIKELY(SimulateIsActive())) {    \
+    SimulateReportUnsupported(#func);    \
+    return {};                           \
+  }
+
 #define SCOPED_TSAN_INTERCEPTOR_USER_CALLBACK_START() \
     si.DisableIgnores();
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 6ae871af11ea9..684c17b662fac 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -40,6 +40,7 @@
 #include "tsan_mman.h"
 #include "tsan_platform.h"
 #include "tsan_rtl.h"
+#include "tsan_simulate.h"
 #include "tsan_suppressions.h"
 
 using namespace __tsan;
@@ -93,6 +94,7 @@ extern "C" int pthread_key_create(unsigned *key, void (*destructor)(void* v));
 extern "C" int pthread_setspecific(unsigned key, const void *v);
 DECLARE_REAL(int, pthread_mutexattr_gettype, void *, void *)
 DECLARE_REAL(int, fflush, __sanitizer_FILE *fp)
+DECLARE_REAL(int, pthread_mutex_trylock, void* m)
 DECLARE_REAL_AND_INTERCEPTOR(void *, malloc, usize size)
 DECLARE_REAL_AND_INTERCEPTOR(void, free, void *ptr)
 extern "C" int pthread_equal(void *t1, void *t2);
@@ -381,6 +383,7 @@ struct BlockingCall {
 
 TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
   SCOPED_TSAN_INTERCEPTOR(sleep, sec);
+  SIMULATE_CHECK_UNSUPPORTED(sleep);
   unsigned res = BLOCK_REAL(sleep)(sec);
   AfterSleep(thr, pc);
   return res;
@@ -388,6 +391,7 @@ TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
 
 TSAN_INTERCEPTOR(int, usleep, long_t usec) {
   SCOPED_TSAN_INTERCEPTOR(usleep, usec);
+  SIMULATE_CHECK_UNSUPPORTED(usleep);
   int res = BLOCK_REAL(usleep)(usec);
   AfterSleep(thr, pc);
   return res;
@@ -395,6 +399,7 @@ TSAN_INTERCEPTOR(int, usleep, long_t usec) {
 
 TSAN_INTERCEPTOR(int, nanosleep, void *req, void *rem) {
   SCOPED_TSAN_INTERCEPTOR(nanosleep, req, rem);
+  SIMULATE_CHECK_UNSUPPORTED(nanosleep);
   int res = BLOCK_REAL(nanosleep)(req, rem);
   AfterSleep(thr, pc);
   return res;
@@ -1039,6 +1044,7 @@ struct ThreadParam {
   void* (*callback)(void *arg);
   void *param;
   Tid tid;
+  uptr pthread_handle;
   Semaphore created;
   Semaphore started;
 };
@@ -1064,12 +1070,14 @@ extern "C" void *__tsan_thread_start_func(void *arg) {
     Processor *proc = ProcCreate();
     ProcWire(proc, thr);
     ThreadStart(thr, p->tid, GetTid(), ThreadType::Regular);
+    SimulateThreadRegister(p->pthread_handle);
     p->started.Post();
   }
 
   AdaptiveDelay::BeforeChildThreadRuns();
+  SimulateBeforeChildThreadRuns();
 
-  void *res = callback(param);
+  void* res = callback(param);
   // Prevent the callback from being tail called,
   // it mixes up stack traces.
   volatile int foo = 42;
@@ -1120,6 +1128,7 @@ TSAN_INTERCEPTOR(int, pthread_create,
   if (res == 0) {
     p.tid = ThreadCreate(thr, pc, *(uptr *)th, IsStateDetached(detached));
     CHECK_NE(p.tid, kMainTid);
+    p.pthread_handle = *(uptr*)th;
     // Synchronization on p.tid serves two purposes:
     // 1. ThreadCreate must finish before the new thread starts.
     //    Otherwise the new thread can call pthread_detach, but the pthread_t
@@ -1133,6 +1142,7 @@ TSAN_INTERCEPTOR(int, pthread_create,
   if (attr == &myattr)
     pthread_attr_destroy(&myattr);
   AdaptiveDelay::AfterThreadCreation();
+  SimulateSchedule();
   return res;
 }
 
@@ -1156,7 +1166,13 @@ TSAN_INTERCEPTOR(int, pthread_join, void *th, void **ret) {
 #endif
   Tid tid = ThreadConsumeTid(thr, pc, (uptr)th);
   ThreadIgnoreBegin(thr, pc);
-  int res = BLOCK_REAL(pthread_join)(th, ret);
+  int res;
+  if (SimulateIsActive())
+    res = SimulateJoin(th, ret, [thr](void* th, void** ret) {
+      return BLOCK_REAL(pthread_join)(th, ret);
+    });
+  else
+    res = BLOCK_REAL(pthread_join)(th, ret);
   ThreadIgnoreEnd(thr);
   if (res == 0) {
     ThreadJoin(thr, pc, tid);
@@ -1214,6 +1230,7 @@ TSAN_INTERCEPTOR(int, pthread_tryjoin_np, void *th, void **ret) {
 TSAN_INTERCEPTOR(int, pthread_timedjoin_np, void *th, void **ret,
                  const struct timespec *abstime) {
   SCOPED_INTERCEPTOR_RAW(pthread_timedjoin_np, th, ret, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_timedjoin_np);
   Tid tid = ThreadConsumeTid(thr, pc, (uptr)th);
   ThreadIgnoreBegin(thr, pc);
   int res = BLOCK_REAL(pthread_timedjoin_np)(th, ret, abstime);
@@ -1308,17 +1325,20 @@ int cond_wait(ThreadState *thr, uptr pc, ScopedInterceptor *si, const Fn &fn,
   MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
   MutexUnlock(thr, pc, (uptr)m);
   int res = 0;
-  // This ensures that we handle mutex lock even in case of pthread_cancel.
-  // See test/tsan/cond_cancel.cpp.
-  {
+
+  if (SimulateIsActive()) {
+    res = SimulateCondWait(thr, pc, c, m);
+    if (res != 0 && res != errno_EOWNERDEAD)
+      return res;
+  } else {
     // Enable signal delivery while the thread is blocked.
     BlockingCall bc(thr);
     CondMutexUnlockCtx<Fn> arg = {si, thr, pc, m, c, fn};
     res = call_pthread_cancel_with_cleanup(
-        [](void *arg) -> int {
+        [](void* arg) -> int {
           return ((const CondMutexUnlockCtx<Fn> *)arg)->Cancel();
         },
-        [](void *arg) { ((const CondMutexUnlockCtx<Fn> *)arg)->Unlock(); },
+        [](void* arg) { ((const CondMutexUnlockCtx<Fn>*)arg)->Unlock(); },
         &arg);
   }
   if (res == errno_EOWNERDEAD) MutexRepair(thr, pc, (uptr)m);
@@ -1337,6 +1357,7 @@ INTERCEPTOR(int, pthread_cond_wait, void *c, void *m) {
 INTERCEPTOR(int, pthread_cond_timedwait, void *c, void *m, void *abstime) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_timedwait, cond, m, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_cond_timedwait);
   return cond_wait(
       thr, pc, &si,
       [=]() { return REAL(pthread_cond_timedwait)(cond, m, abstime); }, cond,
@@ -1348,6 +1369,7 @@ INTERCEPTOR(int, pthread_cond_clockwait, void *c, void *m,
             __sanitizer_clockid_t clock, void *abstime) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_clockwait, cond, m, clock, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_cond_clockwait);
   return cond_wait(
       thr, pc, &si,
       [=]() { return REAL(pthread_cond_clockwait)(cond, m, clock, abstime); },
@@ -1363,6 +1385,7 @@ INTERCEPTOR(int, pthread_cond_timedwait_relative_np, void *c, void *m,
             void *reltime) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_timedwait_relative_np, cond, m, reltime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_cond_timedwait_relative_np);
   return cond_wait(
       thr, pc, &si,
       [=]() {
@@ -1376,14 +1399,20 @@ INTERCEPTOR(int, pthread_cond_signal, void *c) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_signal, cond);
   MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
-  return REAL(pthread_cond_signal)(cond);
+  int res = REAL(pthread_cond_signal)(cond);
+  SimulateCondSignal((uptr)cond);
+  SimulateSchedule();
+  return res;
 }
 
 INTERCEPTOR(int, pthread_cond_broadcast, void *c) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_broadcast, cond);
   MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
-  return REAL(pthread_cond_broadcast)(cond);
+  int res = REAL(pthread_cond_broadcast)(cond);
+  SimulateCondBroadcast((uptr)cond);
+  SimulateSchedule();
+  return res;
 }
 
 INTERCEPTOR(int, pthread_cond_destroy, void *c) {
@@ -1429,7 +1458,17 @@ TSAN_INTERCEPTOR(int, pthread_mutex_lock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_mutex_lock, m);
   MutexPreLock(thr, pc, (uptr)m);
   AdaptiveDelay::SyncOp();
-  int res = BLOCK_REAL(pthread_mutex_lock)(m);
+  int res;
+  if (SimulateIsActive()) {
+    SimulateSchedule();
+    while (true) {
+      res = REAL(pthread_mutex_trylock)(m);
+      if (res != errno_EBUSY)
+        break;
+      SimulateMutexBlock((uptr)m);
+    }
+  } else
+    res = BLOCK_REAL(pthread_mutex_lock)(m);
   if (res == errno_EOWNERDEAD)
     MutexRepair(thr, pc, (uptr)m);
   if (res == 0 || res == errno_EOWNERDEAD)
@@ -1453,6 +1492,7 @@ TSAN_INTERCEPTOR(int, pthread_mutex_trylock, void *m) {
 #if !SANITIZER_APPLE
 TSAN_INTERCEPTOR(int, pthread_mutex_timedlock, void *m, void *abstime) {
   SCOPED_TSAN_INTERCEPTOR(pthread_mutex_timedlock, m, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_mutex_timedlock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_mutex_timedlock)(m, abstime);
   if (res == 0) {
@@ -1467,6 +1507,8 @@ TSAN_INTERCEPTOR(int, pthread_mutex_unlock, void *m) {
   MutexUnlock(thr, pc, (uptr)m);
   int res = REAL(pthread_mutex_unlock)(m);
   AdaptiveDelay::SyncOp();
+  SimulateMutexUnblock((uptr)m);
+  SimulateSchedule();
   if (res == errno_EINVAL)
     MutexInvalidAccess(thr, pc, (uptr)m);
   return res;
@@ -1540,6 +1582,7 @@ TSAN_INTERCEPTOR(int, pthread_spin_destroy, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_spin_lock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_spin_lock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_spin_lock);
   MutexPreLock(thr, pc, (uptr)m);
   AdaptiveDelay::SyncOp();
   int res = BLOCK_REAL(pthread_spin_lock)(m);
@@ -1551,6 +1594,7 @@ TSAN_INTERCEPTOR(int, pthread_spin_lock, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_spin_trylock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_spin_trylock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_spin_trylock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_spin_trylock)(m);
   if (res == 0) {
@@ -1561,6 +1605,7 @@ TSAN_INTERCEPTOR(int, pthread_spin_trylock, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_spin_unlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_spin_unlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_spin_unlock);
   MutexUnlock(thr, pc, (uptr)m);
   int res = REAL(pthread_spin_unlock)(m);
   AdaptiveDelay::SyncOp();
@@ -1588,6 +1633,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_destroy, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_rwlock_rdlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_rdlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_rdlock);
   MutexPreReadLock(thr, pc, (uptr)m);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_rwlock_rdlock)(m);
@@ -1599,6 +1645,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_rdlock, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_rwlock_tryrdlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_tryrdlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_tryrdlock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_rwlock_tryrdlock)(m);
   if (res == 0) {
@@ -1610,6 +1657,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_tryrdlock, void *m) {
 #if !SANITIZER_APPLE
 TSAN_INTERCEPTOR(int, pthread_rwlock_timedrdlock, void *m, void *abstime) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_timedrdlock, m, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_timedrdlock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_rwlock_timedrdlock)(m, abstime);
   if (res == 0) {
@@ -1621,6 +1669,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_timedrdlock, void *m, void *abstime) {
 
 TSAN_INTERCEPTOR(int, pthread_rwlock_wrlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_wrlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_wrlock);
   MutexPreLock(thr, pc, (uptr)m);
   AdaptiveDelay::SyncOp();
   int res = BLOCK_REAL(pthread_rwlock_wrlock)(m);
@@ -1632,6 +1681,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_wrlock, void *m) {
 
 TSAN_INTERCEPTOR(int, pthread_rwlock_trywrlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_trywrlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_trywrlock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_rwlock_trywrlock)(m);
   if (res == 0) {
@@ -1643,6 +1693,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_trywrlock, void *m) {
 #if !SANITIZER_APPLE
 TSAN_INTERCEPTOR(int, pthread_rwlock_timedwrlock, void *m, void *abstime) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_timedwrlock, m, abstime);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_timedwrlock);
   AdaptiveDelay::SyncOp();
   int res = REAL(pthread_rwlock_timedwrlock)(m, abstime);
   if (res == 0) {
@@ -1654,6 +1705,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_timedwrlock, void *m, void *abstime) {
 
 TSAN_INTERCEPTOR(int, pthread_rwlock_unlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_unlock, m);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_rwlock_unlock);
   MutexReadOrWriteUnlock(thr, pc, (uptr)m);
   int res = REAL(pthread_rwlock_unlock)(m);
   AdaptiveDelay::SyncOp();
@@ -1677,6 +1729,7 @@ TSAN_INTERCEPTOR(int, pthread_barrier_destroy, void *b) {
 
 TSAN_INTERCEPTOR(int, pthread_barrier_wait, void *b) {
   SCOPED_TSAN_INTERCEPTOR(pthread_barrier_wait, b);
+  SIMULATE_CHECK_UNSUPPORTED(pthread_barrier_wait);
   Release(thr, pc, (uptr)b);
   MemoryAccess(thr, pc, (uptr)b, 1, kAccessRead);
   int res = REAL(pthread_barrier_wait)(b);
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
index e6c4bf2e60a7b..d1f1510a34e62 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
@@ -11,10 +11,14 @@
 //===----------------------------------------------------------------------===//
 
 #include "tsan_interface.h"
-#include "tsan_interface_ann.h"
-#include "tsan_rtl.h"
+
 #include "sanitizer_common/sanitizer_internal_defs.h"
 #include "sanitizer_common/sanitizer_ptrauth.h"
+#include "tsan_interface_ann.h"
+#include "tsan_platform.h"
+#include "tsan_rtl.h"
+#include "tsan_shadow.h"
+#include "tsan_simulate.h"
 
 #define CALLERPC ((uptr)__builtin_return_address(0))
 
@@ -83,6 +87,43 @@ void __tsan_set_fiber_name(void *fiber, const char *name) {
 }
 }  // extern "C"
 
+int __tsan_simulate(void (*callback)(void* arg), void* arg) {
+  Initialize(cur_thread_init());
+  return SimulateRun(callback, arg);
+}
+
+#if SANITIZER_LINUX
+// Support for -fsanitize-thread-simulate-main linker wrapping.
+// --wrap is supported by ELF linkers (GNU ld, lld) but not by the macOS linker.
+extern "C" SANITIZER_WEAK_ATTRIBUTE int __real_main(int argc, char** argv,
+                                                    char** envp);
+
+namespace {
+struct MainArgs {
+  int argc;
+  char** argv;
+  char** envp;
+  int exit_code;
+};
+
+static void wrapped_main_callback(void* arg) {
+  MainArgs* args = static_cast<MainArgs*>(arg);
+  args->exit_code = __real_main(args->argc, args->argv, args->envp);
+}
+}  // namespace
+
+extern "C" int __wrap_main(int argc, char** argv, char** envp) {
+  MainArgs args = {argc, argv, envp, 0};
+  int sim_result = __tsan_simulate(wrapped_main_callback, &args);
+  // If simulation succeeded (return code 0 or exit due to no threads spawned),
+  // return the exit code from main. Otherwise, return the simulation error
+  // code.
+  if (sim_result == 0)
+    return args.exit_code;
+  return sim_result;
+}
+#endif  // SANITIZER_LINUX
+
 void __tsan_acquire(void *addr) {
   Acquire(cur_thread(), CALLERPC, (uptr)addr);
 }
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface.h b/compiler-rt/lib/tsan/rtl/tsan_interface.h
index db94cf48f9c2d..7caf84b0897f2 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface.h
@@ -90,6 +90,27 @@ SANITIZER_INTERFACE_ATTRIBUTE void __tsan_ignore_thread_end();
 
 SANITIZER_INTERFACE_ATTRIBUTE void __tsan_on_thread_idle();
 
+// Run a test function under simulation, exploring thread interleavings.
+// The callback is invoked repeatedly (controlled by TSAN_OPTIONS flags
+// simulate_iterations, simulate_max_depth, simulate_scheduler).
+// The callback should create threads, exercise concurrent data structures,
+// and assert correctness. The simulator ensures exactly one thread runs at
+// a time and randomly varies the interleaving at each sync point.
+//
+// Returns:
+//   0 - Success (all iterations completed without errors)
+//  -1 - Failure (pre-existing threads, unsupported interceptor, max depth hit,
+//       or race detected)
+//
+// Note: Deadlock detection calls Die() and does not return.
+//
+// LIMITATIONS:
+// - No other threads may be running when __tsan_simulate is called.
+// - Only pthread_mutex, pthread_cond, pthread_create/join, and atomics
+//   are supported. Other pthread primitives will fail the simulation.
+SANITIZER_INTERFACE_ATTRIBUTE
+int __tsan_simulate(void (*callback)(void* arg), void* arg);
+
 SANITIZER_INTERFACE_ATTRIBUTE
 void *__tsan_external_register_tag(const char *object_type);
 SANITIZER_INTERFACE_ATTRIBUTE
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
index 5c2461634d2d4..34241991ff8e6 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
@@ -25,6 +25,7 @@
 #include "tsan_flags.h"
 #include "tsan_interface.h"
 #include "tsan_rtl.h"
+#include "tsan_simulate.h"
 
 using namespace __tsan;
 
@@ -534,6 +535,9 @@ ALWAYS_INLINE auto AtomicDelayImpl(morder mo, AddrType addr, Types... args) {
 template <class Op, class... Types>
 ALWAYS_INLINE auto AtomicImpl(morder mo, Types... args) {
   AtomicDelayImpl(mo, args...);
+#  if !SANITIZER_GO
+  SimulateSchedule();
+#  endif
   ThreadState *const thr = cur_thread();
   ProcessPendingSignals(thr);
   if (UNLIKELY(thr->ignore_sync || thr->ignore_interceptors))
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl.h b/compiler-rt/lib/tsan/rtl/tsan_rtl.h
index 3d1018accafc4..44f4b3957891d 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl.h
@@ -243,6 +243,9 @@ struct alignas(SANITIZER_CACHE_LINE_SIZE) ThreadState {
 
   AdaptiveDelayState adaptive_delay_state;
 
+  // Simulation thread index. -1 when not participating in simulation.
+  int sim_thread_idx = -1;
+
   explicit ThreadState(Tid tid);
 };
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
index b2e70475e0b73..a187c81a23570 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
@@ -12,6 +12,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "tsan_rtl.h"
+#include "tsan_simulate.h"
 
 namespace __tsan {
 
@@ -423,6 +424,10 @@ ALWAYS_INLINE USED void MemoryAccess(ThreadState* thr, uptr pc, uptr addr,
   // Swift symbolizer can be intercepted and deadlock without this
   if (thr->in_symbolizer)
     return;
+#endif
+#if !SANITIZER_GO
+  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
+    SimulateSchedule();
 #endif
   RawShadow* shadow_mem = MemToShadow(addr);
   UNUSED char memBuf[4][64];
@@ -462,6 +467,10 @@ ALWAYS_INLINE USED void MemoryAccess16(ThreadState* thr, uptr pc, uptr addr,
   FastState fast_state = thr->fast_state;
   if (UNLIKELY(fast_state.GetIgnoreBit()))
     return;
+#if !SANITIZER_GO
+  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
+    SimulateSchedule();
+#endif
   Shadow cur(fast_state, 0, 8, typ);
   RawShadow* shadow_mem = MemToShadow(addr);
   bool traced = false;
@@ -499,6 +508,10 @@ ALWAYS_INLINE USED void UnalignedMemoryAccess(ThreadState* thr, uptr pc,
   FastState fast_state = thr->fast_state;
   if (UNLIKELY(fast_state.GetIgnoreBit()))
     return;
+#if !SANITIZER_GO
+  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
+    SimulateSchedule();
+#endif
   RawShadow* shadow_mem = MemToShadow(addr);
   bool traced = false;
   uptr size1 = Min<uptr>(size, RoundUp(addr + 1, kShadowCell) - addr);
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
index 4e58305b582d5..a7fa32a8d37c6 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
@@ -23,6 +23,7 @@
 #include "tsan_platform.h"
 #include "tsan_report.h"
 #include "tsan_rtl.h"
+#include "tsan_simulate.h"
 #include "tsan_suppressions.h"
 #include "tsan_symbolize.h"
 #include "tsan_sync.h"
@@ -717,6 +718,9 @@ bool OutputReport(ThreadState *thr, ScopedReport &srep) {
   if (flags()->halt_on_error)
     Die();
   thr->current_report = nullptr;
+#if !SANITIZER_GO
+  SimulateReportRace();
+#endif
   return true;
 }
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
index 978d853b0bc7e..717041b9d3577 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
@@ -11,10 +11,11 @@
 //===----------------------------------------------------------------------===//
 
 #include "sanitizer_common/sanitizer_placement_new.h"
-#include "tsan_rtl.h"
 #include "tsan_mman.h"
 #include "tsan_platform.h"
 #include "tsan_report.h"
+#include "tsan_rtl.h"
+#include "tsan_simulate.h"
 #include "tsan_sync.h"
 
 namespace __tsan {
@@ -237,6 +238,9 @@ void ThreadContext::OnStarted(void *arg) {
 
 void ThreadFinish(ThreadState *thr) {
   DPrintf("#%d: ThreadFinish\n", thr->tid);
+#if !SANITIZER_GO
+  SimulateThreadFinish();
+#endif
   ThreadCheckIgnore(thr);
   if (thr->stk_addr && thr->stk_size)
     DontNeedShadowFor(thr->stk_addr, thr->stk_size);
diff --git a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
new file mode 100644
index 0000000000000..b9880472a3fed
--- /dev/null
+++ b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
@@ -0,0 +1,741 @@
+//===-- tsan_simulate.cpp -------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is a part of ThreadSanitizer (TSan), a race detector.
+//
+//===----------------------------------------------------------------------===//
+
+#include "tsan_simulate.h"
+
+#include "interception/interception.h"
+#include "sanitizer_common/sanitizer_atomic.h"
+#include "sanitizer_common/sanitizer_errno.h"
+#include "sanitizer_common/sanitizer_placement_new.h"
+#include "tsan_flags.h"
+#include "tsan_rtl.h"
+
+extern "C" void* pthread_self();
+DECLARE_REAL(int, pthread_mutex_unlock, void* m)
+DECLARE_REAL(int, pthread_mutex_trylock, void* m)
+namespace __tsan {
+
+static constexpr int kMaxSimThreads = 64;
+
+static int sim_current_iteration = 0;
+
+static atomic_uint32_t sim_max_depth_hit;
+static atomic_uint32_t sim_race_detected;
+static atomic_uint32_t sim_unsupported_interceptor_called;
+
+void SimulateReportUnsupportedImpl(const char* func_name) {
+  atomic_store_relaxed(&sim_unsupported_interceptor_called, 1);
+  Printf(
+      "ThreadSanitizer: simulation error - unsupported interceptor called: "
+      "%s\n"
+      "Simulation does not support this synchronization primitive.\n",
+      func_name);
+}
+
+void SimulateReportRaceImpl() {
+  atomic_store_relaxed(&sim_race_detected, 1);
+  Printf("ThreadSanitizer: data race detected at iteration %d\n",
+         sim_current_iteration);
+}
+
+void SimulateReportDeadlock() {
+  Printf(
+      "ThreadSanitizer: deadlock detected at iteration %d - all threads are "
+      "blocked\n",
+      sim_current_iteration);
+  Printf(
+      "ThreadSanitizer: to reproduce, set "
+      "TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=%d\n",
+      sim_current_iteration);
+  Die();
+}
+
+namespace {
+
+struct SimThread {
+  enum State : u32 {
+    Unused = 0,
+    Runnable,  // May be selected by the scheduler.
+    Blocked,  // Blocked on mutex/condvar - scheduler must not pick this thread.
+    Finished,  // Thread has exited the simulation.
+  };
+
+  Semaphore sem;
+  State state;
+  uptr thread_handle;  // This thread's pthread_t (from pthread_self())
+  uptr joining_on;     // pthread_t this thread is joining on (0 if not joining)
+};
+
+// Waitset: tracks threads blocked waiting for a resource (mutex or condvar).
+struct Waitset {
+  static constexpr int kMaxWaiters = kMaxSimThreads;
+  int waiters[kMaxWaiters];
+  int count;
+
+  Waitset() { Reset(); }
+
+  void Reset() {
+    count = 0;
+    internal_memset(waiters, 0, sizeof(waiters));
+  }
+
+  void AddWaiter(int thread_idx) {
+    CHECK_LT(count, kMaxWaiters);
+    waiters[count++] = thread_idx;
+  }
+
+  // Randomly select and remove one thread from the waitset.
+  // Matches Relacy's approach to maximize interleaving exploration.
+  int RemoveOne(u32* rng_state) {
+    CHECK_GT(count, 0);
+    // Pick a random thread from the waitset.
+    int idx = RandN(rng_state, count);
+    int thread_idx = waiters[idx];
+    // Remove it by shifting remaining threads.
+    for (int i = idx + 1; i < count; i++) waiters[i - 1] = waiters[i];
+    count--;
+    return thread_idx;
+  }
+
+  // Remove all threads and return count.
+  int RemoveAll(int* out_threads) {
+    int n = count;
+    for (int i = 0; i < count; i++) out_threads[i] = waiters[i];
+    count = 0;
+    return n;
+  }
+};
+
+struct WaitsetMap {
+  struct Element {
+    uptr addr;
+    Waitset waitset;
+  };
+
+  static constexpr int kMaxElements = 256;
+  Element elements[kMaxElements];
+  int count = 0;
+
+  Waitset* Find(uptr addr) {
+    for (int i = 0; i < count; i++)
+      if (elements[i].addr == addr)
+        return &elements[i].waitset;
+    return nullptr;
+  }
+
+  Waitset* GetOrCreate(uptr addr) {
+    Waitset* ws = Find(addr);
+    if (ws)
+      return ws;
+
+    CHECK_LT(count, kMaxElements);
+    int idx = count++;
+    elements[idx].addr = addr;
+    elements[idx].waitset.Reset();
+    return &elements[idx].waitset;
+  }
+
+  void Reset() { count = 0; }
+};
+
+// SimScheduler controls which thread runs at each scheduling point. Exactly one
+// thread is designated as "current" and executes user code. Other runnable
+// threads park on their per-thread semaphore until the scheduler selects them.
+class SimScheduler {
+ public:
+  SimScheduler() : current_(-1), thread_count_(0), depth_(0) {
+    internal_memset(threads_, 0, sizeof(threads_));
+  }
+
+  void ResetForIteration() {
+    current_ = -1;
+    thread_count_ = 0;
+    depth_ = 0;
+    internal_memset(threads_, 0, sizeof(threads_));
+    mutex_waitsets_.Reset();
+    cond_waitsets_.Reset();
+  }
+
+  void StartIteration(u32 seed) {
+    rng_state_ = seed;
+    depth_ = 0;
+    current_ = 0;
+  }
+
+  // ------- Main scheduling point -------
+  //
+  // Called by the currently running thread. May randomly switch to another
+  // runnable thread.
+  void Schedule(int caller_idx) {
+    if (atomic_load_relaxed(&sim_max_depth_hit))
+      return;
+
+    CHECK_EQ(caller_idx, current_);
+
+    int max_depth = flags()->simulate_max_depth;
+    if (++depth_ > max_depth) {
+      atomic_store_relaxed(&sim_max_depth_hit, 1);
+      Printf("ThreadSanitizer: simulation hit max depth %d at iteration %d\n",
+             max_depth, sim_current_iteration);
+    }
+
+    int runnable = CountRunnable();
+    if (runnable <= 1)
+      return;
+
+    int chosen = PickRandomRunnable(runnable);
+
+    DumpStates(chosen, caller_idx);
+
+    if (chosen == caller_idx)
+      return;
+
+    current_ = chosen;
+    threads_[chosen].sem.Post();
+    threads_[caller_idx].sem.Wait();
+  }
+
+  // Thread lifecycle methods
+
+  int RegisterThread() {
+    if (thread_count_ >= kMaxSimThreads) {
+      Printf(
+          "ThreadSanitizer: simulation error - max thread count %d exceeded\n",
+          kMaxSimThreads);
+      Die();
+    }
+
+    int idx = thread_count_++;
+    threads_[idx].state = SimThread::Runnable;
+    threads_[idx].thread_handle = 0;
+    threads_[idx].joining_on = 0;
+    return idx;
+  }
+
+  void SetThreadHandle(int idx, uptr handle) {
+    threads_[idx].thread_handle = handle;
+  }
+
+  void ThreadStart(int idx) {
+    CHECK_NE(current_, -1);
+    threads_[idx].sem.Wait();
+  }
+
+  void ThreadFinish(int idx) {
+    threads_[idx].state = SimThread::Finished;
+    uptr my_handle = threads_[idx].thread_handle;
+    CHECK_NE(my_handle, 0);
+
+    if (my_handle != 0) {
+      for (int i = 0; i < thread_count_; i++) {
+        if (threads_[i].joining_on == my_handle) {
+          threads_[i].state = SimThread::Runnable;
+          threads_[i].joining_on = 0;
+        }
+      }
+    }
+
+    // Clear the handle to allow pthread_t reuse
+    threads_[idx].thread_handle = 0;
+
+    if (idx != current_)
+      return;
+
+    // We were current. Pick next runnable thread.
+    PickNextAndWake();
+  }
+
+  // ------- Blocking-call support -------
+
+  // Called BEFORE a pthread_join call. Records the target pthread_t handle.
+  void BeforeJoinCall(int idx, uptr target_handle) {
+    threads_[idx].state = SimThread::Blocked;
+    threads_[idx].joining_on = target_handle;
+
+    if (idx == current_) {
+      PickNextAndWake();
+    }
+  }
+
+  // Check if a thread with the given pthread_t handle is still active
+  // (i.e., not Finished). Returns false if thread not found or already
+  // finished.
+  bool IsThreadActive(uptr thread_handle) {
+    for (int i = 0; i < thread_count_; i++) {
+      if (threads_[i].state == SimThread::Finished)
+        continue;
+      if (threads_[i].thread_handle == thread_handle)
+        return true;
+    }
+    return false;
+  }
+
+  // Called AFTER a blocking OS call returns. Marks this thread as Runnable
+  // again. If no thread is currently running, this thread becomes current
+  // and returns immediately. Otherwise it parks until selected.
+  void AfterBlockingCall(int idx) {
+    threads_[idx].state = SimThread::Runnable;
+
+    if (current_ == -1)
+      current_ = idx;
+
+    // Another thread is running. Park until selected.
+    threads_[idx].sem.Wait();
+  }
+
+  int GetThreadCount() const { return thread_count_; }
+
+  void MutexBlock(int caller_idx, uptr mutex_addr) {
+    CHECK_EQ(caller_idx, current_);
+
+    Waitset* ws = mutex_waitsets_.GetOrCreate(mutex_addr);
+    ws->AddWaiter(caller_idx);
+
+    threads_[caller_idx].state = SimThread::Blocked;
+
+    PickNextAndWake();
+
+    // Park this thread until woken by unlock.
+    threads_[caller_idx].sem.Wait();
+  }
+
+  void MutexUnblock(uptr mutex_addr) {
+    Waitset* ws = mutex_waitsets_.Find(mutex_addr);
+
+    if (!ws || ws->count == 0)
+      return;
+
+    // Remove one waiter randomly and mark it as runnable.
+    int thread_idx = ws->RemoveOne(&rng_state_);
+    threads_[thread_idx].state = SimThread::Runnable;
+
+    // If no thread is current, make the unblocked thread current and wake it.
+    if (current_ == -1) {
+      current_ = thread_idx;
+      threads_[thread_idx].sem.Post();
+    }
+    // Otherwise it will be picked up by next Schedule() or when current
+    // finishes.
+  }
+
+  void CondWait(int caller_idx, uptr cond_addr, uptr mutex_addr) {
+    CHECK_EQ(caller_idx, current_);
+    if (caller_idx != current_)
+      return;
+
+    // Add this thread to the condvar's waitset.
+    Waitset* ws = cond_waitsets_.GetOrCreate(cond_addr);
+    ws->AddWaiter(caller_idx);
+
+    // Mark thread as blocked.
+    threads_[caller_idx].state = SimThread::Blocked;
+
+    // Pick next runnable thread and wake it.
+    PickNextAndWake();
+
+    // Park this thread until woken by signal/broadcast.
+    threads_[caller_idx].sem.Wait();
+  }
+
+  void CondSignal(uptr cond_addr) {
+    CHECK_NE(current_, -1);
+
+    Waitset* ws = cond_waitsets_.Find(cond_addr);
+
+    if (!ws || ws->count == 0)
+      return;
+
+    // Remove one waiter randomly and mark it as runnable.
+    int thread_idx = ws->RemoveOne(&rng_state_);
+    threads_[thread_idx].state = SimThread::Runnable;
+  }
+
+  void CondBroadcast(uptr cond_addr) {
+    CHECK_NE(current_, -1);
+
+    Waitset* ws = cond_waitsets_.Find(cond_addr);
+
+    if (!ws || ws->count == 0)
+      return;
+
+    int woken[kMaxSimThreads];
+    int n = ws->RemoveAll(woken);
+    for (int i = 0; i < n; i++) threads_[woken[i]].state = SimThread::Runnable;
+  }
+
+  bool ShouldSchedule() {
+    int schedule_probability_ = flags()->simulate_schedule_probability;
+    if (schedule_probability_ >= 100)
+      return true;
+    if (schedule_probability_ <= 0)
+      return false;
+    u32 rand_val = RandN(&rng_state_, 100);
+    return rand_val < static_cast<u32>(schedule_probability_);
+  }
+
+ private:
+  int CountRunnable() const {
+    int n = 0;
+    for (int i = 0; i < thread_count_; i++) {
+      if (threads_[i].state == SimThread::Runnable)
+        n++;
+    }
+    return n;
+  }
+
+  int PickRandomRunnable(int runnable) {
+    int target = RandN(&rng_state_, runnable);
+    for (int i = 0; i < thread_count_; i++) {
+      if (threads_[i].state == SimThread::Runnable) {
+        if (target == 0)
+          return i;
+        target--;
+      }
+    }
+    CHECK(false);  // should not reach here
+    return -1;
+  }
+
+  // Picks the next runnable thread and posts its semaphore, or sets
+  // current_ = -1 if none are runnable.
+  void PickNextAndWake() {
+    int runnable = CountRunnable();
+    if (runnable == 0) {
+      current_ = -1;
+      int blocked = 0;
+      for (int i = 0; i < thread_count_; i++)
+        if (threads_[i].state == SimThread::Blocked)
+          blocked++;
+
+      if (blocked > 0)
+        SimulateReportDeadlock();
+
+      // Should only happen when the callback thread is exiting.
+      return;
+    }
+
+    int chosen = PickRandomRunnable(runnable);
+    current_ = chosen;
+    threads_[chosen].sem.Post();
+  }
+
+  void DumpStates(int chosen = -1, int current = -1) {
+    if (common_flags()->verbosity >= 2) {
+      if (chosen >= 0) {
+        Printf("Chose tid %d to run", chosen);
+        if (current >= 0)
+          Printf(" (current %d)", current);
+        Printf(" - ");
+      }
+      Printf("Thread states: ");
+      for (int i = 0; i < thread_count_; i++) {
+        const char* state_str = "?";
+        switch (threads_[i].state) {
+          case SimThread::Unused:
+            state_str = "Unused";
+            break;
+          case SimThread::Runnable:
+            state_str = "Runnable";
+            break;
+          case SimThread::Blocked:
+            state_str = "Blocked";
+            break;
+          case SimThread::Finished:
+            state_str = "Finished";
+            break;
+        }
+        Printf("[%d:%s] ", i, state_str);
+      }
+      Printf("\n");
+    }
+  }
+
+ private:
+ public:
+  u32 rng_state_ = 0;
+  SimThread threads_[kMaxSimThreads];
+  int current_;
+  int thread_count_;
+  int depth_;
+
+  // Resource waitsets: map from resource address to waitset.
+  WaitsetMap mutex_waitsets_;
+  WaitsetMap cond_waitsets_;
+};
+
+}  // namespace
+
+// ---------------------------------------------------------------------------
+// Global state
+// ---------------------------------------------------------------------------
+
+bool sim_active;
+
+// Pointer to the current scheduler instance (valid while sim_active == true).
+static SimScheduler* sim_sched;
+
+class SimStateGuard {
+  SimScheduler* sched_;
+
+ public:
+  SimStateGuard(SimScheduler* sched) : sched_(sched) { sim_active = true; }
+  ~SimStateGuard() {
+    sim_active = false;
+    cur_thread()->sim_thread_idx = -1;
+    sim_sched = nullptr;
+    if (sched_) {
+      sched_->~SimScheduler();
+      InternalFree(sched_);
+    }
+  }
+  SimStateGuard(const SimStateGuard&) = delete;
+  SimStateGuard& operator=(const SimStateGuard&) = delete;
+};
+
+void SimulateScheduleImpl() {
+  ThreadState* thr = cur_thread();
+  CHECK_GE(thr->sim_thread_idx, 0);
+  if (!sim_sched->ShouldSchedule())
+    return;
+
+  if (flags()->simulate_print_schedule_stacks) {
+    Printf("=========== Schedule point (thread %d) ===========\n",
+           thr->sim_thread_idx);
+    PrintCurrentStack(thr, StackTrace::GetCurrentPc());
+    Printf("==================================================\n");
+  }
+
+  CHECK_GE(thr->sim_thread_idx, 0);
+  sim_sched->Schedule(thr->sim_thread_idx);
+}
+
+void SimulateThreadRegisterImpl(uptr thread_handle) {
+  ThreadState* thr = cur_thread();
+  thr->sim_thread_idx = sim_sched->RegisterThread();
+  sim_sched->SetThreadHandle(thr->sim_thread_idx, thread_handle);
+}
+
+void SimulateBeforeChildThreadRunsImpl() {
+  ThreadState* thr = cur_thread();
+  CHECK_GE(thr->sim_thread_idx, 0);
+  sim_sched->ThreadStart(thr->sim_thread_idx);
+}
+
+void SimulateThreadFinishImpl() {
+  ThreadState* thr = cur_thread();
+  int idx = thr->sim_thread_idx;
+  CHECK_GE(idx, 0);
+  thr->sim_thread_idx = -1;
+  sim_sched->ThreadFinish(idx);
+}
+
+bool SimulateJoinBlockImpl(uptr thread_handle) {
+  ThreadState* thr = cur_thread();
+  CHECK_GE(thr->sim_thread_idx, 0);
+  // Only mark ourselves as blocked if the target thread is still active.
+  // If it's already finished, pthread_join will return immediately.
+  if (sim_sched->IsThreadActive(thread_handle)) {
+    sim_sched->BeforeJoinCall(thr->sim_thread_idx, thread_handle);
+    return true;
+  }
+  return false;
+}
+
+void SimulateJoinResumeImpl() {
+  // After BLOCK_REAL(pthread_join) returns, the target thread's ThreadFinish
+  // marked us as Runnable and PickNextAndWake may have posted our semaphore.
+  // We must consume that post to re-sync with the scheduler, otherwise the
+  // pending post causes a future sem.Wait() to return spuriously, allowing
+  // two threads to run simultaneously.
+  sim_sched->threads_[cur_thread()->sim_thread_idx].sem.Wait();
+}
+
+void SimulateThreadUnblockImpl() {
+  ThreadState* thr = cur_thread();
+  CHECK_GE(thr->sim_thread_idx, 0);
+  sim_sched->AfterBlockingCall(thr->sim_thread_idx);
+}
+
+void SimulateMutexBlockImpl(uptr mutex_addr) {
+  ThreadState* thr = cur_thread();
+  CHECK_GE(thr->sim_thread_idx, 0);
+  sim_sched->MutexBlock(thr->sim_thread_idx, mutex_addr);
+}
+
+void SimulateMutexUnblockImpl(uptr mutex_addr) {
+  sim_sched->MutexUnblock(mutex_addr);
+}
+
+void SimulateCondSignalImpl(uptr cond_addr) {
+  sim_sched->CondSignal(cond_addr);
+}
+
+void SimulateCondBroadcastImpl(uptr cond_addr) {
+  sim_sched->CondBroadcast(cond_addr);
+}
+
+int CheckForErors(int iter, int start_iter) {
+  if (atomic_load_relaxed(&sim_unsupported_interceptor_called)) {
+    Printf("ThreadSanitizer: unsupported interceptor at iteration %d\n", iter);
+    Printf(
+        "ThreadSanitizer: to reproduce, set "
+        "TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=%d\n",
+        iter);
+    Printf("ThreadSanitizer: simulation aborted after %d iterations\n",
+           iter - start_iter + 1);
+    return -1;
+  }
+
+  if (atomic_load_relaxed(&sim_max_depth_hit)) {
+    Printf(
+        "ThreadSanitizer: to reproduce, set "
+        "TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=%d\n",
+        iter);
+    Printf(
+        "ThreadSanitizer: simulation stopped due to max depth after %d "
+        "iterations\n",
+        iter - start_iter + 1);
+    return -1;
+  }
+
+  if (atomic_load_relaxed(&sim_race_detected)) {
+    Printf(
+        "ThreadSanitizer: to reproduce, set "
+        "TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration=%d\n",
+        iter);
+    Printf(
+        "ThreadSanitizer: simulation stopped due to race detection after %d "
+        "iterations\n",
+        iter - start_iter + 1);
+    return -1;
+  }
+
+  return 0;
+}
+
+int SimulateRun(void (*callback)(void*), void* arg) {
+  const char* sched = flags()->simulate_scheduler;
+  if (!sched || !sched[0] || internal_strcmp(sched, "random") != 0) {
+    callback(arg);
+    return 0;
+  }
+
+  uptr running_threads = 0;
+  ctx->thread_registry.GetNumberOfThreads(nullptr, &running_threads, nullptr);
+  if (running_threads > 1) {
+    Printf(
+        "ThreadSanitizer: simulation cannot start - other threads are "
+        "running (%zu threads detected).\n"
+        "Simulation requires that only the calling thread exists. "
+        "Not running callback\n",
+        running_threads);
+    return -1;
+  }
+
+  atomic_store_relaxed(&sim_unsupported_interceptor_called, 0);
+  atomic_store_relaxed(&sim_max_depth_hit, 0);
+  atomic_store_relaxed(&sim_race_detected, 0);
+
+  int iterations = flags()->simulate_iterations;
+  if (iterations <= 0) {
+    Printf("ThreadSanitizer: simulate_iterations must be > 0 (got %d)\n",
+           iterations);
+    return -1;
+  }
+
+  int start_iter = flags()->simulate_start_iteration;
+  if (start_iter < 0) {
+    Printf("ThreadSanitizer: simulate_start_iteration must be >= 0 (got %d)\n",
+           start_iter);
+    return -1;
+  }
+
+  int prob = flags()->simulate_schedule_probability;
+  if (prob < 0 || prob > 100) {
+    Printf(
+        "ThreadSanitizer: simulate_schedule_probability must be >= 0 and "
+        "<= 100 (got %d)\n",
+        prob);
+    return -1;
+  }
+
+  int max_depth = flags()->simulate_max_depth;
+  Printf(
+      "ThreadSanitizer: simulation starting (iterations %d..%d, max_depth=%d, "
+      "scheduler=%s)\n",
+      start_iter, start_iter + iterations - 1, max_depth, sched);
+
+  void* sched_mem = InternalAlloc(sizeof(SimScheduler));
+  SimScheduler* sched_ptr = new (sched_mem) SimScheduler();
+  sim_sched = sched_ptr;
+
+  SimStateGuard guard(sched_ptr);
+
+  for (int iter = start_iter; iter < start_iter + iterations; iter++) {
+    sim_current_iteration = iter;
+
+    sched_ptr->ResetForIteration();
+
+    int main_idx = sched_ptr->RegisterThread();
+    CHECK_EQ(main_idx, 0);
+    cur_thread()->sim_thread_idx = main_idx;
+    sched_ptr->SetThreadHandle(main_idx, (uptr)pthread_self());
+
+    sched_ptr->StartIteration(iter);
+
+    DPrintf(1, "Start callback iter=%d\n", iter);
+    callback(arg);
+    DPrintf(1, "End callback iter=%d\n", iter);
+
+    if (iter == start_iter && sched_ptr->GetThreadCount() == 1) {
+      Printf("ThreadSanitizer: simulation exiting - no threads were spawned\n");
+      return 0;
+    }
+
+    if (int rc = CheckForErors(iter, start_iter); rc)
+      return rc;
+
+    sched_ptr->ThreadFinish(main_idx);
+  }
+
+  Printf("ThreadSanitizer: simulation finished (%d iterations)\n", iterations);
+  return 0;
+}
+
+int SimulateCondWait(ThreadState* thr, uptr pc, void* c, void* m) {
+  int res = REAL(pthread_mutex_unlock)(m);
+  CHECK_EQ(res, 0);
+
+  SimulateMutexUnblockImpl((uptr)m);
+
+  int idx = cur_thread()->sim_thread_idx;
+  CHECK_GE(idx, 0);
+  sim_sched->CondWait(idx, (uptr)c, (uptr)m);
+
+  // After waking, re-acquire the mutex (mimicking pthread_cond_wait
+  // behavior).
+  SimulateSchedule();
+  while (true) {
+    res = REAL(pthread_mutex_trylock)(m);
+    if (res == 0 || res == errno_EOWNERDEAD)
+      break;
+    if (res != errno_EBUSY) {
+      // Some other error - give up.
+      MutexPostLock(thr, pc, (uptr)m, MutexFlagDoPreLockOnPostLock);
+      return res;
+    }
+    SimulateMutexBlockImpl((uptr)m);
+  }
+  return res;
+}
+
+}  // namespace __tsan
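
Note on the "to reproduce, set ... simulate_start_iteration=%d" hints printed by CheckForErors above: they are only useful if the schedule of an iteration is a pure function of its iteration index. This part of the patch does not show how StartIteration(iter) seeds the scheduler, so the following is only a minimal sketch under that assumption. PickSchedule and its use of std::mt19937 are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper (not in the patch): derive a whole schedule from the
// iteration index alone, so that re-running with the same index replays the
// same sequence of thread choices.
std::vector<int> PickSchedule(int iter, int num_threads, int num_points) {
  assert(num_threads > 0 && num_points >= 0);
  std::mt19937 rng(static_cast<unsigned>(iter));  // seed == iteration index
  std::uniform_int_distribution<int> pick(0, num_threads - 1);
  std::vector<int> schedule;
  schedule.reserve(num_points);
  for (int i = 0; i < num_points; ++i)
    schedule.push_back(pick(rng));  // thread to run at scheduling point i
  return schedule;
}
```

Under this assumption, calling PickSchedule twice with the same iteration index yields the same schedule, which is what would let a failing iteration be replayed by starting the simulation at that index.
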
diff --git a/compiler-rt/lib/tsan/rtl/tsan_simulate.h b/compiler-rt/lib/tsan/rtl/tsan_simulate.h
new file mode 100644
index 0000000000000..010cba65c5b5c
--- /dev/null
+++ b/compiler-rt/lib/tsan/rtl/tsan_simulate.h
@@ -0,0 +1,154 @@
+//===-- tsan_simulate.h -----------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is a part of ThreadSanitizer (TSan), a race detector.
+//
+// Simulation scheduler for systematic thread interleaving exploration.
+// Inspired by Relacy Race Detector's random scheduler:
+// https://github.com/dvyukov/relacy
+//
+// When simulation is active, exactly one application thread runs at a time.
+// Other threads are parked on internal semaphores. At each sync point
+// (pthread_* calls, atomic operations), the running thread may yield to another
+// thread chosen by the scheduler.
+//===----------------------------------------------------------------------===//
+
+#ifndef TSAN_SIMULATE_H
+#define TSAN_SIMULATE_H
+
+#include "sanitizer_common/sanitizer_internal_defs.h"
+
+namespace __tsan {
+
+// TODO: Simulation would be more useful with the following features
+//  - Read/write mutex support
+//  - Timed pthread* API support
+//  - std::atomic::wait/notify* support (doesn't work today, because these APIs
+//    rely on direct OS futex calls which TSAN does not observe)
+//  - Alternate scheduling algorithms like full search or other random
+//    distributions
+
+// Run the simulation: invoke `callback(arg)` for `simulate_iterations`
+// iterations, exploring thread interleavings using the configured scheduler.
+// Returns 0 on success, -1 on error.
+//
+// Errors include
+//  - Pre-existing threads when simulation was started
+//  - Unsupported interceptor
+//  - Max simulation depth hit
+//  - Race detected
+//  - Deadlock detected (all simulated threads were blocked)
+//    Deadlock results in program termination via Die()
+//
+// If an unsupported interceptor is invoked, the simulation enters undefined
+// behavior from the ThreadSanitizer simulation perspective. The interceptor
+// may leave the simulation unable to advance (deadlocked), or the simulation
+// may still eventually return from SimulateRun.
+int SimulateRun(void (*callback)(void*), void* arg);
+
+extern bool sim_active;
+
+ALWAYS_INLINE bool SimulateIsActive() { return sim_active; }
+
+void SimulateScheduleImpl();
+void SimulateReportUnsupportedImpl(const char* func_name);
+void SimulateReportRaceImpl();
+void SimulateThreadRegisterImpl(uptr thread_handle);
+void SimulateBeforeChildThreadRunsImpl();
+void SimulateThreadFinishImpl();
+
+// SimulateSchedule is the key hook for simulation. It's called at each
+// scheduling point (atomic op, mutex/cv op, thread create/join). When
+// simulation is active, SimulateSchedule will check if another thread should
+// run, and if so, context switch to that thread.
+ALWAYS_INLINE void SimulateSchedule() {
+  if (!SimulateIsActive())
+    return;
+  SimulateScheduleImpl();
+}
+
+// Thread lifecycle
+
+ALWAYS_INLINE void SimulateThreadRegister(uptr thread_handle) {
+  if (!SimulateIsActive())
+    return;
+  SimulateThreadRegisterImpl(thread_handle);
+}
+
+ALWAYS_INLINE void SimulateBeforeChildThreadRuns() {
+  if (!SimulateIsActive())
+    return;
+  SimulateBeforeChildThreadRunsImpl();
+}
+
+ALWAYS_INLINE void SimulateThreadFinish() {
+  if (!SimulateIsActive())
+    return;
+  SimulateThreadFinishImpl();
+}
+
+// Mutex/cv ops
+
+void SimulateMutexBlockImpl(uptr mutex_addr);
+void SimulateMutexUnblockImpl(uptr mutex_addr);
+void SimulateCondSignalImpl(uptr cond_addr);
+void SimulateCondBroadcastImpl(uptr cond_addr);
+
+ALWAYS_INLINE void SimulateMutexBlock(uptr mutex_addr) {
+  if (!SimulateIsActive())
+    return;
+  SimulateMutexBlockImpl(mutex_addr);
+}
+
+ALWAYS_INLINE void SimulateMutexUnblock(uptr mutex_addr) {
+  if (!SimulateIsActive())
+    return;
+  SimulateMutexUnblockImpl(mutex_addr);
+}
+
+ALWAYS_INLINE void SimulateCondSignal(uptr cond_addr) {
+  if (!SimulateIsActive())
+    return;
+  SimulateCondSignalImpl(cond_addr);
+}
+
+ALWAYS_INLINE void SimulateCondBroadcast(uptr cond_addr) {
+  if (!SimulateIsActive())
+    return;
+  SimulateCondBroadcastImpl(cond_addr);
+}
+
+bool SimulateJoinBlockImpl(uptr thread_handle);
+void SimulateJoinResumeImpl();
+template <class JoinFunction>
+int SimulateJoin(void* th, void** ret, JoinFunction join_function) {
+  bool sim_blocked = SimulateJoinBlockImpl((uptr)th);
+  int res = join_function(th, ret);
+  if (sim_blocked)
+    SimulateJoinResumeImpl();
+  return res;
+}
+
+struct ThreadState;
+int SimulateCondWait(ThreadState* thr, uptr pc, void* c, void* m);
+
+ALWAYS_INLINE void SimulateReportUnsupported(const char* func_name) {
+  if (!SimulateIsActive())
+    return;
+  SimulateReportUnsupportedImpl(func_name);
+}
+
+ALWAYS_INLINE void SimulateReportRace() {
+  if (!SimulateIsActive())
+    return;
+  SimulateReportRaceImpl();
+}
+
+}  // namespace __tsan
+
+#endif  // TSAN_SIMULATE_H
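
The header comment above says that exactly one application thread runs at a time while the others are parked, and that the running thread may hand control to another thread at each sync point. A minimal standalone sketch of that hand-off follows. It is an illustration, not the patch's implementation: TurnScheduler and RunRoundRobin are hypothetical names, the sketch parks threads on a mutex and condition variable rather than internal semaphores, it picks the next thread round-robin rather than randomly, and it assumes every worker runs the same number of steps so that no blocked or finished thread is ever selected.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the simulation scheduler: `current_` names the
// single thread that is allowed to run; every other thread is parked.
class TurnScheduler {
 public:
  explicit TurnScheduler(int num_threads) : num_threads_(num_threads) {
    assert(num_threads > 0);
  }

  // Park the caller until it has been selected to run.
  void WaitTurn(int idx) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [&] { return current_ == idx; });
  }

  // Scheduling point: select the next thread, then park until selected again.
  void Yield(int idx) {
    PassTo((idx + 1) % num_threads_);
    WaitTurn(idx);
  }

  // Thread exit: hand control to the next thread without parking.
  void Finish(int idx) { PassTo((idx + 1) % num_threads_); }

 private:
  void PassTo(int next) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      current_ = next;
    }
    cv_.notify_all();
  }

  const int num_threads_;
  std::mutex mu_;
  std::condition_variable cv_;
  int current_ = 0;
};

// Runs `num_threads` workers for `steps` steps each and returns the order in
// which the steps executed. Only the selected thread touches `trace`, and the
// hand-off through the scheduler's mutex orders those accesses, so `trace`
// needs no lock of its own.
std::vector<int> RunRoundRobin(int num_threads, int steps) {
  TurnScheduler sched(num_threads);
  std::vector<int> trace;
  std::vector<std::thread> workers;
  for (int idx = 0; idx < num_threads; ++idx) {
    workers.emplace_back([&, idx] {
      sched.WaitTurn(idx);
      for (int s = 0; s < steps; ++s) {
        trace.push_back(idx);  // the "work" done between scheduling points
        if (s + 1 < steps)
          sched.Yield(idx);
      }
      sched.Finish(idx);
    });
  }
  for (auto& w : workers)
    w.join();
  return trace;
}
```

Because only the selected thread ever runs, the interleaving is decided entirely by the scheduler's choice at each scheduling point. Replacing the round-robin choice with a seeded random one gives the kind of exploration the patch describes, while real threads keep executing real code.
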
diff --git a/compiler-rt/test/tsan/simulate_cond_signal.cpp b/compiler-rt/test/tsan/simulate_cond_signal.cpp
new file mode 100644
index 0000000000000..1994e64056e40
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_cond_signal.cpp
@@ -0,0 +1,62 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+pthread_cond_t cond;
+int ready = 0;
+int woken_count = 0;
+
+void *waiter_thread(void *arg) {
+  pthread_mutex_lock(&mutex);
+  while (ready == 0)
+    pthread_cond_wait(&cond, &mutex);
+  woken_count++;
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void *signaler_thread(void *arg) {
+  // Signal twice to wake both waiters
+  pthread_mutex_lock(&mutex);
+  ready = 1;
+  pthread_mutex_unlock(&mutex);
+
+  pthread_cond_signal(&cond);
+  pthread_cond_signal(&cond);
+
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  ready = 0;
+  woken_count = 0;
+  pthread_mutex_init(&mutex, nullptr);
+  pthread_cond_init(&cond, nullptr);
+
+  pthread_t waiter1, waiter2, signaler;
+
+  pthread_create(&waiter1, nullptr, waiter_thread, nullptr);
+  pthread_create(&waiter2, nullptr, waiter_thread, nullptr);
+
+  pthread_create(&signaler, nullptr, signaler_thread, nullptr);
+
+  pthread_join(signaler, nullptr);
+  pthread_join(waiter1, nullptr);
+  pthread_join(waiter2, nullptr);
+
+  pthread_cond_destroy(&cond);
+  pthread_mutex_destroy(&mutex);
+
+  assert(ready == 1);
+  assert(woken_count == 2);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp b/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
new file mode 100644
index 0000000000000..c82cdf0f20f9b
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
@@ -0,0 +1,47 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+#include <unistd.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+pthread_cond_t condvar;
+
+void *thread_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  // Wait on condition variable that will never be signaled
+  pthread_cond_wait(&condvar, &mutex);
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_mutex_init(&mutex, nullptr);
+  pthread_cond_init(&condvar, nullptr);
+
+  pthread_t t1;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+
+  pthread_join(t1, nullptr);
+
+  assert(false); // never hit
+
+  pthread_cond_destroy(&condvar);
+  pthread_mutex_destroy(&mutex);
+}
+
+int main() {
+  alarm(10); // Test timeout
+  __tsan_simulate(test_callback, nullptr);
+
+  // Deadlock will cause Die() - this will not return
+  assert(false);
+  return 1;
+}
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: deadlock detected at iteration {{[0-9]+}} - all threads are blocked
+// CHECK: ThreadSanitizer: to reproduce, set TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration={{[0-9]+}}
diff --git a/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp b/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
new file mode 100644
index 0000000000000..cc2e66ad4c4ba
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
@@ -0,0 +1,69 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 not %run %t 2>&1 | FileCheck %s
+
+// Test condition variable missing broadcast deadlock.
+// Scenario: Two threads wait on condition, but only one signal is sent
+// Result: One waiter is left blocked forever -> deadlock
+
+#include <assert.h>
+#include <pthread.h>
+#include <unistd.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+pthread_cond_t cond;
+int c = 0;
+
+void *thread1_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  c = 1;
+  pthread_cond_signal(&cond); // Only wakes ONE thread!
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void *thread2_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  while (c != 1)
+    pthread_cond_wait(&cond, &mutex);
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void *thread3_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  while (c != 1)
+    pthread_cond_wait(&cond, &mutex);
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_mutex_init(&mutex, nullptr);
+  pthread_cond_init(&cond, nullptr);
+
+  c = 0;
+  pthread_t t1, t2, t3;
+  pthread_create(&t3, nullptr, thread3_func, nullptr);
+  pthread_create(&t2, nullptr, thread2_func, nullptr);
+  pthread_create(&t1, nullptr, thread1_func, nullptr);
+
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+  pthread_join(t3, nullptr);
+
+  pthread_cond_destroy(&cond);
+  pthread_mutex_destroy(&mutex);
+}
+
+int main() {
+  alarm(10); // Test timeout
+  __tsan_simulate(test_callback, nullptr);
+  assert(false);
+  return 1;
+}
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: deadlock detected at iteration {{[0-9]+}} - all threads are blocked
+// CHECK: ThreadSanitizer: to reproduce, set TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration={{[0-9]+}}
diff --git a/compiler-rt/test/tsan/simulate_deadlock_simple.cpp b/compiler-rt/test/tsan/simulate_deadlock_simple.cpp
new file mode 100644
index 0000000000000..b1010ff09483d
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_deadlock_simple.cpp
@@ -0,0 +1,54 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 not %run %t 2>&1 | FileCheck %s
+//
+// Test simple potential-deadlock detection: 2 threads, 2 mutexes, circular dependency.
+// Thread 1: lock(A) -> lock(B)
+// Thread 2: lock(B) -> lock(A)
+// TSAN should detect the lock-order-inversion (potential deadlock).
+
+#include <pthread.h>
+#include <unistd.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex_a;
+pthread_mutex_t mutex_b;
+
+void *thread1_func(void *arg) {
+  pthread_mutex_lock(&mutex_a);
+  pthread_mutex_lock(&mutex_b);
+
+  pthread_mutex_unlock(&mutex_b);
+  pthread_mutex_unlock(&mutex_a);
+  return nullptr;
+}
+
+void *thread2_func(void *arg) {
+  pthread_mutex_lock(&mutex_b);
+  pthread_mutex_lock(&mutex_a);
+
+  pthread_mutex_unlock(&mutex_a);
+  pthread_mutex_unlock(&mutex_b);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_mutex_init(&mutex_a, nullptr);
+  pthread_mutex_init(&mutex_b, nullptr);
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread1_func, nullptr);
+  pthread_create(&t2, nullptr, thread2_func, nullptr);
+
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+
+  pthread_mutex_destroy(&mutex_a);
+  pthread_mutex_destroy(&mutex_b);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
+// CHECK: Cycle in lock order graph
diff --git a/compiler-rt/test/tsan/simulate_double_join.cpp b/compiler-rt/test/tsan/simulate_double_join.cpp
new file mode 100644
index 0000000000000..75fc42392bb10
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_double_join.cpp
@@ -0,0 +1,27 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=5 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+void *thread_func(void *arg) { return nullptr; }
+
+void test_callback(void *arg) {
+  pthread_t t;
+
+  int id1 = 1;
+  pthread_create(&t, nullptr, thread_func, &id1);
+  int res1 = pthread_join(t, nullptr);
+  assert(res1 == 0);
+
+  int id2 = 2;
+  pthread_create(&t, nullptr, thread_func, &id2);
+  int res2 = pthread_join(t, nullptr);
+  assert(res2 == 0);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: simulation starting
diff --git a/compiler-rt/test/tsan/simulate_empty_test.cpp b/compiler-rt/test/tsan/simulate_empty_test.cpp
new file mode 100644
index 0000000000000..f4a86855618d8
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_empty_test.cpp
@@ -0,0 +1,24 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:simulate_scheduler=random %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+#include <stdio.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+static int called;
+void test_callback(void *arg) {
+  ++called;
+  fprintf(stderr, "Callback executed with no threads\n");
+}
+
+int main() {
+  int result = __tsan_simulate(test_callback, nullptr);
+  assert(called == 1);
+  return result;
+}
+
+// CHECK: ThreadSanitizer: simulation starting (iterations 0..
+// CHECK: Callback executed with no threads
+// CHECK: ThreadSanitizer: simulation exiting - no threads were spawned
diff --git a/compiler-rt/test/tsan/simulate_immediate_exit.cpp b/compiler-rt/test/tsan/simulate_immediate_exit.cpp
new file mode 100644
index 0000000000000..76c319b03b9f0
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_immediate_exit.cpp
@@ -0,0 +1,39 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+#include <stdio.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+void *thread_func(void *arg) { return nullptr; }
+
+void test_callback(void *arg) {
+  pthread_t threads[5];
+
+  for (int i = 0; i < 5; i++) {
+    pthread_create(&threads[i], nullptr, thread_func, nullptr);
+  }
+
+  for (int i = 0; i < 5; i++) {
+    pthread_join(threads[i], nullptr);
+  }
+
+  fprintf(stderr, "All immediate-exit threads joined successfully\n");
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK: All immediate-exit threads joined successfully
+// CHECK-NOT: All immediate-exit threads joined successfully
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_invalid_iterations.cpp b/compiler-rt/test/tsan/simulate_invalid_iterations.cpp
new file mode 100644
index 0000000000000..77ecdfafae6da
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_invalid_iterations.cpp
@@ -0,0 +1,16 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=simulate_scheduler=random:simulate_iterations=0 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-ZERO
+// RUN: %env_tsan_opts=simulate_scheduler=random:simulate_iterations=-1 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-NEGATIVE
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+void test_callback(void *arg) { assert(0); }
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-ZERO: ThreadSanitizer: simulate_iterations must be > 0 (got 0)
+
+// CHECK-NEGATIVE: ThreadSanitizer: simulate_iterations must be > 0 (got -1)
diff --git a/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp b/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
new file mode 100644
index 0000000000000..89019e6a94606
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
@@ -0,0 +1,16 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=simulate_scheduler=random:simulate_iterations=10:simulate_start_iteration=-1 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-NEG1
+// RUN: %env_tsan_opts=simulate_scheduler=random:simulate_iterations=10:simulate_start_iteration=-5 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-NEG5
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+void test_callback(void *arg) { assert(0); }
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-NEG1: ThreadSanitizer: simulate_start_iteration must be >= 0 (got -1)
+
+// CHECK-NEG5: ThreadSanitizer: simulate_start_iteration must be >= 0 (got -5)
diff --git a/compiler-rt/test/tsan/simulate_iterations.cpp b/compiler-rt/test/tsan/simulate_iterations.cpp
new file mode 100644
index 0000000000000..82fdcc780b204
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_iterations.cpp
@@ -0,0 +1,55 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=1 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-1
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-10
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=100 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-100
+
+#include <assert.h>
+#include <pthread.h>
+#include <stdio.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+int counter = 0;
+int total_runs = 0;
+
+void *thread_func(void *arg) {
+  for (int i = 0; i != 10; ++i) {
+    pthread_mutex_lock(&mutex);
+    counter++;
+    pthread_mutex_unlock(&mutex);
+  }
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  ++total_runs;
+  counter = 0;
+  pthread_mutex_init(&mutex, nullptr);
+
+  pthread_t ts[10];
+  for (auto &t : ts)
+    pthread_create(&t, nullptr, thread_func, nullptr);
+
+  for (auto &t : ts)
+    pthread_join(t, nullptr);
+
+  assert(counter == 100);
+
+  pthread_mutex_destroy(&mutex);
+}
+
+int main() {
+  int result = __tsan_simulate(test_callback, nullptr);
+  fprintf(stderr, "total_runs=%d\n", total_runs);
+  return result;
+}
+
+// CHECK-1: ThreadSanitizer: simulation starting (iterations 0..0
+// CHECK-1: total_runs=1{{$}}
+
+// CHECK-10: ThreadSanitizer: simulation starting (iterations 0..9
+// CHECK-10: total_runs=10{{$}}
+//
+// CHECK-100: ThreadSanitizer: simulation starting (iterations 0..99
+// CHECK-100: total_runs=100{{$}}
diff --git a/compiler-rt/test/tsan/simulate_join_many_threads.cpp b/compiler-rt/test/tsan/simulate_join_many_threads.cpp
new file mode 100644
index 0000000000000..f6599efe95d3d
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_join_many_threads.cpp
@@ -0,0 +1,52 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s
+
+// Test nested thread join chain scenario - should work correctly.
+// Scenario: T1 creates/joins T2, T2 creates/joins T3, ... up to 16 levels.
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+constexpr int kMaxLevels = 16;
+
+int counter = 0;
+pthread_mutex_t mutex;
+
+struct ThreadArg {
+  int level;
+};
+
+void *thread_chain_func(void *arg) {
+  ThreadArg *thread_arg = static_cast<ThreadArg *>(arg);
+  if (thread_arg->level >= kMaxLevels)
+    return nullptr;
+
+  pthread_mutex_lock(&mutex);
+  counter++;
+  pthread_mutex_unlock(&mutex);
+
+  pthread_t child;
+  ThreadArg child_arg = {thread_arg->level + 1};
+  pthread_create(&child, nullptr, thread_chain_func, &child_arg);
+  pthread_join(child, nullptr);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  counter = 0;
+  pthread_t root;
+  ThreadArg root_arg = {1};
+
+  pthread_mutex_init(&mutex, nullptr);
+  pthread_create(&root, nullptr, thread_chain_func, &root_arg);
+  pthread_join(root, nullptr);
+  pthread_mutex_destroy(&mutex);
+
+  assert(counter == 15);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
diff --git a/compiler-rt/test/tsan/simulate_max_depth_hit.cpp b/compiler-rt/test/tsan/simulate_max_depth_hit.cpp
new file mode 100644
index 0000000000000..cb38ad88a339c
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_max_depth_hit.cpp
@@ -0,0 +1,41 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=100:simulate_max_depth=100 not %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <atomic>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+static std::atomic<int> counter(0);
+
+void *thread_func(void *arg) {
+  for (int i = 0; i < 200; i++) {
+    counter.fetch_add(1, std::memory_order_relaxed);
+  }
+  return nullptr;
+}
+
+static int called;
+
+void test_callback(void *arg) {
+  counter.store(0, std::memory_order_relaxed);
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+
+  ++called;
+}
+
+int main() {
+  called = 0;
+  int result = __tsan_simulate(test_callback, nullptr);
+  assert(called == 1);
+  return result;
+}
+
+// CHECK: ThreadSanitizer: simulation stopped due to max depth
diff --git a/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp b/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
new file mode 100644
index 0000000000000..86714e99dbc1d
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
@@ -0,0 +1,51 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+const int num_mutexes = 10;
+const int num_threads = 5;
+pthread_mutex_t mutexes[num_mutexes];
+int counter = 0;
+
+void *thread_func(void *arg) {
+  // Lock all mutexes in order
+  for (int i = 0; i < num_mutexes; i++)
+    pthread_mutex_lock(&mutexes[i]);
+
+  // Critical section: increment counter
+  counter++;
+
+  // Unlock all mutexes in reverse order
+  for (int i = num_mutexes - 1; i >= 0; i--)
+    pthread_mutex_unlock(&mutexes[i]);
+
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  for (int i = 0; i < num_mutexes; i++)
+    pthread_mutex_init(&mutexes[i], nullptr);
+  counter = 0;
+
+  pthread_t threads[num_threads];
+
+  for (int i = 0; i < num_threads; i++)
+    pthread_create(&threads[i], nullptr, thread_func, nullptr);
+
+  for (int i = 0; i < num_threads; i++)
+    pthread_join(threads[i], nullptr);
+
+  assert(counter == num_threads);
+
+  for (int i = 0; i < num_mutexes; i++)
+    pthread_mutex_destroy(&mutexes[i]);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_mutex_contention.cpp b/compiler-rt/test/tsan/simulate_mutex_contention.cpp
new file mode 100644
index 0000000000000..b52d108a2cf27
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_mutex_contention.cpp
@@ -0,0 +1,41 @@
+// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mtx;
+int shared = 0;
+
+void *thread_func(void *arg) {
+  for (int i = 0; i < 10; i++) {
+    pthread_mutex_lock(&mtx);
+    shared++;
+    pthread_mutex_unlock(&mtx);
+  }
+  return nullptr;
+}
+
+void test_callback(void *) {
+  shared = 0;
+
+  pthread_mutex_init(&mtx, nullptr);
+
+  const int kThreads = 4;
+  pthread_t threads[kThreads];
+
+  for (int i = 0; i < kThreads; i++)
+    pthread_create(&threads[i], nullptr, thread_func, nullptr);
+
+  for (int i = 0; i < kThreads; i++)
+    pthread_join(threads[i], nullptr);
+
+  assert(shared == kThreads * 10);
+
+  pthread_mutex_destroy(&mtx);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-NOT: WARNING: ThreadSanitizer: data race
diff --git a/compiler-rt/test/tsan/simulate_nested_create.cpp b/compiler-rt/test/tsan/simulate_nested_create.cpp
new file mode 100644
index 0000000000000..3bb696fbc24f6
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_nested_create.cpp
@@ -0,0 +1,65 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 %run %t 2>&1 | FileCheck %s
+//
+// Test threads creating other threads (nested thread creation).
+// Verifies that thread tracking handles hierarchical thread creation.
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+int counter = 0;
+
+void *level3_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  counter++;
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void *level2_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  counter++;
+  pthread_mutex_unlock(&mutex);
+
+  // Level 2 creates Level 3
+  pthread_t t;
+  pthread_create(&t, nullptr, level3_func, nullptr);
+  pthread_join(t, nullptr);
+
+  return nullptr;
+}
+
+void *level1_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  counter++;
+  pthread_mutex_unlock(&mutex);
+
+  // Level 1 creates Level 2
+  pthread_t t;
+  pthread_create(&t, nullptr, level2_func, nullptr);
+  pthread_join(t, nullptr);
+
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  counter = 0;
+  pthread_mutex_init(&mutex, nullptr);
+
+  // Main creates Level 1
+  pthread_t t;
+  pthread_create(&t, nullptr, level1_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_mutex_destroy(&mutex);
+
+  assert(counter == 3);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp b/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
new file mode 100644
index 0000000000000..3068eba4fb3d7
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
@@ -0,0 +1,38 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=1000 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-prob100
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_schedule_probability=0:simulate_iterations=1000 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-prob0
+
+// Standard TSAN rarely detects the race below; simulation finds it quickly.
+
+#include <atomic>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+std::atomic<int> d{};
+int a = 0;
+
+void *thread_func(void *arg) {
+  ++d;
+  ++a;
+  ++d;
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-prob100: ThreadSanitizer: simulation starting
+// CHECK-prob100: WARNING: ThreadSanitizer: data race
+// CHECK-prob100: Write of size 4
+// CHECK-prob100: Previous write of size 4
+
+// CHECK-prob0: ThreadSanitizer: simulation starting
+// CHECK-prob0-NOT: WARNING: ThreadSanitizer: data race
diff --git a/compiler-rt/test/tsan/simulate_probability.cpp b/compiler-rt/test/tsan/simulate_probability.cpp
new file mode 100644
index 0000000000000..cc4b0b2e83e0b
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_probability.cpp
@@ -0,0 +1,47 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10:simulate_probability=0.5 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-PROB50
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10:simulate_probability=1.0 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-PROB100
+//
+// This is a basic functional test that the parameter is accepted; the test
+// does not validate the resulting scheduling probabilities.
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+int counter = 0;
+
+void *thread_func(void *arg) {
+  for (int i = 0; i < 10; i++) {
+    pthread_mutex_lock(&mutex);
+    counter++;
+    pthread_mutex_unlock(&mutex);
+  }
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  counter = 0;
+  pthread_mutex_init(&mutex, nullptr);
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+
+  pthread_mutex_destroy(&mutex);
+
+  assert(counter == 20);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-PROB50: ThreadSanitizer: simulation starting
+// CHECK-PROB50: ThreadSanitizer: simulation finished
+
+// CHECK-PROB100: ThreadSanitizer: simulation starting
+// CHECK-PROB100: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_race_basic.cpp b/compiler-rt/test/tsan/simulate_race_basic.cpp
new file mode 100644
index 0000000000000..ebecb7d76bdbc
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_race_basic.cpp
@@ -0,0 +1,35 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 not %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+int shared_var = 0;
+
+void *thread_func(void *arg) {
+  for (int i = 0; i < 10; i++) {
+    shared_var++; // RACE: no synchronization
+  }
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  shared_var = 0;
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+
+  // The final value may be wrong because of the race, but that is not the
+  // point: this test checks that TSAN detects the race.
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: WARNING: ThreadSanitizer: data race
+// CHECK: ThreadSanitizer: data race detected at iteration
+// CHECK: ThreadSanitizer: simulation stopped due to race detection
diff --git a/compiler-rt/test/tsan/simulate_rare_race.cpp b/compiler-rt/test/tsan/simulate_rare_race.cpp
new file mode 100644
index 0000000000000..920aaf3fa1a94
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_rare_race.cpp
@@ -0,0 +1,77 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=1000 not %run %t 2>&1 | FileCheck %s
+
+// Test based on rare_ref.cpp from https://github.com/NVIDIA/stdexec/pull/1395
+// A race condition involving reference counting where two threads both access
+// a non-atomic variable after decrementing the reference count.
+// Standard TSAN rarely detects this race; simulation finds it quickly.
+
+#include <atomic>
+#include <condition_variable>
+#include <mutex>
+#include <pthread.h>
+#include <vector>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+struct TestData {
+  std::mutex mtx;
+  std::condition_variable cv;
+  int x = 0;
+  std::atomic<int> ref{2};
+  std::atomic<int> *value = new std::atomic<int>{0};
+  int non_atomic = 0; // Race target
+};
+
+static void *thread1_func(void *arg) {
+  TestData *data = (TestData *)arg;
+
+  {
+    std::unique_lock<std::mutex> lg(data->mtx);
+    data->x = 1;
+    data->cv.notify_one();
+  }
+
+  int new_ref_count = data->ref.fetch_sub(1) - 1;
+  if (new_ref_count == 0) {
+    delete data->value;
+  }
+
+  data->non_atomic += 1; // Race here
+  return nullptr;
+}
+
+static void *thread2_func(void *arg) {
+  TestData *data = (TestData *)arg;
+
+  {
+    std::unique_lock<std::mutex> lg(data->mtx);
+    data->cv.wait(lg, [&] { return data->x != 0; });
+  }
+
+  int new_ref_count = data->ref.fetch_sub(1) - 1;
+  if (new_ref_count == 1) {
+    data->non_atomic += 1; // Race here
+  }
+
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  TestData data;
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread1_func, &data);
+  pthread_create(&t2, nullptr, thread2_func, &data);
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+
+  delete data.value;
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: WARNING: ThreadSanitizer: data race
+// CHECK: Write of size 4
+// CHECK: Previous write of size 4
diff --git a/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp b/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
new file mode 100644
index 0000000000000..49066c18b9d1a
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
@@ -0,0 +1,28 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 %run %t 2>&1 | FileCheck %s
+
+#include <atomic>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+void *thread_func(void *arg) { return nullptr; }
+
+void test_callback(void *arg) {
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+  pthread_join(t1, nullptr);
+
+  // Verify that simulation scheduling between joins does not allow two
+  // threads to run in parallel (checked by internal assertions). Only one
+  // thread may run at any given time under the simulation scheduler.
+  std::atomic<int> a{};
+  ++a;
+
+  pthread_join(t2, nullptr);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
diff --git a/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp b/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
new file mode 100644
index 0000000000000..abb35758036cc
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+#include <stdlib.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+pthread_rwlock_t rwlock;
+
+void *thread_func(void *arg) {
+  pthread_rwlock_rdlock(&rwlock);
+  pthread_rwlock_unlock(&rwlock);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_rwlock_init(&rwlock, nullptr);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_rwlock_destroy(&rwlock);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: pthread_rwlock_rdlock
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
diff --git a/compiler-rt/test/tsan/simulate_sleep.cpp b/compiler-rt/test/tsan/simulate_sleep.cpp
new file mode 100644
index 0000000000000..9598003619e62
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_sleep.cpp
@@ -0,0 +1,24 @@
+// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" not %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+void *thread_func(void *arg) {
+  usleep(1000);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: usleep
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
diff --git a/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp b/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
new file mode 100644
index 0000000000000..9598003619e62
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
@@ -0,0 +1,24 @@
+// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" not %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+void *thread_func(void *arg) {
+  usleep(1000);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: usleep
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
diff --git a/compiler-rt/test/tsan/simulate_spinlock.cpp b/compiler-rt/test/tsan/simulate_spinlock.cpp
new file mode 100644
index 0000000000000..77072385355cd
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_spinlock.cpp
@@ -0,0 +1,34 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+//
+// pthread_spin_* functions are not available on Apple
+// UNSUPPORTED: darwin
+
+#include <pthread.h>
+#include <stdlib.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+pthread_spinlock_t spinlock;
+
+void *thread_func(void *arg) {
+  pthread_spin_lock(&spinlock);
+  pthread_spin_unlock(&spinlock);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_spin_init(&spinlock, PTHREAD_PROCESS_PRIVATE);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_spin_destroy(&spinlock);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: pthread_spin_lock
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: unsupported interceptor at iteration 0
diff --git a/compiler-rt/test/tsan/simulate_start_iteration.cpp b/compiler-rt/test/tsan/simulate_start_iteration.cpp
new file mode 100644
index 0000000000000..3caab46181077
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_start_iteration.cpp
@@ -0,0 +1,42 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_start_iteration=5:simulate_iterations=1 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-ITER5
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_start_iteration=42:simulate_iterations=3 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-ITER42
+
+#include <pthread.h>
+#include <stdio.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+int counter = 0;
+
+void *thread_func(void *arg) {
+  pthread_mutex_lock(&mutex);
+  counter++;
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  fprintf(stderr, "test_callback running\n");
+  counter = 0;
+  pthread_mutex_init(&mutex, nullptr);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_mutex_destroy(&mutex);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK-ITER5: ThreadSanitizer: simulation starting (iterations 5..5
+// CHECK-ITER5: test_callback running
+// CHECK-ITER5-NOT: test_callback running
+
+// CHECK-ITER42: ThreadSanitizer: simulation starting (iterations 42..44
+// CHECK-ITER42: test_callback running
+// CHECK-ITER42: test_callback running
+// CHECK-ITER42: test_callback running
+// CHECK-ITER42-NOT: test_callback running
diff --git a/compiler-rt/test/tsan/simulate_stress_condvar.cpp b/compiler-rt/test/tsan/simulate_stress_condvar.cpp
new file mode 100644
index 0000000000000..b75f5c6e73b4b
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_stress_condvar.cpp
@@ -0,0 +1,56 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=5 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+pthread_cond_t condvar;
+int ready = 0;
+int workers_done = 0;
+
+void *worker_thread(void *arg) {
+  pthread_mutex_lock(&mutex);
+
+  while (!ready) {
+    pthread_cond_wait(&condvar, &mutex);
+  }
+
+  workers_done++;
+  pthread_mutex_unlock(&mutex);
+
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  ready = 0;
+  workers_done = 0;
+  pthread_mutex_init(&mutex, nullptr);
+  pthread_cond_init(&condvar, nullptr);
+
+  const int num_workers = 3;
+  pthread_t threads[num_workers];
+
+  for (int i = 0; i < num_workers; i++)
+    pthread_create(&threads[i], nullptr, worker_thread, nullptr);
+
+  pthread_mutex_lock(&mutex);
+  ready = 1;
+  pthread_cond_broadcast(&condvar);
+  pthread_mutex_unlock(&mutex);
+
+  for (int i = 0; i < num_workers; i++)
+    pthread_join(threads[i], nullptr);
+
+  pthread_cond_destroy(&condvar);
+  pthread_mutex_destroy(&mutex);
+
+  assert(workers_done == num_workers);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_stress_mutex.cpp b/compiler-rt/test/tsan/simulate_stress_mutex.cpp
new file mode 100644
index 0000000000000..b5332e5902993
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_stress_mutex.cpp
@@ -0,0 +1,44 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=20 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+pthread_mutex_t mutex;
+int counter = 0;
+
+void *thread_func(void *arg) {
+  for (int i = 0; i < 50; i++) {
+    pthread_mutex_lock(&mutex);
+    counter++;
+    pthread_mutex_unlock(&mutex);
+  }
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  counter = 0;
+  pthread_mutex_init(&mutex, nullptr);
+
+  const int num_threads = 8;
+  pthread_t threads[num_threads];
+
+  for (int i = 0; i < num_threads; i++) {
+    pthread_create(&threads[i], nullptr, thread_func, nullptr);
+  }
+
+  for (int i = 0; i < num_threads; i++) {
+    pthread_join(threads[i], nullptr);
+  }
+
+  pthread_mutex_destroy(&mutex);
+
+  assert(counter == num_threads * 50);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: ThreadSanitizer: simulation finished
diff --git a/compiler-rt/test/tsan/simulate_thread_detection.cpp b/compiler-rt/test/tsan/simulate_thread_detection.cpp
new file mode 100644
index 0000000000000..bad65b032a039
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_thread_detection.cpp
@@ -0,0 +1,43 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=simulate_scheduler=random:simulate_iterations=2 %run %t 2>&1 | FileCheck %s
+
+#include "test.h"
+#include <assert.h>
+#include <atomic>
+#include <stdio.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+std::atomic<bool> keep_running(true);
+
+void *background_thread(void *arg) {
+  while (keep_running.load(std::memory_order_relaxed)) {
+    usleep(10000);
+  }
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  long test_case = (long)arg;
+  fprintf(stderr, "test_callback test_case=%ld\n", test_case);
+}
+
+int main() {
+  pthread_t bg;
+  pthread_create(&bg, nullptr, background_thread, nullptr);
+
+  assert(__tsan_simulate(test_callback, (void *)1) != 0);
+
+  keep_running.store(false, std::memory_order_relaxed);
+  pthread_join(bg, nullptr);
+
+  assert(__tsan_simulate(test_callback, (void *)2) == 0);
+  return 0;
+}
+
+// CHECK: ThreadSanitizer: simulation cannot start - other threads are running
+// CHECK: Simulation requires that only the calling thread exists
+// CHECK-NOT: test_callback test_case=1
+// CHECK: ThreadSanitizer: simulation starting (iterations 0..1
+// CHECK: test_callback test_case=2
+// CHECK: ThreadSanitizer: simulation exiting - no threads were spawned
diff --git a/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp b/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
new file mode 100644
index 0000000000000..f955c91f3c4e1
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
@@ -0,0 +1,47 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=5 %run %t 2>&1 | FileCheck %s
+//
+// Test thread_local object destruction.
+// Verifies that thread_local destructors are called and can safely decrement atomics.
+
+#include <assert.h>
+#include <atomic>
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+
+std::atomic<int> ctor_count(0);
+std::atomic<int> dtor_count(0);
+
+class ThreadLocalObject {
+public:
+  ThreadLocalObject() { ctor_count.fetch_add(1, std::memory_order_relaxed); }
+
+  ~ThreadLocalObject() { dtor_count.fetch_add(1, std::memory_order_relaxed); }
+};
+
+void *thread_func(void *arg) {
+  // Access thread_local variable to trigger construction
+  thread_local ThreadLocalObject obj;
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  ctor_count.store(0, std::memory_order_relaxed);
+  dtor_count.store(0, std::memory_order_relaxed);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  assert(ctor_count.load(std::memory_order_relaxed) == 3);
+  assert(dtor_count.load(std::memory_order_relaxed) == 3);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation starting
diff --git a/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp b/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
new file mode 100644
index 0000000000000..83dd939bf75f8
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
@@ -0,0 +1,40 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+//
+// pthread_mutex_timedlock is not available on Apple
+// UNSUPPORTED: darwin
+
+#include <pthread.h>
+#include <stdlib.h>
+#include <time.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+pthread_mutex_t mutex;
+
+void *thread_func(void *arg) {
+  // This should trigger the unsupported interceptor error
+  struct timespec ts;
+  clock_gettime(CLOCK_REALTIME, &ts);
+  ts.tv_sec += 1; // 1 second timeout
+
+  pthread_mutex_timedlock(&mutex, &ts);
+  pthread_mutex_unlock(&mutex);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_mutex_init(&mutex, nullptr);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_mutex_destroy(&mutex);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: pthread_mutex_timedlock
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
diff --git a/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp b/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp
new file mode 100644
index 0000000000000..aa18b73d1ec7c
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp
@@ -0,0 +1,32 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+
+#include <pthread.h>
+#include <stdlib.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+
+pthread_rwlock_t rwlock;
+
+void *thread_func(void *arg) {
+  // This should trigger the unsupported interceptor error
+  pthread_rwlock_rdlock(&rwlock);
+  pthread_rwlock_unlock(&rwlock);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_rwlock_init(&rwlock, nullptr);
+
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+
+  pthread_rwlock_destroy(&rwlock);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: pthread_rwlock_rdlock
+// CHECK: Simulation does not support this synchronization primitive
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
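
Editorial note (not part of the patch): every test above declares __tsan_simulate unconditionally, so it links only against a runtime that provides that symbol. The sketch below shows one way a test might fall back to a single ordinary run when the symbol is absent. It assumes an ELF target and a GCC- or Clang-compatible compiler, where an undefined weak function resolves to a null address; run_maybe_simulated is a hypothetical helper name. The return convention (0 on success) is inferred from the tests above, not from a documented contract.

```cpp
#include <cassert>
#include <pthread.h>

// Weak reference to the API proposed in this patch. Assumption: on ELF
// targets an undefined weak function resolves to a null address, so this
// file still links when the runtime lacks simulation support.
extern "C" int __tsan_simulate(void (*callback)(void *), void *arg)
    __attribute__((weak));

// Hypothetical helper, not part of the patch: simulate when the runtime
// provides __tsan_simulate, otherwise run the callback once.
static int run_maybe_simulated(void (*callback)(void *), void *arg) {
  if (__tsan_simulate)
    return __tsan_simulate(callback, arg);
  callback(arg); // Fallback: one ordinary, unsimulated run.
  return 0;
}

static pthread_mutex_t mtx;
static int counter = 0;

static void *thread_func(void *) {
  for (int i = 0; i < 10; i++) {
    pthread_mutex_lock(&mtx);
    counter++;
    pthread_mutex_unlock(&mtx);
  }
  return nullptr;
}

// Resets all shared state on entry, because the simulator may invoke the
// callback many times.
static void test_callback(void *) {
  counter = 0;
  pthread_mutex_init(&mtx, nullptr);

  pthread_t t1, t2;
  pthread_create(&t1, nullptr, thread_func, nullptr);
  pthread_create(&t2, nullptr, thread_func, nullptr);
  pthread_join(t1, nullptr);
  pthread_join(t2, nullptr);

  pthread_mutex_destroy(&mtx);
  assert(counter == 20);
}
```

A driver would then be `int main() { return run_maybe_simulated(test_callback, nullptr); }`. The weak reference keeps the test source identical across toolchains; whether that is preferable to a build-time feature check is a design choice this patch does not address.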

>From 1b18793d55647f34ada54281be59e49ef90e19e6 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Wed, 25 Feb 2026 02:27:06 +0000
Subject: [PATCH 02/13] Improve tests

---
 clang/test/Driver/fsanitize.c                | 13 ++++++++
 compiler-rt/test/tsan/simulate_rare_race.cpp | 27 ++++++++--------
 compiler-rt/test/tsan/simulate_wrap_main.cpp | 34 ++++++++++++++++++++
 3 files changed, 61 insertions(+), 13 deletions(-)
 create mode 100644 compiler-rt/test/tsan/simulate_wrap_main.cpp

diff --git a/clang/test/Driver/fsanitize.c b/clang/test/Driver/fsanitize.c
index f6a82d899d5bf..eb11d30f234be 100644
--- a/clang/test/Driver/fsanitize.c
+++ b/clang/test/Driver/fsanitize.c
@@ -321,6 +321,19 @@
 // RUN: %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-atomics -fno-sanitize-thread-atomics %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-ATOMICS-BOTH-OFF
 // CHECK-TSAN-ATOMICS-BOTH-OFF: -cc1{{.*}}tsan-instrument-atomics=0
 
+// RUN: %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN
+// CHECK-TSAN-SIMULATE-MAIN-NOT: error:
+// CHECK-TSAN-SIMULATE-MAIN-NOT: unsupported option
+
+// RUN: not %clang --target=x86_64-apple-darwin -fsanitize=thread -fsanitize-thread-simulate-main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-DARWIN
+// CHECK-TSAN-SIMULATE-MAIN-DARWIN: error: unsupported option '-fsanitize-thread-simulate-main' for target 'x86_64-apple-darwin'
+
+// RUN: not %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main -Wl,--wrap=main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-WRAP
+// CHECK-TSAN-SIMULATE-MAIN-WRAP: error: invalid argument '-fsanitize-thread-simulate-main' not allowed with '-Wl,--wrap=main'
+
+// RUN: not %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main -Xlinker --wrap=main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-XLINKER
+// CHECK-TSAN-SIMULATE-MAIN-XLINKER: error: invalid argument '-fsanitize-thread-simulate-main' not allowed with '-Xlinker --wrap=main'
+
 // RUN: not %clang --target=x86_64-apple-darwin10 -mmacos-version-min=10.8 -fsanitize=vptr %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-VPTR-DARWIN-OLD
 // CHECK-VPTR-DARWIN-OLD: unsupported option '-fsanitize=vptr' for target 'x86_64-apple-darwin10'
 
diff --git a/compiler-rt/test/tsan/simulate_rare_race.cpp b/compiler-rt/test/tsan/simulate_rare_race.cpp
index 920aaf3fa1a94..dbf11d2bf4de3 100644
--- a/compiler-rt/test/tsan/simulate_rare_race.cpp
+++ b/compiler-rt/test/tsan/simulate_rare_race.cpp
@@ -7,16 +7,13 @@
 // Standard TSAN rarely detects this race; simulation finds it quickly.
 
 #include <atomic>
-#include <condition_variable>
-#include <mutex>
 #include <pthread.h>
-#include <vector>
 
 extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
 
 struct TestData {
-  std::mutex mtx;
-  std::condition_variable cv;
+  pthread_mutex_t mtx;
+  pthread_cond_t cv;
   int x = 0;
   std::atomic<int> ref{2};
   std::atomic<int> *value = new std::atomic<int>{0};
@@ -26,11 +23,10 @@ struct TestData {
 static void *thread1_func(void *arg) {
   TestData *data = (TestData *)arg;
 
-  {
-    std::unique_lock<std::mutex> lg(data->mtx);
-    data->x = 1;
-    data->cv.notify_one();
-  }
+  pthread_mutex_lock(&data->mtx);
+  data->x = 1;
+  pthread_cond_signal(&data->cv);
+  pthread_mutex_unlock(&data->mtx);
 
   int new_ref_count = data->ref.fetch_sub(1) - 1;
   if (new_ref_count == 0) {
@@ -44,10 +40,11 @@ static void *thread1_func(void *arg) {
 static void *thread2_func(void *arg) {
   TestData *data = (TestData *)arg;
 
-  {
-    std::unique_lock<std::mutex> lg(data->mtx);
-    data->cv.wait(lg, [&] { return data->x != 0; });
+  pthread_mutex_lock(&data->mtx);
+  while (data->x == 0) {
+    pthread_cond_wait(&data->cv, &data->mtx);
   }
+  pthread_mutex_unlock(&data->mtx);
 
   int new_ref_count = data->ref.fetch_sub(1) - 1;
   if (new_ref_count == 1) {
@@ -59,6 +56,8 @@ static void *thread2_func(void *arg) {
 
 void test_callback(void *arg) {
   TestData data;
+  pthread_mutex_init(&data.mtx, nullptr);
+  pthread_cond_init(&data.cv, nullptr);
 
   pthread_t t1, t2;
   pthread_create(&t1, nullptr, thread1_func, &data);
@@ -67,6 +66,8 @@ void test_callback(void *arg) {
   pthread_join(t2, nullptr);
 
   delete data.value;
+  pthread_mutex_destroy(&data.mtx);
+  pthread_cond_destroy(&data.cv);
 }
 
 int main() { return __tsan_simulate(test_callback, nullptr); }
diff --git a/compiler-rt/test/tsan/simulate_wrap_main.cpp b/compiler-rt/test/tsan/simulate_wrap_main.cpp
new file mode 100644
index 0000000000000..f3dacbf05eb81
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_wrap_main.cpp
@@ -0,0 +1,34 @@
+// RUN: %clangxx_tsan -O1 %s -fsanitize-thread-simulate-main -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 not %run %t 2>&1 | FileCheck %s
+//
+// REQUIRES: linux
+
+#include <atomic>
+#include <pthread.h>
+
+std::atomic<int> d{};
+int a = 0;
+
+void *thread_func(void *arg) {
+  ++d;
+  ++a; // Data race!
+  ++d;
+  return nullptr;
+}
+
+// Note: NO call to __tsan_simulate() - the -fsanitize-thread-simulate-main
+// flag automatically wraps this main() to run under simulation.
+int main() {
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, nullptr);
+  pthread_join(t1, nullptr);
+  pthread_join(t2, nullptr);
+  return 0;
+}
+
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: WARNING: ThreadSanitizer: data race
+// CHECK: Write of size 4
+// CHECK: Previous write of size 4
+// CHECK: ThreadSanitizer: data race detected at iteration

>From cc4ac13b349bac88f3bddd2268c2c43e3cdcc729 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Fri, 12 Jun 2026 23:53:41 -0400
Subject: [PATCH 03/13] Docs Feedback, remove wrap main feature

---
 clang/docs/ThreadSanitizer.rst               | 33 ++++++++++---------
 clang/include/clang/Driver/SanitizerArgs.h   |  2 --
 clang/include/clang/Options/Options.td       |  4 ---
 clang/lib/Driver/SanitizerArgs.cpp           | 27 ----------------
 clang/lib/Driver/ToolChains/Gnu.cpp          |  8 -----
 clang/test/Driver/fsanitize.c                | 13 --------
 compiler-rt/lib/tsan/rtl/tsan_interface.cpp  | 33 -------------------
 compiler-rt/test/tsan/simulate_wrap_main.cpp | 34 --------------------
 8 files changed, 17 insertions(+), 137 deletions(-)
 delete mode 100644 compiler-rt/test/tsan/simulate_wrap_main.cpp

diff --git a/clang/docs/ThreadSanitizer.rst b/clang/docs/ThreadSanitizer.rst
index a010a8d8063df..9f45dec44e5e7 100644
--- a/clang/docs/ThreadSanitizer.rst
+++ b/clang/docs/ThreadSanitizer.rst
@@ -334,7 +334,7 @@ Overview
 ~~~~~~~~
 
 The Simulation Scheduler is an optional ThreadSanitizer feature that enables
-systematic exploration of thread interleavings to expose data races that may be
+random exploration of thread interleavings to expose data races that may be
 difficult to trigger in normal execution. Unlike standard ThreadSanitizer which
 detects races as they occur naturally during program execution, the simulation
 scheduler takes control of thread scheduling to deliberately explore different
@@ -343,8 +343,9 @@ execution orderings.
 Simulation is particularly useful for:
 
 * Testing concurrent data structure or algorithms during development to ensure
-  correctness (for example, a lock free queue).
-* Finding races in rarely-executed interleavings that standard TSAN may miss
+  correctness (for example, a lock-free queue).
+* Finding races in rarely-executed interleavings that standard ThreadSanitizer
+  may miss
 * Reproducing specific race conditions deterministically
 
 Simulation is not useful for running full applications, and will likely not
@@ -390,16 +391,15 @@ Then compile with ThreadSanitizer and enable the simulation scheduler:
   $ TSAN_OPTIONS=simulate_scheduler=random ./a.out
   ThreadSanitizer: simulation starting (iterations 0..999, max_depth=10000, scheduler=random)
 
-Automatic Main Wrapping
-~~~~~~~~~~~~~~~~~~~~~~~~
+Wrapping Main for Simulation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-For convenience, the ``-fsanitize-thread-simulate-main`` compiler flag
-automatically wraps ``main()`` to call ``__tsan_simulate()``, eliminating the
-need to manually modify code:
+To avoid modifying existing test code, you can use the linker's ``--wrap=main``
+option to automatically run ``main()`` under simulation.
 
 .. code-block:: c
 
-    // No need to call __tsan_simulate() manually
+    // mytest.c
     void *thread_func(void *arg) { /* ... */ }
 
     int main() {
@@ -412,17 +412,18 @@ need to manually modify code:
       return 0;
     }
 
-Compile and run:
+Compile and link with ``-Wl,--wrap=main``:
 
 .. code-block:: console
 
-  $ clang -fsanitize=thread -fsanitize-thread-simulate-main -g -O1 mytest.c
+  $ clang -fsanitize=thread -g -O1 mytest.c -Wl,--wrap=main
   $ TSAN_OPTIONS=simulate_scheduler=random ./a.out
   ThreadSanitizer: simulation starting (iterations 0..999, max_depth=10000, scheduler=random)
 
-**Platform Support**: This flag requires GNU ld linker support for ``--wrap=main``
-and is currently only supported on Linux. Do not manually specify ``-Wl,--wrap=main``
-when using this flag, as the compiler handles the wrapping automatically.
+The ``--wrap`` option is supported by GNU ld and lld but not by all linkers
+(notably, the macOS linker does not support it). With ``--wrap=main``, the
+linker resolves references to ``main`` to ``__wrap_main`` and makes the
+original ``main`` available as ``__real_main``.
 
 Configuration Options
 ~~~~~~~~~~~~~~~~~~~~~
@@ -482,7 +483,7 @@ Configuration Options
 Examples
 ~~~~~~~~
 
-Basic race detection that standard TSAN rarely finds:
+Basic race detection that standard ThreadSanitizer rarely finds:
 
 .. code-block:: c
 
@@ -512,7 +513,7 @@ Basic race detection that standard TSAN rarely finds:
 
     int main() { return __tsan_simulate(test_callback, NULL); }
 
-Standard TSAN execution rarely detects this race. Running 100 times produces no
+Standard ThreadSanitizer execution rarely detects this race. Running 100 times produces no
 output most of the time:
 
 .. code-block:: console
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index b1b7c2b30f971..d4ee17802fd8e 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -68,7 +68,6 @@ class SanitizerArgs {
   bool TsanMemoryAccess = true;
   bool TsanFuncEntryExit = true;
   bool TsanAtomics = true;
-  bool TsanSimulateMain = false;
   bool MinimalRuntime = false;
   bool TrapLoop = false;
   bool TysanOutlineInstrumentation = true;
@@ -104,7 +103,6 @@ class SanitizerArgs {
   }
   bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
-  bool needsTsanSimulateMain() const { return TsanSimulateMain; }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
   bool needsLsanRt() const {
diff --git a/clang/include/clang/Options/Options.td b/clang/include/clang/Options/Options.td
index 9567f0e754934..b4447fcc04120 100644
--- a/clang/include/clang/Options/Options.td
+++ b/clang/include/clang/Options/Options.td
@@ -2917,10 +2917,6 @@ def fno_sanitize_thread_atomics : Flag<["-"], "fno-sanitize-thread-atomics">,
                                   Group<f_clang_Group>,
                                   Visibility<[ClangOption, CLOption]>,
                                   HelpText<"Disable atomic operations instrumentation in ThreadSanitizer">;
-def fsanitize_thread_simulate_main
-    : Flag<["-"], "fsanitize-thread-simulate-main">,
-      Group<f_clang_Group>,
-      HelpText<"Wrap main() to run under ThreadSanitizer simulation mode">;
 def fsanitize_undefined_strip_path_components_EQ : Joined<["-"], "fsanitize-undefined-strip-path-components=">,
   Group<f_clang_Group>, MetaVarName<"<number>">,
   HelpText<"Strip (or keep only, if negative) a given number of path components "
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index 9b4a594fd4f48..74ebd0bf375d3 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -927,33 +927,6 @@ SanitizerArgs::SanitizerArgs(const ToolChain &TC,
     TsanAtomics =
         Args.hasFlag(options::OPT_fsanitize_thread_atomics,
                      options::OPT_fno_sanitize_thread_atomics, TsanAtomics);
-    TsanSimulateMain = Args.hasArg(options::OPT_fsanitize_thread_simulate_main);
-
-    // -fsanitize-thread-simulate-main requires --wrap=main linker support,
-    // which is only available on Linux with GNU ld.
-    if (TsanSimulateMain && DiagnoseErrors && !TC.getTriple().isOSLinux()) {
-      D.Diag(diag::err_drv_unsupported_opt_for_target)
-          << "-fsanitize-thread-simulate-main" << TC.getTriple().str();
-      TsanSimulateMain = false;
-    }
-
-    // Check for conflicting -Wl,--wrap=main when using
-    // -fsanitize-thread-simulate-main
-    if (TsanSimulateMain && DiagnoseErrors) {
-      for (const Arg *A :
-           Args.filtered(options::OPT_Wl_COMMA, options::OPT_Xlinker)) {
-        for (StringRef Val : A->getValues()) {
-          if (Val == "--wrap=main" || Val == "-wrap=main") {
-            D.Diag(diag::err_drv_argument_not_allowed_with)
-                << "-fsanitize-thread-simulate-main"
-                << (A->getOption().matches(options::OPT_Wl_COMMA)
-                        ? "-Wl,--wrap=main"
-                        : "-Xlinker --wrap=main");
-            break;
-          }
-        }
-      }
-    }
   }
 
   if (AllAddedKinds & SanitizerKind::CFI) {
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 4e1b2b39eaf6d..d0579ebdd109b 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -20,7 +20,6 @@
 #include "clang/Driver/Compilation.h"
 #include "clang/Driver/Driver.h"
 #include "clang/Driver/MultilibBuilder.h"
-#include "clang/Driver/SanitizerArgs.h"
 #include "clang/Driver/Tool.h"
 #include "clang/Driver/ToolChain.h"
 #include "clang/Options/Options.h"
@@ -448,13 +447,6 @@ void tools::gnutools::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   bool NeedsSanitizerDeps = addSanitizerRuntimes(ToolChain, Args, CmdArgs);
   bool NeedsXRayDeps = addXRayRuntime(ToolChain, Args, CmdArgs);
 
-  // Add --wrap=main for ThreadSanitizer simulation mode
-  if (NeedsSanitizerDeps) {
-    const SanitizerArgs &SanArgs = ToolChain.getSanitizerArgs(Args);
-    if (SanArgs.needsTsanRt() && SanArgs.needsTsanSimulateMain())
-      CmdArgs.push_back("--wrap=main");
-  }
-
   addLinkerCompressDebugSectionsOption(ToolChain, Args, CmdArgs);
   AddLinkerInputs(ToolChain, Inputs, Args, CmdArgs, JA);
 
diff --git a/clang/test/Driver/fsanitize.c b/clang/test/Driver/fsanitize.c
index eb11d30f234be..f6a82d899d5bf 100644
--- a/clang/test/Driver/fsanitize.c
+++ b/clang/test/Driver/fsanitize.c
@@ -321,19 +321,6 @@
 // RUN: %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-atomics -fno-sanitize-thread-atomics %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-ATOMICS-BOTH-OFF
 // CHECK-TSAN-ATOMICS-BOTH-OFF: -cc1{{.*}}tsan-instrument-atomics=0
 
-// RUN: %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN
-// CHECK-TSAN-SIMULATE-MAIN-NOT: error:
-// CHECK-TSAN-SIMULATE-MAIN-NOT: unsupported option
-
-// RUN: not %clang --target=x86_64-apple-darwin -fsanitize=thread -fsanitize-thread-simulate-main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-DARWIN
-// CHECK-TSAN-SIMULATE-MAIN-DARWIN: error: unsupported option '-fsanitize-thread-simulate-main' for target 'x86_64-apple-darwin'
-
-// RUN: not %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main -Wl,--wrap=main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-WRAP
-// CHECK-TSAN-SIMULATE-MAIN-WRAP: error: invalid argument '-fsanitize-thread-simulate-main' not allowed with '-Wl,--wrap=main'
-
-// RUN: not %clang --target=x86_64-linux-gnu -fsanitize=thread -fsanitize-thread-simulate-main -Xlinker --wrap=main %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-TSAN-SIMULATE-MAIN-XLINKER
-// CHECK-TSAN-SIMULATE-MAIN-XLINKER: error: invalid argument '-fsanitize-thread-simulate-main' not allowed with '-Xlinker --wrap=main'
-
 // RUN: not %clang --target=x86_64-apple-darwin10 -mmacos-version-min=10.8 -fsanitize=vptr %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-VPTR-DARWIN-OLD
 // CHECK-VPTR-DARWIN-OLD: unsupported option '-fsanitize=vptr' for target 'x86_64-apple-darwin10'
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
index d1f1510a34e62..752851192e2e4 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
@@ -15,9 +15,7 @@
 #include "sanitizer_common/sanitizer_internal_defs.h"
 #include "sanitizer_common/sanitizer_ptrauth.h"
 #include "tsan_interface_ann.h"
-#include "tsan_platform.h"
 #include "tsan_rtl.h"
-#include "tsan_shadow.h"
 #include "tsan_simulate.h"
 
 #define CALLERPC ((uptr)__builtin_return_address(0))
@@ -92,37 +90,6 @@ int __tsan_simulate(void (*callback)(void* arg), void* arg) {
   return SimulateRun(callback, arg);
 }
 
-#if SANITIZER_LINUX
-// Support for -fsanitize-thread-simulate-main linker wrapping.
-// The --wrap linker feature is only available on GNU LD (Linux), not on macOS.
-extern "C" SANITIZER_WEAK_ATTRIBUTE int __real_main(int argc, char** argv,
-                                                    char** envp);
-
-namespace {
-struct MainArgs {
-  int argc;
-  char** argv;
-  char** envp;
-  int exit_code;
-};
-
-static void wrapped_main_callback(void* arg) {
-  MainArgs* args = static_cast<MainArgs*>(arg);
-  args->exit_code = __real_main(args->argc, args->argv, args->envp);
-}
-}  // namespace
-
-extern "C" int __wrap_main(int argc, char** argv, char** envp) {
-  MainArgs args = {argc, argv, envp, 0};
-  int sim_result = __tsan_simulate(wrapped_main_callback, &args);
-  // If simulation succeeded (return code 0 or exit due to no threads spawned),
-  // return the exit code from main. Otherwise, return the simulation error
-  // code.
-  if (sim_result == 0)
-    return args.exit_code;
-  return sim_result;
-}
-#endif  // SANITIZER_LINUX
 
 void __tsan_acquire(void *addr) {
   Acquire(cur_thread(), CALLERPC, (uptr)addr);
diff --git a/compiler-rt/test/tsan/simulate_wrap_main.cpp b/compiler-rt/test/tsan/simulate_wrap_main.cpp
deleted file mode 100644
index f3dacbf05eb81..0000000000000
--- a/compiler-rt/test/tsan/simulate_wrap_main.cpp
+++ /dev/null
@@ -1,34 +0,0 @@
-// RUN: %clangxx_tsan -O1 %s -fsanitize-thread-simulate-main -o %t
-// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=50 not %run %t 2>&1 | FileCheck %s
-//
-// REQUIRES: linux
-
-#include <atomic>
-#include <pthread.h>
-
-std::atomic<int> d{};
-int a = 0;
-
-void *thread_func(void *arg) {
-  ++d;
-  ++a; // Data race!
-  ++d;
-  return nullptr;
-}
-
-// Note: NO call to __tsan_simulate() - the -fsanitize-thread-simulate-main
-// flag automatically wraps this main() to run under simulation.
-int main() {
-  pthread_t t1, t2;
-  pthread_create(&t1, nullptr, thread_func, nullptr);
-  pthread_create(&t2, nullptr, thread_func, nullptr);
-  pthread_join(t1, nullptr);
-  pthread_join(t2, nullptr);
-  return 0;
-}
-
-// CHECK: ThreadSanitizer: simulation starting
-// CHECK: WARNING: ThreadSanitizer: data race
-// CHECK: Write of size 4
-// CHECK: Previous write of size 4
-// CHECK: ThreadSanitizer: data race detected at iteration

>From a0e644c59cd63da000b9318b20919dc5b2c2eabc Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sat, 13 Jun 2026 01:44:43 -0400
Subject: [PATCH 04/13] Replace sleep with schedule point

---
 .../lib/tsan/rtl/tsan_interceptors_posix.cpp  | 15 +++++++++---
 compiler-rt/test/tsan/simulate_sleep.cpp      | 19 ++++++++++-----
 .../test/tsan/simulate_sleep_unsupported.cpp  | 24 -------------------
 3 files changed, 25 insertions(+), 33 deletions(-)
 delete mode 100644 compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
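
Note on the new sleep behavior: under simulation, `sleep`, `usleep` and `nanosleep` no longer fail as unsupported. Each becomes a scheduling point and returns 0 without blocking. The toy model below shows only that control flow. `ToySimulator` and `ToySleep` are hypothetical stand-ins and not runtime code.

```cpp
#include <functional>

// Toy model, not the TSan runtime: every name here is hypothetical.
struct ToySimulator {
  bool active = false;
  int schedule_points = 0;
  // Stand-in for SimulateSchedule(): just counts how often it was reached.
  void Schedule() { ++schedule_points; }
};

// Mirrors the shape of the patched interceptors: while simulation is
// active, a sleep is a scheduling point and reports success immediately;
// otherwise the real sleep function is called.
unsigned ToySleep(ToySimulator &sim, unsigned sec,
                  const std::function<unsigned(unsigned)> &real_sleep) {
  if (sim.active) {
    sim.Schedule();
    return 0;
  }
  return real_sleep(sec);
}
```

One consequence is that code relying on a sleep to order two threads gets no such ordering under simulation, since the scheduler may pick any runnable thread at that point.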

diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index ba7c04030320f..5a533825bc02d 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -383,7 +383,10 @@ struct BlockingCall {
 
 TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
   SCOPED_TSAN_INTERCEPTOR(sleep, sec);
-  SIMULATE_CHECK_UNSUPPORTED(sleep);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateSchedule();
+    return 0;
+  }
   unsigned res = BLOCK_REAL(sleep)(sec);
   AfterSleep(thr, pc);
   return res;
@@ -391,7 +394,10 @@ TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
 
 TSAN_INTERCEPTOR(int, usleep, long_t usec) {
   SCOPED_TSAN_INTERCEPTOR(usleep, usec);
-  SIMULATE_CHECK_UNSUPPORTED(usleep);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateSchedule();
+    return 0;
+  }
   int res = BLOCK_REAL(usleep)(usec);
   AfterSleep(thr, pc);
   return res;
@@ -399,7 +405,10 @@ TSAN_INTERCEPTOR(int, usleep, long_t usec) {
 
 TSAN_INTERCEPTOR(int, nanosleep, void *req, void *rem) {
   SCOPED_TSAN_INTERCEPTOR(nanosleep, req, rem);
-  SIMULATE_CHECK_UNSUPPORTED(nanosleep);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateSchedule();
+    return 0;
+  }
   int res = BLOCK_REAL(nanosleep)(req, rem);
   AfterSleep(thr, pc);
   return res;
diff --git a/compiler-rt/test/tsan/simulate_sleep.cpp b/compiler-rt/test/tsan/simulate_sleep.cpp
index 9598003619e62..9d89eb36dbb96 100644
--- a/compiler-rt/test/tsan/simulate_sleep.cpp
+++ b/compiler-rt/test/tsan/simulate_sleep.cpp
@@ -1,13 +1,17 @@
-// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" not %run %t 2>&1 | FileCheck %s
+// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" %run %t 2>&1 | FileCheck %s
 
 #include <pthread.h>
-#include <stdlib.h>
+#include <stdio.h>
+#include <time.h>
 #include <unistd.h>
 
 extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
 
 void *thread_func(void *arg) {
+  sleep(1);
   usleep(1000);
+  struct timespec ts = {0, 1000000};
+  nanosleep(&ts, nullptr);
   return nullptr;
 }
 
@@ -17,8 +21,11 @@ void test_callback(void *arg) {
   pthread_join(t, nullptr);
 }
 
-int main() { return __tsan_simulate(test_callback, nullptr); }
+int main() {
+  int res = __tsan_simulate(test_callback, nullptr);
+  fprintf(stderr, "simulation result: %d\n", res);
+  return res;
+}
 
-// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: usleep
-// CHECK: Simulation does not support this synchronization primitive
-// CHECK: ThreadSanitizer: simulation aborted after 1 iterations
+// CHECK: ThreadSanitizer: simulation starting
+// CHECK: simulation result: 0
diff --git a/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp b/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
deleted file mode 100644
index 9598003619e62..0000000000000
--- a/compiler-rt/test/tsan/simulate_sleep_unsupported.cpp
+++ /dev/null
@@ -1,24 +0,0 @@
-// RUN: %clangxx_tsan -O1 %s -o %t && env TSAN_OPTIONS="simulate_scheduler=random:simulate_iterations=10" not %run %t 2>&1 | FileCheck %s
-
-#include <pthread.h>
-#include <stdlib.h>
-#include <unistd.h>
-
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
-
-void *thread_func(void *arg) {
-  usleep(1000);
-  return nullptr;
-}
-
-void test_callback(void *arg) {
-  pthread_t t;
-  pthread_create(&t, nullptr, thread_func, nullptr);
-  pthread_join(t, nullptr);
-}
-
-int main() { return __tsan_simulate(test_callback, nullptr); }
-
-// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: usleep
-// CHECK: Simulation does not support this synchronization primitive
-// CHECK: ThreadSanitizer: simulation aborted after 1 iterations

>From 31a36369ba65ebf3e48225070b66c6a7a76701cf Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sat, 13 Jun 2026 23:15:25 -0400
Subject: [PATCH 05/13] Remove simulate_schedule_on_memory_access

---
 clang/docs/ThreadSanitizer.rst               |  6 ------
 compiler-rt/lib/tsan/rtl/tsan_flags.inc      |  3 ---
 compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp | 13 -------------
 3 files changed, 22 deletions(-)
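
Note on test structure after this removal: plain memory accesses are no longer optional scheduling points, so interleavings are explored only at atomic, mutex/condvar and thread lifecycle operations. The sketch below follows the shape of the deleted `simulate_wrap_main.cpp` test, where a racy plain increment sits between two atomic operations. The `Shared` and `Worker` names are hypothetical, and whether a given race is found still depends on the random scheduler.

```cpp
#include <atomic>

struct Shared {
  std::atomic<int> d{0};  // atomic ops are scheduling points under simulation
  int a = 0;              // plain accesses are not scheduling points
};

// Intended to run on two threads under simulation. The plain increment of
// `a` is deliberately unsynchronized; the atomic increments around it give
// the scheduler a chance to switch threads before and after the racy access.
void Worker(Shared &s) {
  ++s.d;  // scheduling point
  ++s.a;  // data race if two threads run Worker concurrently
  ++s.d;  // scheduling point
}
```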

diff --git a/clang/docs/ThreadSanitizer.rst b/clang/docs/ThreadSanitizer.rst
index 9f45dec44e5e7..84bad01753a08 100644
--- a/clang/docs/ThreadSanitizer.rst
+++ b/clang/docs/ThreadSanitizer.rst
@@ -468,12 +468,6 @@ Configuration Options
        point. Lower values (e.g., 0) disable context switching, allowing threads
        to run more sequentially. Useful for comparing simulation results against
        sequential execution.
-   * - ``simulate_schedule_on_memory_access``
-     - bool
-     - false
-     - Insert scheduling points at every memory read/write during simulation for
-       maximum interleaving exploration. This can significantly increase overhead
-       but may expose additional races.
    * - ``simulate_print_schedule_stacks``
      - bool
      - false
diff --git a/compiler-rt/lib/tsan/rtl/tsan_flags.inc b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
index a250ca1bf21ff..699471371dc03 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_flags.inc
+++ b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
@@ -131,9 +131,6 @@ TSAN_FLAG(int, simulate_start_iteration, 0,
 TSAN_FLAG(int, simulate_max_depth, 10000,
           "Maximum scheduling depth per iteration. If exceeded, the "
           "simulation returns an error after the iteration completes")
-TSAN_FLAG(bool, simulate_schedule_on_memory_access, false,
-          "Insert scheduling points at every memory read/write during "
-          "simulation for maximum interleaving exploration.")
 TSAN_FLAG(int, simulate_schedule_probability, 100,
           "Probability (0-100%) of actually performing a context switch at "
           "each scheduling point. Lower values allow threads to complete more "
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
index a187c81a23570..b2e70475e0b73 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
@@ -12,7 +12,6 @@
 //===----------------------------------------------------------------------===//
 
 #include "tsan_rtl.h"
-#include "tsan_simulate.h"
 
 namespace __tsan {
 
@@ -424,10 +423,6 @@ ALWAYS_INLINE USED void MemoryAccess(ThreadState* thr, uptr pc, uptr addr,
   // Swift symbolizer can be intercepted and deadlock without this
   if (thr->in_symbolizer)
     return;
-#endif
-#if !SANITIZER_GO
-  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
-    SimulateSchedule();
 #endif
   RawShadow* shadow_mem = MemToShadow(addr);
   UNUSED char memBuf[4][64];
@@ -467,10 +462,6 @@ ALWAYS_INLINE USED void MemoryAccess16(ThreadState* thr, uptr pc, uptr addr,
   FastState fast_state = thr->fast_state;
   if (UNLIKELY(fast_state.GetIgnoreBit()))
     return;
-#if !SANITIZER_GO
-  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
-    SimulateSchedule();
-#endif
   Shadow cur(fast_state, 0, 8, typ);
   RawShadow* shadow_mem = MemToShadow(addr);
   bool traced = false;
@@ -508,10 +499,6 @@ ALWAYS_INLINE USED void UnalignedMemoryAccess(ThreadState* thr, uptr pc,
   FastState fast_state = thr->fast_state;
   if (UNLIKELY(fast_state.GetIgnoreBit()))
     return;
-#if !SANITIZER_GO
-  if (SimulateIsActive() && flags()->simulate_schedule_on_memory_access)
-    SimulateSchedule();
-#endif
   RawShadow* shadow_mem = MemToShadow(addr);
   bool traced = false;
   uptr size1 = Min<uptr>(size, RoundUp(addr + 1, kShadowCell) - addr);

>From 31fd0a0a74d0b5064c9b7422e4181c2af85cb2ba Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sat, 13 Jun 2026 23:39:58 -0400
Subject: [PATCH 06/13] format

---
 compiler-rt/lib/tsan/rtl/tsan_interface.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
index 752851192e2e4..f91e4873e7b6e 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
@@ -90,7 +90,6 @@ int __tsan_simulate(void (*callback)(void* arg), void* arg) {
   return SimulateRun(callback, arg);
 }
 
-
 void __tsan_acquire(void *addr) {
   Acquire(cur_thread(), CALLERPC, (uptr)addr);
 }

>From 83461046c51c20f2ad25fe722b020a6c7c25a3a9 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sun, 14 Jun 2026 01:09:44 -0400
Subject: [PATCH 07/13] Mark annotated mutexes as unsupported

---
 .../lib/tsan/rtl/tsan_interface_ann.cpp       | 17 +++++++++
 .../tsan/simulate_tsan_mutex_annotations.cpp  | 37 +++++++++++++++++++
 2 files changed, 54 insertions(+)
 create mode 100644 compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
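
Note on the failure mode: judging by the CHECK lines in the new test, an annotation called during simulation is reported and returns early, and the run is aborted only after the iteration finishes. The toy model below reproduces that ordering. `ToySimulation`, `ToyAnnotation` and `ToyRunIteration` are hypothetical names, not runtime code.

```cpp
#include <string>
#include <vector>

// Toy model only; none of these names exist in the TSan runtime.
struct ToySimulation {
  bool active = false;
  std::vector<std::string> unsupported;  // names reported this iteration

  void ReportUnsupported(const char *name) { unsupported.push_back(name); }
};

// Same shape as the patched annotations: report, then return early.
void ToyAnnotation(ToySimulation &sim, const char *name) {
  if (sim.active) {
    sim.ReportUnsupported(name);
    return;
  }
  // Outside simulation the real annotation logic would run here.
}

// One iteration: the callback runs to completion, and the iteration fails
// afterwards if anything unsupported was reported along the way.
int ToyRunIteration(ToySimulation &sim, void (*callback)(ToySimulation &)) {
  sim.active = true;
  sim.unsupported.clear();
  callback(sim);
  sim.active = false;
  return sim.unsupported.empty() ? 0 : -1;
}
```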

diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
index 490f6cb3bce01..4bc27488d20e7 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
@@ -22,6 +22,7 @@
 #include "tsan_platform.h"
 #include "tsan_report.h"
 #include "tsan_rtl.h"
+#include "tsan_simulate.h"
 
 #define CALLERPC ((uptr)__builtin_return_address(0))
 
@@ -364,6 +365,10 @@ void __tsan_mutex_destroy(void *m, unsigned flagz) {
 INTERFACE_ATTRIBUTE
 void __tsan_mutex_pre_lock(void *m, unsigned flagz) {
   SCOPED_ANNOTATION(__tsan_mutex_pre_lock);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateReportUnsupported("__tsan_mutex_pre_lock");
+    return;
+  }
   if (!(flagz & MutexFlagTryLock)) {
     if (flagz & MutexFlagReadLock)
       MutexPreReadLock(thr, pc, (uptr)m);
@@ -378,6 +383,10 @@ void __tsan_mutex_pre_lock(void *m, unsigned flagz) {
 INTERFACE_ATTRIBUTE
 void __tsan_mutex_post_lock(void *m, unsigned flagz, int rec) {
   SCOPED_ANNOTATION(__tsan_mutex_post_lock);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateReportUnsupported("__tsan_mutex_post_lock");
+    return;
+  }
   ThreadIgnoreSyncEnd(thr);
   ThreadIgnoreEnd(thr);
   if (!(flagz & MutexFlagTryLockFailed)) {
@@ -391,6 +400,10 @@ void __tsan_mutex_post_lock(void *m, unsigned flagz, int rec) {
 INTERFACE_ATTRIBUTE
 int __tsan_mutex_pre_unlock(void *m, unsigned flagz) {
   SCOPED_ANNOTATION_RET(__tsan_mutex_pre_unlock, 0);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateReportUnsupported("__tsan_mutex_pre_unlock");
+    return 0;
+  }
   int ret = 0;
   if (flagz & MutexFlagReadLock) {
     CHECK(!(flagz & MutexFlagRecursiveUnlock));
@@ -407,6 +420,10 @@ INTERFACE_ATTRIBUTE
 void __tsan_mutex_post_unlock(void *m, unsigned flagz) {
   AdaptiveDelay::SyncOp();
   SCOPED_ANNOTATION(__tsan_mutex_post_unlock);
+  if (UNLIKELY(SimulateIsActive())) {
+    SimulateReportUnsupported("__tsan_mutex_post_unlock");
+    return;
+  }
   ThreadIgnoreSyncEnd(thr);
   ThreadIgnoreEnd(thr);
 }
diff --git a/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp b/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
new file mode 100644
index 0000000000000..bf1f9da9da61b
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
@@ -0,0 +1,37 @@
+// RUN: %clangxx_tsan %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=2 not %run %t 2>&1 | FileCheck %s
+
+// Mutexes not managed by ThreadSanitizer's runtime (e.g. absl::Mutex) cannot
+// be simulated correctly and may lead to deadlock during simulation.
+
+#include <pthread.h>
+
+extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+extern "C" void __tsan_mutex_pre_lock(void *m, unsigned flagz);
+extern "C" void __tsan_mutex_post_lock(void *m, unsigned flagz, int rec);
+extern "C" int __tsan_mutex_pre_unlock(void *m, unsigned flagz);
+extern "C" void __tsan_mutex_post_unlock(void *m, unsigned flagz);
+
+int fake_mutex;
+
+void *thread_func(void *arg) {
+  __tsan_mutex_pre_lock(&fake_mutex, 0);
+  __tsan_mutex_post_lock(&fake_mutex, 0, 0);
+  __tsan_mutex_pre_unlock(&fake_mutex, 0);
+  __tsan_mutex_post_unlock(&fake_mutex, 0);
+  return nullptr;
+}
+
+void test_callback(void *arg) {
+  pthread_t t;
+  pthread_create(&t, nullptr, thread_func, nullptr);
+  pthread_join(t, nullptr);
+}
+
+int main() { return __tsan_simulate(test_callback, nullptr); }
+
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: __tsan_mutex_pre_lock
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: __tsan_mutex_post_lock
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: __tsan_mutex_pre_unlock
+// CHECK: ThreadSanitizer: simulation error - unsupported interceptor called: __tsan_mutex_post_unlock
+// CHECK: ThreadSanitizer: simulation aborted after 1 iterations

>From ffcbfd6175b76fa2a59966ad025db2e8bc4e36c8 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sun, 14 Jun 2026 01:30:14 -0400
Subject: [PATCH 08/13] Add __tsan_simulate to public header

---
 compiler-rt/include/sanitizer/tsan_interface.h             | 7 +++++++
 compiler-rt/test/tsan/simulate_cond_signal.cpp             | 2 +-
 compiler-rt/test/tsan/simulate_deadlock_condvar.cpp        | 2 +-
 .../test/tsan/simulate_deadlock_missing_broadcast.cpp      | 2 +-
 compiler-rt/test/tsan/simulate_deadlock_simple.cpp         | 2 +-
 compiler-rt/test/tsan/simulate_double_join.cpp             | 2 +-
 compiler-rt/test/tsan/simulate_empty_test.cpp              | 2 +-
 compiler-rt/test/tsan/simulate_immediate_exit.cpp          | 2 +-
 compiler-rt/test/tsan/simulate_invalid_iterations.cpp      | 2 +-
 compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp | 2 +-
 compiler-rt/test/tsan/simulate_iterations.cpp              | 2 +-
 compiler-rt/test/tsan/simulate_join_many_threads.cpp       | 2 +-
 compiler-rt/test/tsan/simulate_max_depth_hit.cpp           | 2 +-
 compiler-rt/test/tsan/simulate_multiple_mutexes.cpp        | 2 +-
 compiler-rt/test/tsan/simulate_mutex_contention.cpp        | 2 +-
 compiler-rt/test/tsan/simulate_nested_create.cpp           | 2 +-
 .../tsan/simulate_non_atomic_interleaved_rare_race.cpp     | 2 +-
 compiler-rt/test/tsan/simulate_probability.cpp             | 2 +-
 compiler-rt/test/tsan/simulate_race_basic.cpp              | 2 +-
 compiler-rt/test/tsan/simulate_rare_race.cpp               | 2 +-
 compiler-rt/test/tsan/simulate_schedule_between_joins.cpp  | 2 +-
 .../test/tsan/simulate_shared_mutex_unsupported.cpp        | 2 +-
 compiler-rt/test/tsan/simulate_sleep.cpp                   | 2 +-
 compiler-rt/test/tsan/simulate_spinlock.cpp                | 2 +-
 compiler-rt/test/tsan/simulate_start_iteration.cpp         | 2 +-
 compiler-rt/test/tsan/simulate_stress_condvar.cpp          | 2 +-
 compiler-rt/test/tsan/simulate_stress_mutex.cpp            | 2 +-
 compiler-rt/test/tsan/simulate_thread_detection.cpp        | 2 +-
 compiler-rt/test/tsan/simulate_thread_local_dtor.cpp       | 2 +-
 compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp | 2 +-
 compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp  | 6 +-----
 compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp | 2 +-
 32 files changed, 38 insertions(+), 35 deletions(-)
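
Note on writing callbacks: `__tsan_simulate` runs the callback many times, so a callback should set up and tear down its own state on every run. The sketch below is one way to shape such a callback, using only pthread mutexes and thread create/join. `Counter`, `Increment` and `TestCallback` are hypothetical names, and the `__tsan_simulate` call appears only in a comment so the block builds without the patched runtime.

```cpp
#include <pthread.h>

struct Counter {
  pthread_mutex_t mu;
  int value;
};

static void *Increment(void *arg) {
  Counter *c = static_cast<Counter *>(arg);
  pthread_mutex_lock(&c->mu);
  ++c->value;
  pthread_mutex_unlock(&c->mu);
  return nullptr;
}

// Matches the callback type void (*)(void *arg). State is created fresh on
// each call, so repeated runs do not see leftovers from earlier iterations.
void TestCallback(void *arg) {
  Counter c;
  pthread_mutex_init(&c.mu, nullptr);
  c.value = 0;

  pthread_t t1, t2;
  pthread_create(&t1, nullptr, Increment, &c);
  pthread_create(&t2, nullptr, Increment, &c);
  pthread_join(t1, nullptr);
  pthread_join(t2, nullptr);

  pthread_mutex_destroy(&c.mu);
  *static_cast<int *>(arg) = c.value;  // hand the result back to the caller
}

// In a real test built with -fsanitize=thread (not compiled here):
//   #include <sanitizer/tsan_interface.h>
//   int main() {
//     int out = 0;
//     return __tsan_simulate(TestCallback, &out);
//   }
```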

diff --git a/compiler-rt/include/sanitizer/tsan_interface.h b/compiler-rt/include/sanitizer/tsan_interface.h
index e11a4175cd8ed..73d1e9d4581f2 100644
--- a/compiler-rt/include/sanitizer/tsan_interface.h
+++ b/compiler-rt/include/sanitizer/tsan_interface.h
@@ -168,6 +168,13 @@ void SANITIZER_CDECL __tsan_set_fiber_name(void *fiber, const char *name);
 // Do not establish a happens-before relation between fibers
 static const unsigned __tsan_switch_to_fiber_no_sync = 1 << 0;
 
+// Simulation scheduler API.
+// Runs a callback repeatedly under a controlled thread scheduler that explores
+// different interleavings to expose data races and deadlocks.
+// No other threads may be running when __tsan_simulate is called.
+// Returns 0 on success, -1 on failure (race detected, deadlock, etc.).
+int SANITIZER_CDECL __tsan_simulate(void (*callback)(void *arg), void *arg);
+
 // User-provided callback invoked on TSan initialization.
 void SANITIZER_CDECL __tsan_on_initialize();
 
diff --git a/compiler-rt/test/tsan/simulate_cond_signal.cpp b/compiler-rt/test/tsan/simulate_cond_signal.cpp
index 1994e64056e40..5c0d3c253e447 100644
--- a/compiler-rt/test/tsan/simulate_cond_signal.cpp
+++ b/compiler-rt/test/tsan/simulate_cond_signal.cpp
@@ -4,7 +4,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 pthread_cond_t cond;
diff --git a/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp b/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
index c82cdf0f20f9b..cecd93d3fb154 100644
--- a/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
+++ b/compiler-rt/test/tsan/simulate_deadlock_condvar.cpp
@@ -5,7 +5,7 @@
 #include <pthread.h>
 #include <unistd.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 pthread_cond_t condvar;
diff --git a/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp b/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
index cc2e66ad4c4ba..21545ee10c283 100644
--- a/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
+++ b/compiler-rt/test/tsan/simulate_deadlock_missing_broadcast.cpp
@@ -9,7 +9,7 @@
 #include <pthread.h>
 #include <unistd.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 pthread_cond_t cond;
diff --git a/compiler-rt/test/tsan/simulate_deadlock_simple.cpp b/compiler-rt/test/tsan/simulate_deadlock_simple.cpp
index b1010ff09483d..fb5323d8b5b3b 100644
--- a/compiler-rt/test/tsan/simulate_deadlock_simple.cpp
+++ b/compiler-rt/test/tsan/simulate_deadlock_simple.cpp
@@ -9,7 +9,7 @@
 #include <pthread.h>
 #include <unistd.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex_a;
 pthread_mutex_t mutex_b;
diff --git a/compiler-rt/test/tsan/simulate_double_join.cpp b/compiler-rt/test/tsan/simulate_double_join.cpp
index 75fc42392bb10..cc83081462ec1 100644
--- a/compiler-rt/test/tsan/simulate_double_join.cpp
+++ b/compiler-rt/test/tsan/simulate_double_join.cpp
@@ -4,7 +4,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void *thread_func(void *arg) { return nullptr; }
 
diff --git a/compiler-rt/test/tsan/simulate_empty_test.cpp b/compiler-rt/test/tsan/simulate_empty_test.cpp
index f4a86855618d8..ce0739a14a010 100644
--- a/compiler-rt/test/tsan/simulate_empty_test.cpp
+++ b/compiler-rt/test/tsan/simulate_empty_test.cpp
@@ -5,7 +5,7 @@
 #include <pthread.h>
 #include <stdio.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 static int called;
 void test_callback(void *arg) {
diff --git a/compiler-rt/test/tsan/simulate_immediate_exit.cpp b/compiler-rt/test/tsan/simulate_immediate_exit.cpp
index 76c319b03b9f0..35deb2fd99fc0 100644
--- a/compiler-rt/test/tsan/simulate_immediate_exit.cpp
+++ b/compiler-rt/test/tsan/simulate_immediate_exit.cpp
@@ -4,7 +4,7 @@
 #include <pthread.h>
 #include <stdio.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void *thread_func(void *arg) { return nullptr; }
 
diff --git a/compiler-rt/test/tsan/simulate_invalid_iterations.cpp b/compiler-rt/test/tsan/simulate_invalid_iterations.cpp
index 77ecdfafae6da..91a065d81a687 100644
--- a/compiler-rt/test/tsan/simulate_invalid_iterations.cpp
+++ b/compiler-rt/test/tsan/simulate_invalid_iterations.cpp
@@ -5,7 +5,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void test_callback(void *arg) { assert(0); }
 
diff --git a/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp b/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
index 89019e6a94606..3921cf2f1065d 100644
--- a/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
+++ b/compiler-rt/test/tsan/simulate_invalid_start_iteration.cpp
@@ -5,7 +5,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void test_callback(void *arg) { assert(0); }
 
diff --git a/compiler-rt/test/tsan/simulate_iterations.cpp b/compiler-rt/test/tsan/simulate_iterations.cpp
index 82fdcc780b204..dd550686cfecd 100644
--- a/compiler-rt/test/tsan/simulate_iterations.cpp
+++ b/compiler-rt/test/tsan/simulate_iterations.cpp
@@ -7,7 +7,7 @@
 #include <pthread.h>
 #include <stdio.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 int counter = 0;
diff --git a/compiler-rt/test/tsan/simulate_join_many_threads.cpp b/compiler-rt/test/tsan/simulate_join_many_threads.cpp
index f6599efe95d3d..731bd6f0309e2 100644
--- a/compiler-rt/test/tsan/simulate_join_many_threads.cpp
+++ b/compiler-rt/test/tsan/simulate_join_many_threads.cpp
@@ -7,7 +7,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 constexpr int kMaxLevels = 16;
 
diff --git a/compiler-rt/test/tsan/simulate_max_depth_hit.cpp b/compiler-rt/test/tsan/simulate_max_depth_hit.cpp
index cb38ad88a339c..e77f10641328c 100644
--- a/compiler-rt/test/tsan/simulate_max_depth_hit.cpp
+++ b/compiler-rt/test/tsan/simulate_max_depth_hit.cpp
@@ -5,7 +5,7 @@
 #include <atomic>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 static std::atomic<int> counter(0);
 
diff --git a/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp b/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
index 86714e99dbc1d..a9783ecc2ff02 100644
--- a/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
+++ b/compiler-rt/test/tsan/simulate_multiple_mutexes.cpp
@@ -4,7 +4,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 const int num_mutexes = 10;
 const int num_threads = 5;
diff --git a/compiler-rt/test/tsan/simulate_mutex_contention.cpp b/compiler-rt/test/tsan/simulate_mutex_contention.cpp
index b52d108a2cf27..3cf23f87f1360 100644
--- a/compiler-rt/test/tsan/simulate_mutex_contention.cpp
+++ b/compiler-rt/test/tsan/simulate_mutex_contention.cpp
@@ -3,7 +3,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mtx;
 int shared = 0;
diff --git a/compiler-rt/test/tsan/simulate_nested_create.cpp b/compiler-rt/test/tsan/simulate_nested_create.cpp
index 3bb696fbc24f6..c0649a8ff91ba 100644
--- a/compiler-rt/test/tsan/simulate_nested_create.cpp
+++ b/compiler-rt/test/tsan/simulate_nested_create.cpp
@@ -7,7 +7,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 int counter = 0;
diff --git a/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp b/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
index 3068eba4fb3d7..46c83d1b39502 100644
--- a/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
+++ b/compiler-rt/test/tsan/simulate_non_atomic_interleaved_rare_race.cpp
@@ -7,7 +7,7 @@
 #include <atomic>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 std::atomic<int> d{};
 int a = 0;
diff --git a/compiler-rt/test/tsan/simulate_probability.cpp b/compiler-rt/test/tsan/simulate_probability.cpp
index cc4b0b2e83e0b..6a7c5fbfd0686 100644
--- a/compiler-rt/test/tsan/simulate_probability.cpp
+++ b/compiler-rt/test/tsan/simulate_probability.cpp
@@ -8,7 +8,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 int counter = 0;
diff --git a/compiler-rt/test/tsan/simulate_race_basic.cpp b/compiler-rt/test/tsan/simulate_race_basic.cpp
index ebecb7d76bdbc..22459155e8d7b 100644
--- a/compiler-rt/test/tsan/simulate_race_basic.cpp
+++ b/compiler-rt/test/tsan/simulate_race_basic.cpp
@@ -3,7 +3,7 @@
 
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 int shared_var = 0;
 
diff --git a/compiler-rt/test/tsan/simulate_rare_race.cpp b/compiler-rt/test/tsan/simulate_rare_race.cpp
index dbf11d2bf4de3..da253a3c29ecf 100644
--- a/compiler-rt/test/tsan/simulate_rare_race.cpp
+++ b/compiler-rt/test/tsan/simulate_rare_race.cpp
@@ -9,7 +9,7 @@
 #include <atomic>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 struct TestData {
   pthread_mutex_t mtx;
diff --git a/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp b/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
index 49066c18b9d1a..3544598c59ecb 100644
--- a/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
+++ b/compiler-rt/test/tsan/simulate_schedule_between_joins.cpp
@@ -4,7 +4,7 @@
 #include <atomic>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void *thread_func(void *arg) { return nullptr; }
 
diff --git a/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp b/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
index abb35758036cc..1f136ae356020 100644
--- a/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
+++ b/compiler-rt/test/tsan/simulate_shared_mutex_unsupported.cpp
@@ -4,7 +4,7 @@
 #include <pthread.h>
 #include <stdlib.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_rwlock_t rwlock;
 
diff --git a/compiler-rt/test/tsan/simulate_sleep.cpp b/compiler-rt/test/tsan/simulate_sleep.cpp
index 9d89eb36dbb96..d5e8ac10f2ad4 100644
--- a/compiler-rt/test/tsan/simulate_sleep.cpp
+++ b/compiler-rt/test/tsan/simulate_sleep.cpp
@@ -5,7 +5,7 @@
 #include <time.h>
 #include <unistd.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 void *thread_func(void *arg) {
   sleep(1);
diff --git a/compiler-rt/test/tsan/simulate_spinlock.cpp b/compiler-rt/test/tsan/simulate_spinlock.cpp
index 77072385355cd..f1d68218d1f88 100644
--- a/compiler-rt/test/tsan/simulate_spinlock.cpp
+++ b/compiler-rt/test/tsan/simulate_spinlock.cpp
@@ -7,7 +7,7 @@
 #include <pthread.h>
 #include <stdlib.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_spinlock_t spinlock;
 
diff --git a/compiler-rt/test/tsan/simulate_start_iteration.cpp b/compiler-rt/test/tsan/simulate_start_iteration.cpp
index 3caab46181077..020208bf65183 100644
--- a/compiler-rt/test/tsan/simulate_start_iteration.cpp
+++ b/compiler-rt/test/tsan/simulate_start_iteration.cpp
@@ -5,7 +5,7 @@
 #include <pthread.h>
 #include <stdio.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 int counter = 0;
diff --git a/compiler-rt/test/tsan/simulate_stress_condvar.cpp b/compiler-rt/test/tsan/simulate_stress_condvar.cpp
index b75f5c6e73b4b..df57ef357bc7f 100644
--- a/compiler-rt/test/tsan/simulate_stress_condvar.cpp
+++ b/compiler-rt/test/tsan/simulate_stress_condvar.cpp
@@ -4,7 +4,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 pthread_cond_t condvar;
diff --git a/compiler-rt/test/tsan/simulate_stress_mutex.cpp b/compiler-rt/test/tsan/simulate_stress_mutex.cpp
index b5332e5902993..05f3930885d13 100644
--- a/compiler-rt/test/tsan/simulate_stress_mutex.cpp
+++ b/compiler-rt/test/tsan/simulate_stress_mutex.cpp
@@ -4,7 +4,7 @@
 #include <assert.h>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 int counter = 0;
diff --git a/compiler-rt/test/tsan/simulate_thread_detection.cpp b/compiler-rt/test/tsan/simulate_thread_detection.cpp
index bad65b032a039..8afe0dbb09982 100644
--- a/compiler-rt/test/tsan/simulate_thread_detection.cpp
+++ b/compiler-rt/test/tsan/simulate_thread_detection.cpp
@@ -6,7 +6,7 @@
 #include <atomic>
 #include <stdio.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 std::atomic<bool> keep_running(true);
 
diff --git a/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp b/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
index f955c91f3c4e1..44e767831a392 100644
--- a/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
+++ b/compiler-rt/test/tsan/simulate_thread_local_dtor.cpp
@@ -8,7 +8,7 @@
 #include <atomic>
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 std::atomic<int> ctor_count(0);
 std::atomic<int> dtor_count(0);
diff --git a/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp b/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
index 83dd939bf75f8..64795645fd63e 100644
--- a/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
+++ b/compiler-rt/test/tsan/simulate_timed_mutex_unsupported.cpp
@@ -8,7 +8,7 @@
 #include <stdlib.h>
 #include <time.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_mutex_t mutex;
 
diff --git a/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp b/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
index bf1f9da9da61b..510ed6089cbaa 100644
--- a/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
+++ b/compiler-rt/test/tsan/simulate_tsan_mutex_annotations.cpp
@@ -6,11 +6,7 @@
 
 #include <pthread.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
-extern "C" void __tsan_mutex_pre_lock(void *m, unsigned flagz);
-extern "C" void __tsan_mutex_post_lock(void *m, unsigned flagz, int rec);
-extern "C" int __tsan_mutex_pre_unlock(void *m, unsigned flagz);
-extern "C" void __tsan_mutex_post_unlock(void *m, unsigned flagz);
+#include <sanitizer/tsan_interface.h>
 
 int fake_mutex;
 
diff --git a/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp b/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp
index aa18b73d1ec7c..ce3181c485dad 100644
--- a/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp
+++ b/compiler-rt/test/tsan/simulate_unsupported_interceptor.cpp
@@ -4,7 +4,7 @@
 #include <pthread.h>
 #include <stdlib.h>
 
-extern "C" int __tsan_simulate(void (*callback)(void *arg), void *arg);
+#include <sanitizer/tsan_interface.h>
 
 pthread_rwlock_t rwlock;
 

>From 284ea42e53314d15e818fa104334908f01c96e15 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Sun, 14 Jun 2026 01:57:13 -0400
Subject: [PATCH 09/13] Add TODOs

---
 compiler-rt/lib/tsan/rtl/tsan_simulate.cpp | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
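
Note for readers of the spinlock TODO below: a "trylock-loop" acquires a lock
by retrying a non-blocking attempt and passing through a scheduler point
between attempts, so the thread never blocks in the kernel. The following is
only a minimal sketch under stated assumptions. SchedulerPoint() and
LockViaTrylockLoop() are hypothetical names, and sched_yield() merely stands in
for the simulation scheduler. It is not the code in tsan_simulate.cpp.

```cpp
#include <pthread.h>
#include <sched.h>

#include <cassert>
#include <cerrno>

// Hypothetical stand-in for a scheduler point. A real simulation scheduler
// would decide here whether to park this thread and wake another one.
static void SchedulerPoint() { sched_yield(); }

// Acquire `m` without blocking: retry trylock and let another thread run
// between failed attempts.
static void LockViaTrylockLoop(pthread_mutex_t *m) {
  for (;;) {
    int rc = pthread_mutex_trylock(m);
    if (rc == 0)
      return;  // Lock acquired.
    assert(rc == EBUSY);
    (void)rc;
    SchedulerPoint();
  }
}

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static int g_counter = 0;

static void *Worker(void *) {
  for (int i = 0; i < 1000; i++) {
    LockViaTrylockLoop(&g_mutex);
    ++g_counter;
    pthread_mutex_unlock(&g_mutex);
  }
  return nullptr;
}

// Runs two workers to completion and returns the final counter value.
static int RunTwoWorkers() {
  g_counter = 0;
  pthread_t a, b;
  pthread_create(&a, nullptr, Worker, nullptr);
  pthread_create(&b, nullptr, Worker, nullptr);
  pthread_join(a, nullptr);
  pthread_join(b, nullptr);
  return g_counter;
}
```

Because every failed attempt goes through a scheduler point, a scheduler that
runs one thread at a time always gets a chance to switch to the lock holder.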

diff --git a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
index b9880472a3fed..40ada3c492b48 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
@@ -9,6 +9,20 @@
 // This file is a part of ThreadSanitizer (TSan), a race detector.
 //
 //===----------------------------------------------------------------------===//
+//
+// TODO:
+//   - Support calling __tsan_simulate while other threads are running.
+//   - Support custom mutexes (e.g. absl::Mutex) and direct futex calls in
+//     simulation mode. Currently, only pthread mutexes and condition variables
+//     are supported; other synchronization primitives abort the simulation.
+//   - Support timed operations (pthread_cond_timedwait, pthread_mutex_timedlock,
+//     pthread_timedjoin_np) by treating them as their non-timed counterparts.
+//   - Support spinlocks (pthread_spin_lock/trylock/unlock) using the same
+//     trylock-loop pattern as pthread_mutex_lock.
+//   - Support rwlocks (pthread_rwlock_*) with reader/writer tracking in the
+//     scheduler.
+//   - Support barriers (pthread_barrier_wait) with a rendezvous mechanism in
+//     the scheduler.
 
 #include "tsan_simulate.h"
 

>From 82d58a368a40709578efbcd07c50a0a9f448a1d0 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Mon, 15 Jun 2026 10:02:36 -0400
Subject: [PATCH 10/13] Addressing feedback

 - Ensure thread leak detection works in simulation mode
 - Optimize RemoveOne
 - Remove extra init
---
 .gitignore                                    |  3 ++
 compiler-rt/lib/tsan/rtl/tsan_interface.cpp   |  1 -
 compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp  |  6 +--
 compiler-rt/lib/tsan/rtl/tsan_simulate.cpp    | 37 +++++++++++++---
 .../test/tsan/simulate_forgotten_join.cpp     | 42 +++++++++++++++++++
 5 files changed, 79 insertions(+), 10 deletions(-)
 create mode 100644 compiler-rt/test/tsan/simulate_forgotten_join.cpp
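
Note on the RemoveOne change in this patch: the shifting loop is replaced by
overwriting the chosen slot with the last element, which makes removal O(1) at
the cost of not preserving order. Since the waiter is chosen at random anyway,
the order of the remaining entries presumably does not matter. A minimal
sketch of the idiom, assuming a hypothetical MiniWaitset type that is much
simpler than the real Waitset:

```cpp
#include <cassert>

// Hypothetical miniature of a waitset: a fixed array of thread indices.
struct MiniWaitset {
  static constexpr int kMax = 8;
  int waiters[kMax];
  int count = 0;

  void Add(int thread_idx) {
    assert(count < kMax);
    waiters[count++] = thread_idx;
  }

  // O(1) removal: move the last element into the freed slot.
  // The relative order of the remaining waiters is not preserved.
  int RemoveAt(int idx) {
    assert(idx >= 0 && idx < count);
    int thread_idx = waiters[idx];
    waiters[idx] = waiters[--count];
    return thread_idx;
  }
};
```

Removing index 1 from {10, 20, 30, 40} returns 20 and leaves {10, 40, 30}.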

diff --git a/.gitignore b/.gitignore
index 9d4e86ab10caa..b9481a9461d9a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -88,3 +88,6 @@ pythonenv*
 /clang/utils/analyzer/projects/*/RefScanBuildResults
 # automodapi puts generated documentation files here.
 /lldb/docs/python_api/
+# test output files from lit
+compiler-rt/test/tsan/Output
+compiler-rt/test/tsan/Linux/Output
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
index f91e4873e7b6e..bde2f525035bd 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface.cpp
@@ -86,7 +86,6 @@ void __tsan_set_fiber_name(void *fiber, const char *name) {
 }  // extern "C"
 
 int __tsan_simulate(void (*callback)(void* arg), void* arg) {
-  Initialize(cur_thread_init());
   return SimulateRun(callback, arg);
 }
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
index 717041b9d3577..b5204a5c1abd8 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
@@ -238,9 +238,6 @@ void ThreadContext::OnStarted(void *arg) {
 
 void ThreadFinish(ThreadState *thr) {
   DPrintf("#%d: ThreadFinish\n", thr->tid);
-#if !SANITIZER_GO
-  SimulateThreadFinish();
-#endif
   ThreadCheckIgnore(thr);
   if (thr->stk_addr && thr->stk_size)
     DontNeedShadowFor(thr->stk_addr, thr->stk_size);
@@ -275,6 +272,9 @@ void ThreadFinish(ThreadState *thr) {
     ctx->dd->DestroyLogicalThread(thr->dd_lt);
   SlotDetach(thr);
   ctx->thread_registry.FinishThread(thr->tid);
+#if !SANITIZER_GO
+  SimulateThreadFinish();
+#endif
   thr->~ThreadState();
 }
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
index 40ada3c492b48..787dc0c0420d0 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
@@ -15,7 +15,8 @@
 //   - Support custom mutexes (e.g. absl::Mutex) and direct futex calls in
 //     simulation mode. Currently, only pthread mutexes and condition variables
 //     are supported; other synchronization primitives abort the simulation.
-//   - Support timed operations (pthread_cond_timedwait, pthread_mutex_timedlock,
+//   - Support timed operations (pthread_cond_timedwait,
+//     pthread_mutex_timedlock,
 //     pthread_timedjoin_np) by treating them as their non-timed counterparts.
 //   - Support spinlocks (pthread_spin_lock/trylock/unlock) using the same
 //     trylock-loop pattern as pthread_mutex_lock.
@@ -108,15 +109,11 @@ struct Waitset {
   }
 
   // Randomly select and remove one thread from the waitset.
-  // Matches Relacy's approach to maximize interleaving exploration.
   int RemoveOne(u32* rng_state) {
     CHECK_GT(count, 0);
-    // Pick a random thread from the waitset.
     int idx = RandN(rng_state, count);
     int thread_idx = waiters[idx];
-    // Remove it by shifting remaining threads.
-    for (int i = idx + 1; i < count; i++) waiters[i - 1] = waiters[i];
-    count--;
+    waiters[idx] = waiters[--count];
     return thread_idx;
   }
 
@@ -633,6 +630,34 @@ int CheckForErors(int iter, int start_iter) {
     return -1;
   }
 
+  // Check TSan's thread registry for threads that finished but were never
+  // joined or detached (thread leaks).
+  {
+    bool has_leak = false;
+    ThreadRegistryLock l(&ctx->thread_registry);
+    ctx->thread_registry.RunCallbackForEachThreadLocked(
+        [](ThreadContextBase* tctx_base, void* arg) {
+          auto* tctx = static_cast<ThreadContext*>(tctx_base);
+          if (tctx->detached || tctx->status != ThreadStatusFinished)
+            return;
+          *static_cast<bool*>(arg) = true;
+        },
+        &has_leak);
+    if (has_leak) {
+      Printf("ThreadSanitizer: thread leak detected at iteration %d\n", iter);
+      Printf(
+          "ThreadSanitizer: to reproduce, set "
+          "TSAN_OPTIONS=simulate_scheduler=random:simulate_start_iteration="
+          "%d\n",
+          iter);
+      Printf(
+          "ThreadSanitizer: simulation stopped due to thread leak after %d "
+          "iterations\n",
+          iter - start_iter + 1);
+      return -1;
+    }
+  }
+
   return 0;
 }
 
diff --git a/compiler-rt/test/tsan/simulate_forgotten_join.cpp b/compiler-rt/test/tsan/simulate_forgotten_join.cpp
new file mode 100644
index 0000000000000..71333a076a3f9
--- /dev/null
+++ b/compiler-rt/test/tsan/simulate_forgotten_join.cpp
@@ -0,0 +1,42 @@
+// RUN: %clangxx_tsan -O1 %s -o %t
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=9 %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-NOLEAK
+// RUN: %env_tsan_opts=atexit_sleep_ms=0:abort_on_error=0:simulate_scheduler=random:simulate_iterations=10 not %run %t 2>&1 | FileCheck %s --check-prefix=CHECK-LEAK
+
+// Verify simulation detects a thread that is never joined. On the 10th
+// iteration, the callback "forgets" to join one thread.
+
+#include <pthread.h>
+#include <stdio.h>
+
+#include <sanitizer/tsan_interface.h>
+
+static int global_count = 0;
+
+void *thread_func(void *arg) { return nullptr; }
+
+void test_callback(void *arg) {
+  ++global_count;
+
+  pthread_t t1, t2;
+  pthread_create(&t1, nullptr, thread_func, nullptr);
+  pthread_create(&t2, nullptr, thread_func, (void *)2);
+
+  pthread_join(t1, nullptr);
+  if (global_count != 10)
+    pthread_join(t2, nullptr);
+  // On the 10th call (reported as iteration 9), t2 is never joined.
+}
+
+int main() {
+  int res = __tsan_simulate(test_callback, nullptr);
+  fprintf(stderr, "simulation result: %d\n", res);
+  return res;
+}
+
+// CHECK-NOLEAK: ThreadSanitizer: simulation starting
+// CHECK-NOLEAK: ThreadSanitizer: simulation finished (9 iterations)
+// CHECK-NOLEAK: simulation result: 0
+
+// CHECK-LEAK: ThreadSanitizer: thread leak detected at iteration 9
+// CHECK-LEAK: ThreadSanitizer: simulation stopped due to thread leak
+// CHECK-LEAK: simulation result: -1
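
Note on the leak check added to tsan_simulate.cpp in this patch: it walks the
thread registry and flags any thread that has finished but is neither detached
nor joined. The sketch below models that predicate in isolation. ThreadRecord,
Status and HasLeak are hypothetical names, and the model assumes that joining
a finished thread moves it out of the Finished state. It is an illustration,
not the runtime's data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified thread record. TSan's real ThreadContext carries
// far more state.
enum class Status { Running, Finished, Dead };

struct ThreadRecord {
  Status status;  // Dead models "finished and already joined".
  bool detached;
};

// A thread counts as leaked if it finished but was never joined or detached.
static bool HasLeak(const std::vector<ThreadRecord> &threads) {
  for (const ThreadRecord &t : threads) {
    if (t.detached || t.status != Status::Finished)
      continue;  // Detached, still running, or already joined: not a leak.
    return true;
  }
  return false;
}
```

In simulate_forgotten_join.cpp above, t2 on the final callback invocation is
the kind of thread this predicate is meant to catch: finished, not detached,
never joined.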

>From 6e7871bb00839037d666343b7e687cd3630fc416 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Mon, 15 Jun 2026 11:07:39 -0400
Subject: [PATCH 11/13] Remove UNLIKELY hint for non-error condition

---
 compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
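
Note on the change below: a branch hint only affects code layout and
optimization, never the result of the condition, so dropping it is purely a
statement about intent (simulation being active is a normal mode, not an error
path). A minimal sketch, assuming a GCC/Clang-compatible compiler.
MY_UNLIKELY, g_simulating and both Sleep* functions are hypothetical
stand-ins, not the runtime's definitions.

```cpp
#include <cassert>

// Hypothetical local hint macro built on the __builtin_expect builtin.
#define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)

// Stand-in for a "simulation is active" query.
static bool g_simulating = false;

// With the hint: the compiler is told the branch is rarely taken.
static int SleepWithHint(int sec) {
  if (MY_UNLIKELY(g_simulating))
    return 0;  // Simulated: return immediately instead of sleeping.
  return sec;  // Stand-in for actually sleeping `sec` seconds.
}

// Without the hint: same behaviour, no statement about likelihood.
static int SleepWithoutHint(int sec) {
  if (g_simulating)
    return 0;
  return sec;
}
```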

diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 5a533825bc02d..6b27f2e069b88 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -383,7 +383,7 @@ struct BlockingCall {
 
 TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
   SCOPED_TSAN_INTERCEPTOR(sleep, sec);
-  if (UNLIKELY(SimulateIsActive())) {
+  if (SimulateIsActive()) {
     SimulateSchedule();
     return 0;
   }
@@ -394,7 +394,7 @@ TSAN_INTERCEPTOR(unsigned, sleep, unsigned sec) {
 
 TSAN_INTERCEPTOR(int, usleep, long_t usec) {
   SCOPED_TSAN_INTERCEPTOR(usleep, usec);
-  if (UNLIKELY(SimulateIsActive())) {
+  if (SimulateIsActive()) {
     SimulateSchedule();
     return 0;
   }
@@ -405,7 +405,7 @@ TSAN_INTERCEPTOR(int, usleep, long_t usec) {
 
 TSAN_INTERCEPTOR(int, nanosleep, void *req, void *rem) {
   SCOPED_TSAN_INTERCEPTOR(nanosleep, req, rem);
-  if (UNLIKELY(SimulateIsActive())) {
+  if (SimulateIsActive()) {
     SimulateSchedule();
     return 0;
   }

>From 63ad5163b04788f87e515940e23493e7ecf288d3 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Mon, 15 Jun 2026 11:17:19 -0400
Subject: [PATCH 12/13] Add one more TODO per feedback

---
 compiler-rt/lib/tsan/rtl/tsan_simulate.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
index 787dc0c0420d0..5e52c455e7e53 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_simulate.cpp
@@ -24,6 +24,7 @@
 //     scheduler.
 //   - Support barriers (pthread_barrier_wait) with a rendezvous mechanism in
 //     the scheduler.
+//   - Embed Waitset into the Sync object to replace the linear WaitsetMap.
 
 #include "tsan_simulate.h"
 

>From e57a69b720f4c040e38b1edf59158e7965f06585 Mon Sep 17 00:00:00 2001
From: Chris Cotter <ccotter14 at bloomberg.net>
Date: Mon, 15 Jun 2026 11:18:51 -0400
Subject: [PATCH 13/13] Update simulate_schedule_probability default

---
 compiler-rt/lib/tsan/rtl/tsan_flags.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compiler-rt/lib/tsan/rtl/tsan_flags.inc b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
index 699471371dc03..2bfc67d467825 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_flags.inc
+++ b/compiler-rt/lib/tsan/rtl/tsan_flags.inc
@@ -109,7 +109,7 @@ TSAN_FLAG(
     "platform.")
 TSAN_FLAG(int, adaptive_delay_relaxed_sample_rate, 10000,
           "Sample 1 in N relaxed atomic operations for delay")
-TSAN_FLAG(int, adaptive_delay_sync_atomic_sample_rate, 100,
+TSAN_FLAG(int, adaptive_delay_sync_atomic_sample_rate, 20,
           "Sample 1 in N acquire/release/seq_cst atomic operations for delay")
 TSAN_FLAG(int, adaptive_delay_mutex_sample_rate, 10,
           "Sample 1 in N mutex/cv operations for delay")


