[compiler-rt] [llvm] [TSan] Add dominance-based redundant instrumentation elimination (PR #169897)
Alexey Paznikov via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 28 02:42:05 PST 2025
https://github.com/apaznikov created https://github.com/llvm/llvm-project/pull/169897
## Summary
This PR implements a static analysis pass to identify and eliminate redundant memory access instrumentation in ThreadSanitizer. By leveraging dominance and post-dominance relationships within the Control Flow Graph (CFG), we can prove that certain runtime checks are unnecessary because a data race would inevitably be detected by a preceding or succeeding check on the same execution path.
This work is part of a broader research effort on optimizing dynamic race detectors [1].
## Implementation Details
The logic is encapsulated in the `DominanceBasedElimination` class within `ThreadSanitizer.cpp`. The pass operates intra-procedurally and performs the following steps:
1. **Safety Analysis:** It builds a cache of "safe" basic blocks and paths. A path is considered safe if it contains no synchronization primitives (atomics, fences) and no function calls that could synchronize threads (in this patch, determined via the `nosync` function attribute).
2. **Dominance Elimination:** If instruction $I_{dom}$ dominates $I_{sub}$, both access the same memory location (verified via `MustAlias`), and the path $I_{dom} \to I_{sub}$ is safe, then $I_{sub}$ is not instrumented.
3. **Post-Dominance Elimination:** If instruction $I_{post}$ post-dominates $I_{pre}$, both access the same memory location, and the path $I_{pre} \to I_{post}$ is safe, then $I_{pre}$ is not instrumented. Both patterns are sketched below.
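To make the two patterns concrete, here is a hedged C++ sketch (our own illustration, not taken from the patch; the patch's actual tests are LLVM IR in `dominance-elimination.ll`):

```cpp
// Illustrative only: the names below are ours, not from the patch.
int g; // plain (non-atomic) global accessed by multiple threads

void dominance_case() {
  g = 1; // instrumented: this check is kept
  g = 2; // dominated by the write above, same location (MustAlias),
         // no synchronization in between -> its check is redundant
}

void post_dominance_case(bool c) {
  if (c)
    g = 1; // post-dominated by the final write -> candidate for elimination
  else
    g = 2; // likewise
  g = 3;   // always reached; a race on g is still reported here
}
```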
### Safety & Correctness
* **Aliasing:** We strictly require `MustAlias`. If AA returns `MayAlias` or `NoAlias`, the optimization is skipped.
* **Loops:** For post-dominance, we disable elimination if the path passes through a loop, to avoid removing checks in code that might enter an infinite loop before reaching the post-dominator (we do the same if the path contains calls, because the callee may loop forever). We also verify that the path contains no instructions that could cause an irregular exit.
* **Write/Read semantics:** The logic respects TSan's read/write semantics (e.g., a Write can eliminate a subsequent Read, but a Read cannot eliminate a subsequent Write); see the sketch below.
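A hedged sketch of cases where both checks are kept (again our own C++; `pthread_mutex_lock` stands in for any external call that is not provably `nosync`):

```cpp
#include <atomic>
#include <pthread.h>

int g;
std::atomic<int> flag;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void blocked_by_atomic() {
  g = 1;                                    // instrumented
  flag.store(1, std::memory_order_release); // synchronization -> the path
                                            // between the two writes is dirty
  g = 2;                                    // still instrumented
}

void blocked_by_call() {
  g = 1;                  // instrumented
  pthread_mutex_lock(&m); // call without the nosync attribute -> dirty path
  g = 2;                  // still instrumented
}

void read_does_not_cover_write(bool c) {
  int t = g;   // read of g
  if (c)
    g = t + 1; // a Read cannot cover a later Write, so this store keeps
               // its check even though the read dominates it
}
```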
## Impact
* **Runtime Performance:** Reduces the number of `__tsan_read/write` calls, leading to lower runtime overhead for instrumented binaries.
* **Report Granularity:** In some cases, the reported race location may shift from the "second" access to the "first" (dominating) access; the race itself is still reported.
* **Compile Time:** The analysis uses standard LLVM analyses (`DominatorTree`, `PostDominatorTree`, `AliasAnalysis`, `LoopInfo`) and is lightweight.
## Motivation & Potential Impact
This work is based on our research [1] into optimizing dynamic race detectors. Our experiments identified **Dominance-based Elimination (DE)** as the single most effective of the optimization strategies we evaluated (Escape Analysis, Lock Ownership, etc.).
In our research prototype (which utilizes a more aggressive, inter-procedural version of this analysis), we observed the following speedups solely from DE:
* **SQLite:** ~1.67x speedup
* **Redis:** ~1.35x speedup
* **FFmpeg:** ~1.2x speedup
* **MySQL:** ~1.13x speedup
* **Memcached:** ~1.07x speedup
**Chromium:**
We evaluated the pass on **Chromium** using a suite of micro-benchmarks (including Layout, Parser, SVG, and Speedometer tests).
* **Median Speedup:** ~1.3x across all suites.
* **Distribution:** Over **80%** of all Chromium micro-benchmarks achieved a speedup of at least **1.2x**.
* Specific suites like 'Layout' showed median speedups around **1.5x**.
### Note on this PR:
This patch implements a **conservative, intra-procedural** version of the algorithm described in [1] to ensure maximum stability and soundness for the upstream compiler. While the absolute speedups of this initial version may be lower than the research prototype's (due to the lack of inter-procedural analysis and the conservative handling of synchronization), it targets the same redundancy patterns and serves as the foundational infrastructure for future improvements.
Even with intra-procedural analysis, we expect significant instrumentation reduction in hot loops and straight-line code with repeated accesses.
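A rough sketch of the hot-loop case (our own illustrative C++, not a benchmark or test from this patch): repeated accesses to the same field inside one iteration, with no synchronization in between.

```cpp
struct Counter { long hits; };

void bump(Counter *c, int n) {
  for (int i = 0; i < n; ++i) {
    c->hits += 1;       // the write to c->hits keeps its check
    if (c->hits > 1000) // dominated re-read of c->hits -> check redundant
      c->hits = 0;      // dominated write to c->hits -> check redundant
  }
}
```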
## Usage
The optimization is currently opt-in.
**Flag:** `-mllvm -tsan-use-dominance-analysis`
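For example (an illustrative command line of ours, not part of the patch): `clang -O2 -fsanitize=thread -mllvm -tsan-use-dominance-analysis foo.c`. The lit configuration added in this patch passes the same `-mllvm` flag to the compiler-rt TSan test suite when the dominance-analysis configuration is selected.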
## Attribution & Status
**Implementation:**
This patch was implemented by **Alexey Paznikov**.
**Research & Algorithm Design:**
The underlying algorithms and performance validation were conducted by the research team: **Alexey Paznikov**, **Andrey Kogutenko**, **Yaroslav Osipov**, **Michael Schwarz**, and **Umang Mathur**.
This work is based on research currently **under review** for publication [1].
[1] "Optimizing Instrumentation for Data Race Detectors" (Under Review, 2025).
From 77550a74769c43e535d8012a604c90cdcc00553c Mon Sep 17 00:00:00 2001
From: Alexey Paznikov <apaznikov at gmail.com>
Date: Mon, 15 Sep 2025 16:34:10 +0800
Subject: [PATCH] [TSan] Add dominance-based redundant instrumentation
elimination
This patch introduces a new optimization pass to ThreadSanitizer that eliminates redundant instrumentation of memory accesses using dominance and post-dominance analysis.
The optimization relies on the observation that if two memory accesses to the same location are executed on a path without intermediate synchronization (e.g., atomic operations, fences, or unknown function calls), checking one of them is often sufficient to detect a data race.
The algorithm works in two phases:
1. Dominance-based elimination: If access A dominates access B, they target the same memory location (MustAlias), and the path from A to B is free of synchronization, the instrumentation for B is redundant.
2. Post-dominance-based elimination: If access B post-dominates access A, they target the same location, and the path from A to B is safe, the instrumentation for A can be removed (since execution is guaranteed to reach B, where the race will be detected).
To ensure soundness and prevent false negatives:
- We use strict MustAlias results from AliasAnalysis.
- We perform a safety analysis to ensure no "dangerous" instructions (synchronization, unknown calls) exist between the accesses.
- For post-dominance, we conservatively avoid elimination across loops to prevent issues with non-terminating loops where the post-dominating access might never be reached.
This implementation is intra-procedural and utilizes standard LLVM analyses (DominatorTree, PostDominatorTree, AAResults, LoopInfo).
This optimization is disabled by default and can be enabled via `-tsan-use-dominance-analysis`.
This implementation is based on the research algorithm designed by:
Alexey Paznikov, Andrey Kogutenko, Yaroslav Osipov, Michael Schwarz, and Umang Mathur.
WIP: Add basic dominance-based algorithm implementation (research-based)
Add tests for TSan dominance-based optimization
Traverse all BBs between StartInst and EndInst (large redesign)
Optimize dominance-based elimination with caching (ReachableToEnd and ConeSafeCache) and reusable data structures, add relevant tests
Add tests for dominance-based elimination of dirty prefixes and irrelevant paths
Fix alias handling in TSan dominance-based elimination and add detailed tests for alias scenarios
Add dominance-based elimination flag for TSan compiler-rt tests
Fixes following the 31.10 review:
- Make isMustAlias location/size-aware
- Consider volatile DomInst
- Fix Callee check in isInstrSafe
- Make ConeSafeCache work
- Cleanup
Disable post-dom optimization for any path containing loops (loop-crossing elimination can be re-enabled with the -tsan-postdom-aggressive flag)
Simplify, refactor, minor correctness fixes
Slightly simplified the `ThreadSanitizerPass::run()`.
One "if" instead of several ternary operators.
Revert "Add dominance-based elimination flag for TSan compiler-rt tests"
This reverts commit 3a8a84c56402504ea76eef2b2ffb852a9ac6b0ff.
Add check-tsan-dominance-analysis-* cmake targets
Disable post-dominance-based elimination with loop and unsafe call checks; extend caches and safety checks for post-dom analysis
Cleanup, typos
---
compiler-rt/test/tsan/CMakeLists.txt | 111 ++-
compiler-rt/test/tsan/lit.cfg.py | 8 +
compiler-rt/test/tsan/lit.site.cfg.py.in | 3 +
.../Instrumentation/ThreadSanitizer.cpp | 642 +++++++++++++++++-
.../ThreadSanitizer/dominance-elimination.ll | 540 +++++++++++++++
5 files changed, 1245 insertions(+), 59 deletions(-)
create mode 100644 llvm/test/Instrumentation/ThreadSanitizer/dominance-elimination.ll
diff --git a/compiler-rt/test/tsan/CMakeLists.txt b/compiler-rt/test/tsan/CMakeLists.txt
index 163355d68ebc2..e2f297a397625 100644
--- a/compiler-rt/test/tsan/CMakeLists.txt
+++ b/compiler-rt/test/tsan/CMakeLists.txt
@@ -18,6 +18,7 @@ endif()
set(TSAN_DYNAMIC_TEST_DEPS ${TSAN_TEST_DEPS})
set(TSAN_TESTSUITES)
set(TSAN_DYNAMIC_TESTSUITES)
+set(TSAN_ENABLE_DOMINANCE_ANALYSIS "False") # Disable dominance analysis by default
if (NOT DEFINED TSAN_TEST_DEFLAKE_THRESHOLD)
set(TSAN_TEST_DEFLAKE_THRESHOLD "10")
@@ -28,45 +29,77 @@ if(APPLE)
darwin_filter_host_archs(TSAN_SUPPORTED_ARCH TSAN_TEST_ARCH)
endif()
-foreach(arch ${TSAN_TEST_ARCH})
- set(TSAN_TEST_APPLE_PLATFORM "osx")
- set(TSAN_TEST_MIN_DEPLOYMENT_TARGET_FLAG "${DARWIN_osx_MIN_VER_FLAG}")
+# Unified function for generating TSAN test suites by architectures.
+# Arguments:
+# OUT_LIST_VAR - name of output list (for example, TSAN_TESTSUITES or TSAN_DOM_TESTSUITES)
+# SUFFIX_KIND - string added to config suffix after "-${arch}" (for example, "" or "-dominance")
+# CONFIG_KIND - string added to config name after "Config" (for example, "" or "Dominance")
+# ENABLE_DOM - "True"/"False" enable dominance analysis
+function(tsan_generate_arch_suites OUT_LIST_VAR SUFFIX_KIND CONFIG_KIND ENABLE_DOM)
+ foreach(arch ${TSAN_TEST_ARCH})
+ set(TSAN_ENABLE_DOMINANCE_ANALYSIS "${ENABLE_DOM}")
- set(TSAN_TEST_TARGET_ARCH ${arch})
- string(TOLOWER "-${arch}" TSAN_TEST_CONFIG_SUFFIX)
- get_test_cc_for_arch(${arch} TSAN_TEST_TARGET_CC TSAN_TEST_TARGET_CFLAGS)
+ set(TSAN_TEST_APPLE_PLATFORM "osx")
+ set(TSAN_TEST_MIN_DEPLOYMENT_TARGET_FLAG "${DARWIN_osx_MIN_VER_FLAG}")
- string(REPLACE ";" " " LIBDISPATCH_CFLAGS_STRING " ${COMPILER_RT_TEST_LIBDISPATCH_CFLAGS}")
- string(APPEND TSAN_TEST_TARGET_CFLAGS ${LIBDISPATCH_CFLAGS_STRING})
+ set(TSAN_TEST_TARGET_ARCH ${arch})
+ string(TOLOWER "-${arch}${SUFFIX_KIND}" TSAN_TEST_CONFIG_SUFFIX)
+ get_test_cc_for_arch(${arch} TSAN_TEST_TARGET_CC TSAN_TEST_TARGET_CFLAGS)
- if (COMPILER_RT_HAS_MSSE4_2_FLAG)
- string(APPEND TSAN_TEST_TARGET_CFLAGS " -msse4.2 ")
- endif()
+ string(REPLACE ";" " " LIBDISPATCH_CFLAGS_STRING " ${COMPILER_RT_TEST_LIBDISPATCH_CFLAGS}")
+ string(APPEND TSAN_TEST_TARGET_CFLAGS ${LIBDISPATCH_CFLAGS_STRING})
- string(TOUPPER ${arch} ARCH_UPPER_CASE)
- set(CONFIG_NAME ${ARCH_UPPER_CASE}Config)
+ if (COMPILER_RT_HAS_MSSE4_2_FLAG)
+ string(APPEND TSAN_TEST_TARGET_CFLAGS " -msse4.2 ")
+ endif()
- configure_lit_site_cfg(
- ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
- ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
- MAIN_CONFIG
- ${CMAKE_CURRENT_SOURCE_DIR}/lit.cfg.py
- )
- list(APPEND TSAN_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+ string(TOUPPER ${arch} ARCH_UPPER_CASE)
+ set(CONFIG_NAME ${ARCH_UPPER_CASE}Config${CONFIG_KIND})
- if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
- string(TOLOWER "-${arch}-${OS_NAME}-dynamic" TSAN_TEST_CONFIG_SUFFIX)
- set(CONFIG_NAME ${ARCH_UPPER_CASE}${OS_NAME}DynamicConfig)
configure_lit_site_cfg(
- ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
- ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
- MAIN_CONFIG
- ${CMAKE_CURRENT_SOURCE_DIR}/lit.cfg.py
+ ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
+ ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
+ MAIN_CONFIG
+ ${CMAKE_CURRENT_SOURCE_DIR}/lit.cfg.py
+ )
+ list(APPEND ${OUT_LIST_VAR} ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+
+ if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
+ # Dynamic runtime for corresponding variant
+ if("${SUFFIX_KIND}" STREQUAL "")
+ string(TOLOWER "-${arch}-${OS_NAME}-dynamic" TSAN_TEST_CONFIG_SUFFIX)
+ set(CONFIG_NAME ${ARCH_UPPER_CASE}${OS_NAME}DynamicConfig${CONFIG_KIND})
+ list(APPEND TSAN_DYNAMIC_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+ else()
+ string(TOLOWER "-${arch}-${OS_NAME}-dynamic${SUFFIX_KIND}" TSAN_TEST_CONFIG_SUFFIX)
+ set(CONFIG_NAME ${ARCH_UPPER_CASE}${OS_NAME}DynamicConfig${CONFIG_KIND})
+ # Track dynamic dominance-analysis suites separately for a dedicated target.
+ list(APPEND TSAN_DOM_DYNAMIC_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+ endif()
+ configure_lit_site_cfg(
+ ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
+ ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME}/lit.site.cfg.py
+ MAIN_CONFIG
+ ${CMAKE_CURRENT_SOURCE_DIR}/lit.cfg.py
)
- list(APPEND TSAN_DYNAMIC_TESTSUITES
- ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+ list(APPEND ${OUT_LIST_VAR} ${CMAKE_CURRENT_BINARY_DIR}/${CONFIG_NAME})
+ endif()
+ endforeach()
+
+ # Propagate the assembled list to the parent scope
+ set(${OUT_LIST_VAR} "${${OUT_LIST_VAR}}" PARENT_SCOPE)
+ if(DEFINED TSAN_DOM_DYNAMIC_TESTSUITES)
+ set(TSAN_DOM_DYNAMIC_TESTSUITES "${TSAN_DOM_DYNAMIC_TESTSUITES}" PARENT_SCOPE)
endif()
-endforeach()
+endfunction()
+
+# Default configuration
+set(TSAN_TESTSUITES)
+tsan_generate_arch_suites(TSAN_TESTSUITES "" "" "False")
+
+# Enable dominance analysis (check-tsan-dominance-analysis target)
+set(TSAN_DOM_TESTSUITES)
+tsan_generate_arch_suites(TSAN_DOM_TESTSUITES "-dominance" "Dominance" "True")
# iOS and iOS simulator test suites
# These are not added into "check-all", in order to run these tests, use
@@ -124,6 +157,10 @@ list(APPEND TSAN_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/Unit)
if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
list(APPEND TSAN_DYNAMIC_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/Unit/dynamic)
endif()
+list(APPEND TSAN_DOM_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/Unit)
+if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
+ list(APPEND TSAN_DOM_DYNAMIC_TESTSUITES ${CMAKE_CURRENT_BINARY_DIR}/Unit/dynamic)
+endif()
add_lit_testsuite(check-tsan "Running ThreadSanitizer tests"
${TSAN_TESTSUITES}
@@ -136,3 +173,17 @@ if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
EXCLUDE_FROM_CHECK_ALL
DEPENDS ${TSAN_DYNAMIC_TEST_DEPS})
endif()
+
+add_lit_testsuite(check-tsan-dominance-analysis "Running ThreadSanitizer tests (dominance analysis)"
+ ${TSAN_DOM_TESTSUITES}
+ DEPENDS ${TSAN_TEST_DEPS})
+set_target_properties(check-tsan-dominance-analysis PROPERTIES FOLDER "Compiler-RT Tests")
+
+# New target: dynamic + dominance analysis
+if(COMPILER_RT_TSAN_HAS_STATIC_RUNTIME)
+ add_lit_testsuite(check-tsan-dominance-analysis-dynamic "Running ThreadSanitizer tests (dynamic, dominance analysis)"
+ ${TSAN_DOM_DYNAMIC_TESTSUITES}
+ EXCLUDE_FROM_CHECK_ALL
+ DEPENDS ${TSAN_DYNAMIC_TEST_DEPS})
+ set_target_properties(check-tsan-dominance-analysis-dynamic PROPERTIES FOLDER "Compiler-RT Tests")
+endif()
diff --git a/compiler-rt/test/tsan/lit.cfg.py b/compiler-rt/test/tsan/lit.cfg.py
index 8803a7bda9aa5..1dfc8c8557cbb 100644
--- a/compiler-rt/test/tsan/lit.cfg.py
+++ b/compiler-rt/test/tsan/lit.cfg.py
@@ -56,6 +56,14 @@ def get_required_attr(config, attr_name):
+ extra_cflags
+ ["-I%s" % tsan_incdir]
)
+
+# Setup dominance-based elimination if enabled
+tsan_enable_dominance = getattr(config, "tsan_enable_dominance_analysis", "False") == "True"
+if tsan_enable_dominance:
+ config.name += " (dominance-analysis)"
+ dom_flags = [ "-mllvm", "-tsan-use-dominance-analysis" ]
+ clang_tsan_cflags += dom_flags
+
clang_tsan_cxxflags = (
config.cxx_mode_flags + clang_tsan_cflags + ["-std=c++11"] + ["-I%s" % tsan_incdir]
)
diff --git a/compiler-rt/test/tsan/lit.site.cfg.py.in b/compiler-rt/test/tsan/lit.site.cfg.py.in
index c6d453aaee26f..d1d265e0ec53e 100644
--- a/compiler-rt/test/tsan/lit.site.cfg.py.in
+++ b/compiler-rt/test/tsan/lit.site.cfg.py.in
@@ -9,6 +9,9 @@ config.target_cflags = "@TSAN_TEST_TARGET_CFLAGS@"
config.target_arch = "@TSAN_TEST_TARGET_ARCH@"
config.deflake_threshold = "@TSAN_TEST_DEFLAKE_THRESHOLD@"
+# Enable dominance analysis.
+config.tsan_enable_dominance_analysis = "@TSAN_ENABLE_DOMINANCE_ANALYSIS@"
+
# Load common config for all compiler-rt lit tests.
lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")
diff --git a/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
index fd0e9f18b61c9..5061c65a1ac0f 100644
--- a/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
@@ -24,10 +24,14 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringExtras.h"
+#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/CaptureTracking.h"
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/Analysis/PostDominators.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"
+#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
@@ -84,6 +88,15 @@ static cl::opt<bool>
ClOmitNonCaptured("tsan-omit-by-pointer-capturing", cl::init(true),
cl::desc("Omit accesses due to pointer capturing"),
cl::Hidden);
+static cl::opt<bool>
+ ClUseDominanceAnalysis("tsan-use-dominance-analysis", cl::init(false),
+ cl::desc("Eliminate duplicating instructions which "
+ "(post)dominate given instruction"),
+ cl::Hidden);
+static cl::opt<bool> ClPostDomAggressive(
+ "tsan-postdom-aggressive", cl::init(false),
+ cl::desc("Allow post-dominance elimination across loops (unsafe)"),
+ cl::Hidden);
STATISTIC(NumInstrumentedReads, "Number of instrumented reads");
STATISTIC(NumInstrumentedWrites, "Number of instrumented writes");
@@ -96,11 +109,172 @@ STATISTIC(NumOmittedReadsFromConstantGlobals,
"Number of reads from constant globals");
STATISTIC(NumOmittedReadsFromVtable, "Number of vtable reads");
STATISTIC(NumOmittedNonCaptured, "Number of accesses ignored due to capturing");
+STATISTIC(NumOmittedByDominance,
+ "Number of accesses ignored due to dominance");
+STATISTIC(NumOmittedByPostDominance,
+ "Number of accesses ignored due to post-dominance");
const char kTsanModuleCtorName[] = "tsan.module_ctor";
const char kTsanInitName[] = "__tsan_init";
namespace {
+// Internal Instruction wrapper that contains more information about the
+// Instruction from prior analysis.
+struct InstructionInfo {
+ // Instrumentation emitted for this instruction is for a compounded set of
+ // read and write operations in the same basic block.
+ static constexpr unsigned kCompoundRW = (1U << 0);
+
+ explicit InstructionInfo(Instruction *Inst) : Inst(Inst) {}
+
+ bool isWriteOperation() const {
+ return isa<StoreInst>(Inst) || (Flags & kCompoundRW);
+ }
+
+ Instruction *Inst;
+ unsigned Flags = 0;
+};
+
+/// A helper class to encapsulate the logic for eliminating redundant
+/// instrumentation based on dominance analysis.
+///
+/// This class takes a list of all access instructions that are candidates
+/// for instrumentation. It prunes instructions that are (post-)dominated by
+/// another access to the same memory location, provided that the path between
+/// them is "clear" of any dangerous instructions (like function calls or
+/// synchronization primitives).
+class DominanceBasedElimination {
+public:
+ /// \param AllInstr The vector of instructions to analyze. This vector is
+ /// modified in-place.
+ /// \param DT The Dominator Tree for the current function.
+ /// \param PDT The Post-Dominator Tree for the current function.
+ /// \param AA The results of Alias Analysis.
+ DominanceBasedElimination(SmallVectorImpl<InstructionInfo> &AllInstr,
+ DominatorTree &DT, PostDominatorTree &PDT,
+ AAResults &AA, LoopInfo &LI)
+ : AllInstr(AllInstr), DT(DT), PDT(PDT), AA(AA), LI(LI) {
+ // Build per-function basic-block safety cache once
+ if (!AllInstr.empty() && AllInstr.front().Inst) {
+ Function *F = AllInstr.front().Inst->getFunction();
+ BSC.ReachableToEnd.reserve(F->size());
+ BSC.ConeSafeCache.reserve(F->size());
+ buildBlockSafetyCache(*F);
+ }
+ }
+
+ /// Runs the analysis and prunes redundant instructions.
+ /// It sequentially applies elimination based on dominance and post-dominance.
+ void run() {
+ eliminate</*IsPostDom=*/false>(); // Dominance-based elimination
+ eliminate</*IsPostDom=*/true>(); // Post-dominance-based elimination
+ }
+
+private:
+ /// Per-function precomputation cache: instruction indices within BB and
+ /// positions of "dangerous" instructions.
+ struct BlockSafetyCache {
+ DenseMap<const Instruction *, unsigned> IndexInBB;
+
+ DenseMap<const BasicBlock *, SmallVector<unsigned, 4>> DangerIdxInBB;
+ DenseMap<const BasicBlock *, bool> HasDangerInBB;
+
+ DenseMap<const BasicBlock *, SmallVector<unsigned, 4>> DangerIdxInBBPostDom;
+ DenseMap<const BasicBlock *, bool> HasDangerInBBPostDom;
+
+ // Reachability cache: a set of blocks that can reach EndBB.
+ DenseMap<const BasicBlock *, SmallPtrSet<const BasicBlock *, 32>>
+ ReachableToEnd;
+ // Cone safety cache: StartBB -> (EndBB -> pathIsSafe): to avoid custom hash
+ DenseMap<const BasicBlock *,
+ DenseMap<const BasicBlock *, std::pair<bool, bool>>>
+ ConeSafeCache;
+ } BSC;
+
+ // Reusable worklists/visited sets to amortize allocations.
+ SmallVector<const BasicBlock *, 32> Worklist;
+ SmallPtrSet<const BasicBlock *, 32> CanReachSet;
+
+ void buildBlockSafetyCache(Function &F);
+
+ /// Check that suffix (after FromIdx) in BB contains no unsafe instruction.
+ bool suffixSafe(const BasicBlock *BB, unsigned FromIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>>
+ &DangerIdxInBB) const;
+
+ /// Check that prefix (before ToIdx) in BB contains no unsafe instruction.
+ bool prefixSafe(const BasicBlock *BB, unsigned ToIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>>
+ &DangerIdxInBB) const;
+
+ /// Check that (FromIdx, ToExclusiveIdx) interval inside a single BB is safe.
+ bool intervalSafeSameBB(
+ const BasicBlock *BB, unsigned FromIdx, unsigned ToExclusiveIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>>
+ &DangerIdxInBB) const;
+
+ /// Checks if an instruction is "dangerous" from TSan's perspective.
+ /// Dangerous instructions include function calls, atomics, and fences.
+ ///
+ /// \param Inst The instruction to check.
+  /// \return true if the instruction is safe, false if it is dangerous.
+ static bool isInstrSafe(const Instruction *Inst);
+
+ /// For post-dominance, need to check whether the path contains loops,
+ /// irregular exits or unsafe calls.
+ static bool isInstrSafeForPostDom(const Instruction *I);
+
+ /// Find BBs which can reach EndBB
+ SmallPtrSet<const BasicBlock *, 32> buildCanReachEnd(const BasicBlock *EndBB);
+
+  /// Forward traversal from StartBB, restricted to the cone reaching EndBB.
+ /// In post-dom mode additionally rejects paths that go through any loop BB.
+ std::pair<bool, bool> traverseReachableAndCheckSafety(
+ const BasicBlock *StartBB, const BasicBlock *EndBB,
+ const SmallPtrSetImpl<const BasicBlock *> &CanReachEnd);
+
+ /// Checks if the path between two instructions is "clear", i.e., it does not
+ /// contain any dangerous instructions that could alter the thread
+ /// synchronization state.
+ /// \param StartInst The starting instruction (dominates for Dom, is dominated
+ /// for PostDom).
+ /// \param EndInst The ending instruction (is dominated for Dom,
+ /// post-dominates for PostDom).
+ /// \param DTBase DominatorTree (for Dom) or PostDominatorTree (for PostDom).
+ /// \return true if the path is clear.
+ template <bool IsPostDom>
+ bool isPathClear(Instruction *StartInst, Instruction *EndInst,
+ const DominatorTreeBase<BasicBlock, IsPostDom> *DTBase);
+
+ /// A helper function to create a map from Instruction* to its index
+ /// in the AllInstr vector for fast lookups.
+ DenseMap<Instruction *, size_t> createInstrToIndexMap() const;
+
+ /// Attempts to find a dominating instruction that can eliminate the need to
+ /// instrument instruction i
+ /// \param DTBase The dominator (post-dominator) tree being used
+ /// \param InstrToIndexMap Maps instructions to their indices in the AllInstr
+ /// \param ToRemove Vector tracking which instructions can be eliminated
+ /// \returns true if a dominating instruction was found that eliminates i
+ template <bool IsPostDom>
+ bool findAndMarkDominatingInstr(
+ size_t i, const DominatorTreeBase<BasicBlock, IsPostDom> *DTBase,
+ const DenseMap<Instruction *, size_t> &InstrToIndexMap,
+ SmallVectorImpl<bool> &ToRemove);
+
+ /// The core elimination logic. Templated to work with both Dominators
+ /// and Post-Dominators.
+ template <bool IsPostDom> void eliminate();
+
+ /// A reference to the vector of instructions that we modify.
+ SmallVectorImpl<InstructionInfo> &AllInstr;
+
+ /// References to the required analysis results.
+ DominatorTree &DT;
+ PostDominatorTree &PDT;
+ AAResults &AA;
+ LoopInfo &LI;
+};
/// ThreadSanitizer: instrument the code in module to find races.
///
@@ -109,7 +283,9 @@ namespace {
/// ensures the __tsan_init function is in the list of global constructors for
/// the module.
struct ThreadSanitizer {
- ThreadSanitizer() {
+ ThreadSanitizer(const TargetLibraryInfo &TLI, DominatorTree *DT,
+ PostDominatorTree *PDT, AAResults *AA, LoopInfo *LI)
+ : TLI(TLI), DT(DT), PDT(PDT), AA(AA), LI(LI) {
// Check options and warn user.
if (ClInstrumentReadBeforeWrite && ClCompoundReadBeforeWrite) {
errs()
@@ -118,21 +294,9 @@ struct ThreadSanitizer {
}
}
- bool sanitizeFunction(Function &F, const TargetLibraryInfo &TLI);
+ bool sanitizeFunction(Function &F);
private:
- // Internal Instruction wrapper that contains more information about the
- // Instruction from prior analysis.
- struct InstructionInfo {
- // Instrumentation emitted for this instruction is for a compounded set of
- // read and write operations in the same basic block.
- static constexpr unsigned kCompoundRW = (1U << 0);
-
- explicit InstructionInfo(Instruction *Inst) : Inst(Inst) {}
-
- Instruction *Inst;
- unsigned Flags = 0;
- };
void initialize(Module &M, const TargetLibraryInfo &TLI);
bool instrumentLoadOrStore(const InstructionInfo &II, const DataLayout &DL);
@@ -145,6 +309,12 @@ struct ThreadSanitizer {
int getMemoryAccessFuncIndex(Type *OrigTy, Value *Addr, const DataLayout &DL);
void InsertRuntimeIgnores(Function &F);
+ const TargetLibraryInfo &TLI;
+ DominatorTree *DT = nullptr;
+ PostDominatorTree *PDT = nullptr;
+ AAResults *AA = nullptr;
+ LoopInfo *LI = nullptr;
+
Type *IntptrTy;
FunctionCallee TsanFuncEntry;
FunctionCallee TsanFuncExit;
@@ -174,6 +344,413 @@ struct ThreadSanitizer {
FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
};
+//-----------------------------------------------------------------------------
+// DominanceBasedElimination Implementation
+//-----------------------------------------------------------------------------
+
+void DominanceBasedElimination::buildBlockSafetyCache(Function &F) {
+ // Reserve to reduce rehashing for a typical case.
+ BSC.DangerIdxInBB.reserve(F.size());
+ BSC.HasDangerInBB.reserve(F.size());
+ BSC.DangerIdxInBBPostDom.reserve(F.size());
+ BSC.HasDangerInBBPostDom.reserve(F.size());
+
+ for (BasicBlock &BB : F) {
+ SmallVector<unsigned, 4> Danger;
+ SmallVector<unsigned, 4> DangerForPostDom;
+ unsigned Idx = 0;
+ for (Instruction &I : BB) {
+ if (!isInstrSafe(&I))
+ Danger.push_back(Idx);
+ if (!isInstrSafeForPostDom(&I))
+ DangerForPostDom.push_back(Idx);
+ BSC.IndexInBB[&I] = Idx++;
+ }
+ BSC.HasDangerInBB[&BB] = !Danger.empty();
+ // Already in order by linear scan.
+ BSC.DangerIdxInBB[&BB] = std::move(Danger);
+
+ BSC.HasDangerInBBPostDom[&BB] = !DangerForPostDom.empty();
+
+ // Additional check for postdom: if the path contains loops
+ if (LI.getLoopFor(&BB) != nullptr) {
+ BSC.HasDangerInBBPostDom[&BB] = true;
+ DangerForPostDom.push_back(BB.size() - 1);
+ }
+ BSC.DangerIdxInBBPostDom[&BB] = std::move(DangerForPostDom);
+ }
+}
+
+// Check that suffix (after index FromIdx) in the BB contains no dangerous
+// instruction.
+bool DominanceBasedElimination::suffixSafe(
+ const BasicBlock *BB, unsigned FromIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>> &DangerIdxInBB)
+ const {
+ const auto It = DangerIdxInBB.find(BB);
+ if (It == DangerIdxInBB.end() || It->second.empty())
+ return true;
+ const auto &DangerIdx = It->second;
+ // First dangerous index >= FromIdx?
+ const auto LB = std::lower_bound(DangerIdx.begin(), DangerIdx.end(), FromIdx);
+ return LB == DangerIdx.end();
+}
+
+// Check that prefix (before index ToIdx) of the BB contains no dangerous
+// instruction.
+bool DominanceBasedElimination::prefixSafe(
+ const BasicBlock *BB, unsigned ToIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>> &DangerIdxInBB)
+ const {
+ const auto It = DangerIdxInBB.find(BB);
+ if (It == DangerIdxInBB.end() || It->second.empty())
+ return true;
+ const auto &DangerIdx = It->second;
+ // Any dangerous index < ToIdx?
+ const auto LB = std::lower_bound(DangerIdx.begin(), DangerIdx.end(), ToIdx);
+ return LB == DangerIdx.begin();
+}
+
+bool DominanceBasedElimination::intervalSafeSameBB(
+ const BasicBlock *BB, unsigned FromIdx, unsigned ToExclusiveIdx,
+ const DenseMap<const BasicBlock *, SmallVector<unsigned, 4>> &DangerIdxInBB)
+ const {
+ const auto It = DangerIdxInBB.find(BB);
+ if (It == DangerIdxInBB.end() || It->second.empty())
+ return true;
+ const auto &DangerIdx = It->second;
+ const auto LB = std::lower_bound(DangerIdx.begin(), DangerIdx.end(), FromIdx);
+ if (LB == DangerIdx.end())
+ return true;
+ return *LB >= ToExclusiveIdx;
+}
+
+bool isTsanAtomic(const Instruction *I) {
+ // TODO: Ask TTI whether synchronization scope is between threads.
+ auto SSID = getAtomicSyncScopeID(I);
+ if (!SSID)
+ return false;
+ if (isa<LoadInst>(I) || isa<StoreInst>(I))
+ return *SSID != SyncScope::SingleThread;
+ return true;
+}
+
+bool DominanceBasedElimination::isInstrSafe(const Instruction *Inst) {
+ // Atomic operations with inter-thread communication are the primary
+ // source of synchronization and are never safe.
+ if (isTsanAtomic(Inst))
+ return false;
+
+ // Check function calls, if it's known to be sync-free
+ if (const auto *CB = dyn_cast<CallBase>(Inst)) {
+ if (const Function *Callee = CB->getCalledFunction())
+ return Callee->hasNoSync();
+ return false;
+ }
+ // All other instructions are considered safe because they do not,
+ // by themselves, create happens-before relationships
+ return true;
+}
+
+bool DominanceBasedElimination::isInstrSafeForPostDom(const Instruction *I) {
+ // Irregular exits (e.g. return, abort, exceptions) and function calls
+ // (potential infinite loops) make post-dominance elimination unsafe.
+ if (isa<ReturnInst>(I) || isa<ResumeInst>(I))
+ return false;
+
+ if (const auto *CB = dyn_cast<CallBase>(I)) {
+ // Intrinsics are generally safe (no loops/exits hidden inside).
+ if (isa<IntrinsicInst>(CB))
+ return true;
+
+ if (const Function *Callee = CB->getCalledFunction()) {
+ if (Callee->hasFnAttribute(Attribute::WillReturn) &&
+ Callee->hasFnAttribute(Attribute::NoUnwind))
+ return true;
+ }
+ return false;
+ }
+ return true;
+}
+
+SmallPtrSet<const BasicBlock *, 32>
+DominanceBasedElimination::buildCanReachEnd(const BasicBlock *EndBB) {
+ // Check the cache first.
+ if (const auto CachedIt = BSC.ReachableToEnd.find(EndBB);
+ CachedIt != BSC.ReachableToEnd.end())
+ return CachedIt->second;
+
+  // Reuse CanReachSet as the reachability set.
+ Worklist.clear();
+ CanReachSet.clear();
+
+ CanReachSet.insert(EndBB);
+ Worklist.push_back(EndBB);
+ while (!Worklist.empty()) {
+ const BasicBlock *BB = Worklist.back();
+ Worklist.pop_back();
+ for (const BasicBlock *Pred : predecessors(BB)) {
+ if (CanReachSet.insert(Pred).second)
+ Worklist.push_back(Pred);
+ }
+ }
+
+ // Store in the cache and return a copy.
+ BSC.ReachableToEnd[EndBB] = CanReachSet;
+ return BSC.ReachableToEnd[EndBB];
+}
+
+std::pair<bool, bool>
+DominanceBasedElimination::traverseReachableAndCheckSafety(
+ const BasicBlock *StartBB, const BasicBlock *EndBB,
+ const SmallPtrSetImpl<const BasicBlock *> &CanReachEnd) {
+ Worklist.clear();
+ CanReachSet.clear();
+
+ auto enqueueNonVisited = [&](const BasicBlock *BB) {
+ if ((BB != EndBB) && CanReachSet.insert(BB).second)
+ Worklist.push_back(BB);
+ };
+
+ for (const BasicBlock *Succ : successors(StartBB)) {
+ if (CanReachEnd.count(Succ))
+ enqueueNonVisited(Succ);
+ }
+
+ bool DomSafety = true, PostDomSafety = true;
+
+ while (!Worklist.empty()) {
+ const BasicBlock *BB = Worklist.pop_back_val();
+
+ // Post-dom safety: any intermediate BB that is part of a loop
+ // makes elimination unsafe (potential infinite loop).
+ if (!ClPostDomAggressive && PostDomSafety &&
+ BSC.HasDangerInBBPostDom.lookup(BB))
+ PostDomSafety = false;
+
+    // Any dangerous instruction in an intermediate BB makes the path "dirty".
+ if (DomSafety && BSC.HasDangerInBB.lookup(BB))
+ DomSafety = false;
+
+ if (!DomSafety && !PostDomSafety)
+ break;
+
+ for (const BasicBlock *Succ : successors(BB))
+ if (CanReachEnd.contains(Succ))
+ enqueueNonVisited(Succ);
+ }
+ return {DomSafety, PostDomSafety};
+}
+
+template <bool IsPostDom>
+bool DominanceBasedElimination::isPathClear(
+ Instruction *StartInst, Instruction *EndInst,
+ const DominatorTreeBase<BasicBlock, IsPostDom> *DTBase) {
+ LLVM_DEBUG(dbgs() << "Checking path from " << *StartInst << " to " << *EndInst
+ << "\t(" << (IsPostDom ? "PostDom" : "Dom") << ")\n");
+ const BasicBlock *StartBB = StartInst->getParent();
+ const BasicBlock *EndBB = EndInst->getParent();
+
+ // Intra-block indices (used in either case).
+ const unsigned StartIdx = BSC.IndexInBB.lookup(StartInst);
+ const unsigned EndIdx = BSC.IndexInBB.lookup(EndInst);
+
+ // Intra-BB: verify (StartInst; EndInst) is safe.
+ if (StartBB == EndBB) {
+ bool DomSafety =
+ intervalSafeSameBB(StartBB, StartIdx + 1, EndIdx, BSC.DangerIdxInBB);
+ if constexpr (IsPostDom) {
+ return DomSafety && intervalSafeSameBB(StartBB, StartIdx + 1, EndIdx,
+ BSC.DangerIdxInBBPostDom);
+ }
+ return DomSafety;
+ }
+
+ // Quick local checks on edges.
+ bool DomSafety = suffixSafe(StartBB, StartIdx + 1, BSC.DangerIdxInBB) &&
+ prefixSafe(EndBB, EndIdx, BSC.DangerIdxInBB);
+ if (!DomSafety)
+ return false;
+ if constexpr (IsPostDom) {
+ bool PostDomSafety =
+ suffixSafe(StartBB, StartIdx + 1, BSC.DangerIdxInBBPostDom) &&
+ prefixSafe(EndBB, EndIdx, BSC.DangerIdxInBBPostDom);
+ if (!PostDomSafety)
+ return false;
+ }
+
+ // Cone safety cache lookup.
+ if (const auto OuterIt = BSC.ConeSafeCache.find(StartBB);
+ OuterIt != BSC.ConeSafeCache.end()) {
+ if (const auto InnerIt = OuterIt->second.find(EndBB);
+ InnerIt != OuterIt->second.end()) {
+ const auto &[DomSafe, PostDomSafe] = InnerIt->second;
+ if (IsPostDom)
+ return DomSafe && PostDomSafe;
+ return DomSafe;
+ }
+ }
+
+ // Build the set of blocks that can reach EndBB (reverse traversal).
+ const auto CanReachEnd = buildCanReachEnd(EndBB);
+
+  // Forward traversal from StartBB, restricted to the cone reaching EndBB.
+ const auto [DomSafe, PostDomSafe] = traverseReachableAndCheckSafety(StartBB, EndBB, CanReachEnd);
+ BSC.ConeSafeCache[StartBB][EndBB] = {DomSafe, PostDomSafe};
+ LLVM_DEBUG(dbgs() << "isPathClear (DomSafe): " << (DomSafe ? "true" : "false")
+ << "\nisPathClear (PostDomSafe): "
+ << (PostDomSafe ? "true" : "false") << "\n");
+ if constexpr (IsPostDom)
+ return DomSafe && PostDomSafe;
+ return DomSafe;
+}
+
+DenseMap<Instruction *, size_t>
+DominanceBasedElimination::createInstrToIndexMap() const {
+ DenseMap<Instruction *, size_t> InstrToIndexMap;
+ InstrToIndexMap.reserve(AllInstr.size());
+ for (size_t i = 0; i < AllInstr.size(); ++i)
+ InstrToIndexMap[AllInstr[i].Inst] = i;
+ return InstrToIndexMap;
+}
+
+template <bool IsPostDom>
+bool DominanceBasedElimination::findAndMarkDominatingInstr(
+ size_t i, const DominatorTreeBase<BasicBlock, IsPostDom> *DTBase,
+ const DenseMap<Instruction *, size_t> &InstrToIndexMap,
+ SmallVectorImpl<bool> &ToRemove) {
+ LLVM_DEBUG(dbgs() << "\nAnalyzing: " << *(AllInstr[i].Inst) << "\n");
+ const InstructionInfo &CurrII = AllInstr[i];
+ Instruction *CurrInst = CurrII.Inst;
+ const BasicBlock *CurrBB = CurrInst->getParent();
+
+ const DomTreeNode *CurrDTNode = DTBase->getNode(CurrBB);
+ if (!CurrDTNode)
+ return false;
+
+ // Traverse up the dominator tree
+ for (const auto *IDomNode = CurrDTNode; IDomNode;
+ IDomNode = IDomNode->getIDom()) {
+ const BasicBlock *DomBB = IDomNode->getBlock();
+ if (!DomBB) break;
+
+ // Look for a suitable dominating instrumented instruction in DomBB
+ auto StartIt = DomBB->begin();
+ auto EndIt = DomBB->end();
+ if (CurrBB == DomBB) { // We are at the same BB
+ if constexpr (IsPostDom)
+ StartIt = std::next(CurrInst->getIterator());
+ else
+ EndIt = CurrInst->getIterator();
+ }
+
+ for (auto InstIt = StartIt; InstIt != EndIt; ++InstIt) {
+ const Instruction &PotentialDomInst = *InstIt;
+ LLVM_DEBUG(dbgs() << "PotentialDomInst: " << PotentialDomInst << "\n");
+
+ // Check if PotentialDomInst is dominating and instrumented
+ const auto It = InstrToIndexMap.find(&PotentialDomInst);
+ if (It == InstrToIndexMap.end() || ToRemove[It->second])
+ continue; // Not found in AllInstr or already marked for removal
+
+ const size_t DomIndex = It->second;
+ InstructionInfo &DomII = AllInstr[DomIndex];
+ Instruction *DomInst = DomII.Inst;
+
+ auto IsVolatile = [](const Instruction *I) {
+ if (const auto *L = dyn_cast<LoadInst>(I)) return L->isVolatile();
+ if (const auto *S = dyn_cast<StoreInst>(I)) return S->isVolatile();
+ return false;
+ };
+ if (ClDistinguishVolatile && IsVolatile(DomInst))
+ continue;
+
+ if (AA.isMustAlias(MemoryLocation::get(CurrInst),
+ MemoryLocation::get(DomInst))) {
+ const bool CurrIsWrite = CurrII.isWriteOperation();
+ const bool DomIsWrite = DomII.isWriteOperation();
+
+ // Check compatibility logic (DomInst covers CurrInst):
+ // 1. If DomInst is a 'write', it covers both read and write.
+ // 2. If DomInst is a 'read', it only covers a read.
+ if (DomIsWrite || !CurrIsWrite) {
+ // Check the path to/from CurrInst from/to DomInst
+ Instruction *PathStart = IsPostDom ? CurrInst : DomInst;
+ Instruction *PathEnd = IsPostDom ? DomInst : CurrInst;
+
+ if (isPathClear<IsPostDom>(PathStart, PathEnd, DTBase)) {
+ LLVM_DEBUG(dbgs()
+ << "TSAN: Omitting instrumentation for: " << *CurrInst
+ << " ((post-)dominated and covered by: " << *DomInst
+ << ")\n");
+ ToRemove[i] = true;
+ // Found a (post)dominator, move to the next Inst
+ return true;
+ }
+ }
+ }
+ }
+ }
+ return false;
+}
+
+/// Eliminates redundant instrumentation based on (pre/post)dominance analysis.
+/// \tparam IsPostDom If true, uses post-dominance; if false, uses dominance.
+template <bool IsPostDom> void DominanceBasedElimination::eliminate() {
+ LLVM_DEBUG(dbgs() << "Starting " << (IsPostDom ? "post-" : "")
+ << "dominance-based analysis\n");
+ if (AllInstr.empty())
+ return;
+
+ DominatorTreeBase<BasicBlock, IsPostDom> *DTBase;
+ if constexpr (IsPostDom)
+ DTBase = &PDT;
+ else
+ DTBase = &DT;
+
+ SmallVector<bool, 16> ToRemove(AllInstr.size(), false);
+ unsigned RemovedCount = 0;
+
+ // Create a map from Instruction* to its index in the AllInstr vector.
+ DenseMap<Instruction *, size_t> InstrToIndexMap = createInstrToIndexMap();
+
+ for (size_t i = 0; i < AllInstr.size(); ++i) {
+ if (ToRemove[i])
+ continue;
+
+ if (findAndMarkDominatingInstr<IsPostDom>(i, DTBase, InstrToIndexMap,
+ ToRemove))
+ RemovedCount++;
+ }
+
+ LLVM_DEBUG(dbgs() << "\nFinal list of instructions and their status\n";
+ for (size_t i = 0; i < AllInstr.size(); ++i) dbgs()
+ << "[" << (ToRemove[i] ? "REMOVED" : "KEPT") << "]\t"
+ << *AllInstr[i].Inst << "\n");
+
+ if (RemovedCount > 0) {
+ LLVM_DEBUG(dbgs() << "\n=== Updating final instruction list ===\n"
+ << "Original size: " << AllInstr.size() << "\n"
+ << "Instructions to remove: " << RemovedCount << "\n"
+ << "Remaining instructions: "
+ << (AllInstr.size() - RemovedCount) << "\n");
+ auto ToRemoveIter = ToRemove.begin();
+ erase_if(AllInstr, [&](const InstructionInfo &) {
+ return *ToRemoveIter++;
+ });
+
+ if constexpr (IsPostDom)
+ NumOmittedByPostDominance += RemovedCount;
+ else
+ NumOmittedByDominance += RemovedCount;
+ }
+ LLVM_DEBUG(dbgs() << "Dominance analysis complete\n");
+}
+
+//-----------------------------------------------------------------------------
+// ThreadSanitizer Implementation
+//-----------------------------------------------------------------------------
+
void insertModuleCtor(Module &M) {
getOrCreateSanitizerCtorAndInitFunctions(
M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{},
@@ -186,8 +763,21 @@ void insertModuleCtor(Module &M) {
PreservedAnalyses ThreadSanitizerPass::run(Function &F,
FunctionAnalysisManager &FAM) {
- ThreadSanitizer TSan;
- if (TSan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F)))
+ DominatorTree *DT = nullptr;
+ PostDominatorTree *PDT = nullptr;
+ AAResults *AA = nullptr;
+ LoopInfo *LI = nullptr;
+
+ if (ClUseDominanceAnalysis) {
+ DT = &FAM.getResult<DominatorTreeAnalysis>(F);
+ PDT = &FAM.getResult<PostDominatorTreeAnalysis>(F);
+ AA = &FAM.getResult<AAManager>(F);
+ LI = &FAM.getResult<LoopAnalysis>(F);
+ }
+
+ ThreadSanitizer TSan(FAM.getResult<TargetLibraryAnalysis>(F), DT, PDT, AA,
+ LI);
+ if (TSan.sanitizeFunction(F))
return PreservedAnalyses::none();
return PreservedAnalyses::all();
}
@@ -474,16 +1064,6 @@ void ThreadSanitizer::chooseInstructionsToInstrument(
Local.clear();
}
-static bool isTsanAtomic(const Instruction *I) {
- // TODO: Ask TTI whether synchronization scope is between threads.
- auto SSID = getAtomicSyncScopeID(I);
- if (!SSID)
- return false;
- if (isa<LoadInst>(I) || isa<StoreInst>(I))
- return *SSID != SyncScope::SingleThread;
- return true;
-}
-
void ThreadSanitizer::InsertRuntimeIgnores(Function &F) {
InstrumentationIRBuilder IRB(&F.getEntryBlock(),
F.getEntryBlock().getFirstNonPHIIt());
@@ -495,8 +1075,7 @@ void ThreadSanitizer::InsertRuntimeIgnores(Function &F) {
}
}
-bool ThreadSanitizer::sanitizeFunction(Function &F,
- const TargetLibraryInfo &TLI) {
+bool ThreadSanitizer::sanitizeFunction(Function &F) {
// This is required to prevent instrumenting call to __tsan_init from within
// the module constructor.
if (F.getName() == kTsanModuleCtorName)
@@ -545,6 +1124,11 @@ bool ThreadSanitizer::sanitizeFunction(Function &F,
chooseInstructionsToInstrument(LocalLoadsAndStores, AllLoadsAndStores, DL);
}
+ if (ClUseDominanceAnalysis && DT && PDT && AA && LI) {
+ DominanceBasedElimination DBE(AllLoadsAndStores, *DT, *PDT, *AA, *LI);
+ DBE.run();
+ }
+
// We have collected all loads and stores.
// FIXME: many of these accesses do not need to be checked for races
// (e.g. variables that do not escape, etc).
@@ -826,4 +1410,4 @@ int ThreadSanitizer::getMemoryAccessFuncIndex(Type *OrigTy, Value *Addr,
size_t Idx = llvm::countr_zero(TypeSize / 8);
assert(Idx < kNumberOfAccessSizes);
return Idx;
-}
+}
\ No newline at end of file
diff --git a/llvm/test/Instrumentation/ThreadSanitizer/dominance-elimination.ll b/llvm/test/Instrumentation/ThreadSanitizer/dominance-elimination.ll
new file mode 100644
index 0000000000000..acc583c078111
--- /dev/null
+++ b/llvm/test/Instrumentation/ThreadSanitizer/dominance-elimination.ll
@@ -0,0 +1,540 @@
+; RUN: opt -passes=tsan -tsan-use-dominance-analysis < %s -S | FileCheck %s
+
+; This file contains tests for the TSan dominance-based optimization.
+; We check that redundant instrumentation is removed when one access
+; dominates/post-dominates another, and is NOT removed when the path between
+; them is "dirty".
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+
+; --- Global variables for testing (minimized to two) ---
+ at g1 = common global i32 0, align 4
+ at g2 = common global i32 0, align 4
+
+; --- External Function Declarations for Tests ---
+declare void @some_external_call()
+declare void @safe_func() #0
+declare void @external_check()
+
+; =============================================================================
+; INTRA-BLOCK DOMINANCE TESTS
+; =============================================================================
+
+define void @test_intra_block_write_write() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_intra_block_write_write
+; CHECK: call void @__tsan_write4(ptr @g1)
+; The second write is dominated and should NOT be instrumented.
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+define void @test_intra_block_write_read() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %val = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_intra_block_write_read
+; CHECK: call void @__tsan_write4(ptr @g1)
+; The read is dominated and should NOT be instrumented.
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+define void @test_intra_block_read_read() nounwind uwtable sanitize_thread {
+entry:
+ %val1 = load i32, ptr @g1, align 4
+ %val2 = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_intra_block_read_read
+; CHECK: call void @__tsan_read4(ptr @g1)
+; The second read is dominated and should NOT be instrumented.
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; PATH CLEARNESS TESTS
+; =============================================================================
+
+define void @test_path_not_clear_call() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ call void @some_external_call()
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_path_not_clear_call
+; CHECK: call void @__tsan_write4(ptr @g1)
+; An unsafe call makes the path dirty. Optimization must NOT trigger.
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+define void @test_path_clear_safe_call() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ call void @safe_func()
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_path_clear_safe_call
+; CHECK: call void @__tsan_write4(ptr @g1)
+; A nosync call should not block the optimization.
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; INTER-BLOCK DOMINANCE TESTS
+; =============================================================================
+
+define void @test_inter_block_dom(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br i1 %cond, label %if.then, label %if.else
+if.then:
+ store i32 2, ptr @g1, align 4
+ br label %if.end
+if.else:
+ store i32 3, ptr @g1, align 4
+ br label %if.end
+if.end:
+ ret void
+}
+; CHECK-LABEL: define void @test_inter_block_dom
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: if.then:
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: if.else:
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; POST-DOMINANCE TESTS
+; =============================================================================
+
+define void @test_post_dom(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ br i1 %cond, label %if.then, label %if.else
+if.then:
+ store i32 2, ptr @g1, align 4
+ br label %if.end
+if.else:
+ store i32 3, ptr @g1, align 4
+ br label %if.end
+if.end:
+ store i32 4, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_post_dom
+; CHECK: if.then:
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: if.else:
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: if.end:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; ALIAS ANALYSIS TESTS
+; =============================================================================
+
+; Simple alias analysis: no alias.
+define void @test_no_alias() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ store i32 2, ptr @g2, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_no_alias
+; CHECK: call void @__tsan_write4(ptr @g1)
+; Different addresses. The optimization must NOT trigger.
+; CHECK: call void @__tsan_write4(ptr @g2)
+; CHECK: ret void
+
+; MustAlias via zero-index GEP (should eliminate)
+define void @alias_mustalias_gep0() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %p = getelementptr i32, ptr @g1, i64 0
+ %v = load i32, ptr %p, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_mustalias_gep0
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; Different offsets within the same base object (should NOT eliminate)
+ at arr = common global [5 x i32] zeroinitializer, align 4
+define void @alias_different_offsets() nounwind uwtable sanitize_thread {
+entry:
+ %p0 = getelementptr [5 x i32], ptr @arr, i64 0, i64 0
+ %p1 = getelementptr [5 x i32], ptr @arr, i64 0, i64 1
+ store i32 1, ptr %p0, align 4
+ store i32 2, ptr %p1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_different_offsets
+; CHECK: call void @__tsan_write4(
+; CHECK: call void @__tsan_write4(
+; CHECK: ret void
+
+; Equal offsets within the same base object (should eliminate)
+define void @alias_same_offsets() nounwind uwtable sanitize_thread {
+entry:
+ %p0 = getelementptr [5 x i32], ptr @arr, i64 0, i64 1
+ %p1 = getelementptr [5 x i32], ptr @arr, i64 0, i64 1
+ store i32 1, ptr %p0, align 4
+ store i32 2, ptr %p1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_same_offsets
+; CHECK: call void @__tsan_write4(
+; CHECK-NOT: call void @__tsan_write4(
+; CHECK: ret void
+
+
+; MayAlias via phi of two globals (should NOT eliminate)
+define void @alias_mayalias_phi(i1 %c) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br i1 %c, label %A, label %B
+A:
+ br label %join
+B:
+ br label %join
+join:
+ %p = phi ptr [ @g1, %A ], [ @g2, %B ]
+ %v = load i32, ptr %p, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_mayalias_phi
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: join:
+; CHECK: call void @__tsan_read4(
+; CHECK: ret void
+
+; Pointer round-trip via ptrtoint/inttoptr (typically breaks MustAlias)
+; (should NOT eliminate)
+define void @alias_ptr_roundtrip() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %i = ptrtoint ptr @g1 to i64
+ %p2 = inttoptr i64 %i to ptr
+ %v = load i32, ptr %p2, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_ptr_roundtrip
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: call void @__tsan_read4(
+; CHECK: ret void
+
+; Bitcast-based MustAlias (i32* <-> i8*) (should eliminate)
+define void @alias_bitcast_i8_i32() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %p8 = bitcast ptr @g1 to ptr
+ %v = load i32, ptr %p8, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_bitcast_i8_i32
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; GEP with folded zero offset is MustAlias: (%n - %n) -> 0
+define void @alias_gep_folded_zero(i64 %n) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %t = sub i64 %n, %n
+ %p = getelementptr i32, ptr @g1, i64 %t
+ %v = load i32, ptr %p, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_gep_folded_zero
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+define void @alias_select_same_ptr(i1 %c) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ %p = select i1 %c, ptr @g1, ptr @g1
+ %v = load i32, ptr %p, align 4
+ ret void
+}
+; CHECK-LABEL: define void @alias_select_same_ptr
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; BRANCHING WITH MULTIPLE PATHS (one path dirty)
+; =============================================================================
+
+; Case A: inter-BB with a diamond where one branch is dirty.
+; Path entry -> then (unsafe) -> merge, and entry -> else (safe) -> merge.
+define void @multi_path_inter_dirty(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br i1 %cond, label %then, label %else
+
+then:
+ call void @some_external_call()
+ br label %merge
+
+else:
+ call void @safe_func()
+ br label %merge
+
+merge:
+ %v = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @multi_path_inter_dirty
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; Dirty along one path => must instrument at merge.
+; CHECK: merge:
+; CHECK: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; Case B: inter-BB where both branches are safe (no dangerous instr). Should eliminate.
+define void @multi_path_inter_clean(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br i1 %cond, label %then, label %else
+
+then:
+ call void @safe_func()
+ br label %merge
+
+else:
+ call void @safe_func()
+ br label %merge
+
+merge:
+ %v = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @multi_path_inter_clean
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; Both paths clean => dominated read at merge should be removed.
+; CHECK: merge:
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; MIXED: intra-BB safe suffix vs. inter-BB dirty path
+; =============================================================================
+define void @mixed_intra_inter(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ ; intra-BB suffix between store and next store is safe (no calls)
+ store i32 1, ptr @g1, align 4
+ store i32 2, ptr @g1, align 4
+ br i1 %cond, label %dirty, label %clean
+
+dirty:
+ ; dangerous call on one path
+ call void @some_external_call()
+ br label %merge
+
+clean:
+ ; safe on other path
+ call void @safe_func()
+ br label %merge
+
+merge:
+ ; must keep because one incoming path is dirty
+ store i32 3, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @mixed_intra_inter
+; First store instruments.
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; Second store in same BB is dominated by the first and safe => removed.
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; Final store must remain due to dirty path.
+; CHECK: merge:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; POST-DOM with dirty suffix at start BB blocks elimination (renamed BBs)
+; =============================================================================
+define void @postdom_dirty_start_suffix(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ ; Initial write
+ store i32 1, ptr @g1, align 4
+ ; Dirty suffix in the start block blocks elimination
+ call void @some_external_call()
+ br i1 %cond, label %path_then, label %path_else
+
+path_then:
+ br label %merge
+
+path_else:
+ br label %merge
+
+merge:
+ ; Despite post-dominance, path is not clear due to dirty suffix in entry
+ %v = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @postdom_dirty_start_suffix
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: call void @some_external_call()
+; CHECK: merge:
+; CHECK: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; DIRTY PREFIX IN END BB blocks elimination (prefixSafe)
+; =============================================================================
+define void @dirty_prefix_in_end_bb() nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br label %end
+
+end:
+ ; Dirty prefix in the end block before the target access
+ call void @some_external_call()
+ %v = load i32, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @dirty_prefix_in_end_bb
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: end:
+; CHECK: call void @some_external_call()
+; CHECK: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; IRRELEVANT DIRTY PATH NOT REACHING EndBB should not block elimination
+; =============================================================================
+define void @dirty_unrelated_cone(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ store i32 1, ptr @g1, align 4
+ br i1 %cond, label %to_end, label %to_dead
+
+to_end:
+ br label %end
+
+to_dead:
+ ; Dirty path that does NOT reach %end at all
+ call void @some_external_call()
+ br label %dead
+
+dead:
+ ret void
+
+end:
+ %v = load i32, ptr @g1, align 4
+ ret void
+}
+; The dirty path is outside the cone to %end, so read can be eliminated.
+; CHECK-LABEL: define void @dirty_unrelated_cone
+; CHECK: entry:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: end:
+; CHECK-NOT: call void @__tsan_read4(ptr @g1)
+; CHECK: ret void
+
+; =============================================================================
+; POST-DOMINANCE WITH LOOP
+; =============================================================================
+define void @postdom_loop() nounwind uwtable sanitize_thread {
+entry:
+ br label %while.cond
+
+while.cond: ; preds = %while.body, %entry
+ %call = call i32 (...) @external_check()
+ %tobool = icmp ne i32 %call, 0
+ br i1 %tobool, label %while.body, label %while.end
+
+while.body: ; preds = %while.cond
+ store i32 1, ptr @g1, align 4
+ br label %while.cond
+
+while.end: ; preds = %while.cond
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; It's a potentially infinite loop,
+; so the store in while.end should not be eliminated.
+; CHECK-LABEL: define void @postdom_loop
+; CHECK: while.body:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: while.end:
+; CHECK: call void @__tsan_write4(ptr @g1)
+
+; =============================================================================
+; POST-DOMINANCE BLOCKED BY POTENTIAL INFINITE LOOP (CallBase check)
+; =============================================================================
+
+declare void @dom_safe_but_postdom_unsafe() #0
+
+define void @test_post_dom_blocked_by_readnone_call(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ br i1 %cond, label %if.then, label %if.else
+if.then:
+ ; dominated by neither entry nor if.end (in terms of domination elimination flow)
+ ; but post-dominated by if.end
+ store i32 1, ptr @g1, align 4
+ call void @dom_safe_but_postdom_unsafe()
+ br label %if.end
+if.else:
+ br label %if.end
+if.end:
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_post_dom_blocked_by_readnone_call
+; CHECK: if.then:
+; The call is nosync (safe for synchronization) but neither an intrinsic nor willreturn+nounwind.
+; So the first write must NOT be eliminated (post-dominance is blocked).
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: call void @dom_safe_but_postdom_unsafe()
+; CHECK: if.end:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+declare void @postdom_safe_func() #1
+
+define void @test_post_dom_allowed_by_postdome_safe_func(i1 %cond) nounwind uwtable sanitize_thread {
+entry:
+ br i1 %cond, label %if.then, label %if.else
+if.then:
+ store i32 1, ptr @g1, align 4
+ call void @postdom_safe_func()
+ br label %if.end
+if.else:
+ br label %if.end
+if.end:
+ store i32 2, ptr @g1, align 4
+ ret void
+}
+; CHECK-LABEL: define void @test_post_dom_allowed_by_postdome_safe_func
+; CHECK: if.then:
+; CHECK-NOT: call void @__tsan_write4(ptr @g1)
+; CHECK: call void @postdom_safe_func()
+; CHECK: if.end:
+; CHECK: call void @__tsan_write4(ptr @g1)
+; CHECK: ret void
+
+; Attributes for the "safe" function
+attributes #0 = { nosync }
+attributes #1 = { nosync willreturn nounwind }