[compiler-rt] [ASan] Prevent ASan/LSan deadlock by preloading modules before error reporting (PR #131756)
via llvm-commits
llvm-commits at lists.llvm.org
Sat Mar 22 01:28:41 PDT 2025
================
@@ -126,6 +126,27 @@ class ScopedInErrorReport {
public:
explicit ScopedInErrorReport(bool fatal = false)
: halt_on_error_(fatal || flags()->halt_on_error) {
+ /*
+ * Deadlock Prevention Between ASan and LSan
+ *
+ * Background:
+ * - The `dl_iterate_phdr` function requires holding libdl's internal lock (Lock A).
+ * - LSan acquires the ASan thread registry lock (Lock B) *after* calling `dl_iterate_phdr`.
+ *
+ * Problem Scenario:
+ * When ASan attempts to call `dl_iterate_phdr` while holding Lock B (e.g., during
+ * error reporting via `ErrorDescription::Print`), a circular lock dependency may occur:
+ * 1. Thread 1: Holds Lock B → Requests Lock A (via dl_iterate_phdr)
+ * 2. Thread 2: Holds Lock A → Requests Lock B (via LSan operations)
+ *
+ * Solution:
+ * Proactively load all required modules before acquiring Lock B. This ensures:
+ * 1. Any `dl_iterate_phdr` calls during module loading complete before locking
+ * 2. Subsequent error reporting avoids nested lock acquisition patterns
+ * 3. Eliminates the lock order inversion risk between libdl and ASan's thread registry
+ */
+ Symbolizer::GetOrInit()->GetRefreshedListOfModules();
----------------
Camsyn wrote:
This is tricky: if a `dlopen`/`dlclose` occurs later in another thread, the module list is invalidated again and the code degenerates to its unpatched behavior, i.e., loading the modules (acquiring lock A) while holding lock B. We must prevent such degeneration.
I can think of three fixes.
Option 1:
During error reporting, block other threads from updating the `Symbolizer` (which is acceptable, since every module the report needs must already be loaded before error reporting starts). This can be achieved by holding the `StaticSpinMutex Symbolizer::init_mu_` to prevent other threads from invoking `Symbolizer::GetOrInit()->InvalidateModuleList()` in the `dlopen`/`dlclose` interceptor.
This approach has two problems: 1) `Symbolizer` does not expose `init_mu_`; it is currently used only internally by `GetOrInit()`. 2) Holding it for an extended period during error reporting is not an appropriate use of a spin lock.
Option 2:
Similar to how LSan resolves its deadlock issue, hold `libdl`’s lock for the entire duration of error reporting. This would prevent other threads from performing `dlopen` and `dlclose`. Since the `libdl` lock is reentrant, loading modules during the report wouldn’t be an issue. In effect, this adjusts the lock acquisition order in ASan error reporting to be consistent with LSan—acquiring lock B while holding lock A.
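A minimal sketch of the idea behind Option 2, assuming a Linux/glibc target where `dl_iterate_phdr` (declared in `<link.h>`) invokes its callback while the loader's lock is held. `RunWithLoaderLockHeld`, `RunReportOnce`, and `ReportContext` are hypothetical names for illustration, not sanitizer APIs, and real runtime code would avoid `std::function`:

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (Linux)

#include <cassert>
#include <cstddef>
#include <functional>

namespace {

// Hypothetical carrier for the work to run under the loader's lock.
struct ReportContext {
  const std::function<void()> *report;
  bool ran;
};

// dl_iterate_phdr calls this once per loaded object. The whole report runs
// inside the first invocation; returning non-zero stops the iteration.
int RunReportOnce(dl_phdr_info *, std::size_t, void *data) {
  auto *ctx = static_cast<ReportContext *>(data);
  assert(ctx->report != nullptr);
  (*ctx->report)();
  ctx->ran = true;
  return 1;
}

}  // namespace

// Runs `report` from inside a dl_iterate_phdr callback, so that (under the
// glibc assumption above) lock A is taken before the report takes lock B.
bool RunWithLoaderLockHeld(const std::function<void()> &report) {
  ReportContext ctx{&report, false};
  dl_iterate_phdr(RunReportOnce, &ctx);
  return ctx.ran;
}
```

With this shape, the body passed to `RunWithLoaderLockHeld` would be the place where the report acquires lock B, which matches the A-then-B order described for LSan.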
Option 3:
Modify the logic in `Symbolizer::FindModuleForAddress` so that it first searches the currently cached module list directly, and only checks whether `RefreshModules` is needed when the address is not found there. This tackles the issue because the addresses required for error reporting must belong to modules that were loaded before the report.
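The lookup order proposed in Option 3 can be sketched with a toy cache. This is not the real `Symbolizer` code: `ToyModule`, `ToyModuleCache`, and the `loader` callback (standing in for the refresh that would take lock A) are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a loaded module: a name and one [begin, end) range.
struct ToyModule {
  std::string name;
  std::uintptr_t begin;
  std::uintptr_t end;
  bool Contains(std::uintptr_t addr) const { return addr >= begin && addr < end; }
};

// Hypothetical module cache; `loader` models the refresh that needs lock A.
class ToyModuleCache {
 public:
  explicit ToyModuleCache(std::function<std::vector<ToyModule>()> loader)
      : loader_(std::move(loader)) {
    assert(loader_ && "loader must be callable");
  }

  // Models the dlopen/dlclose interceptor marking the cached list as stale.
  void Invalidate() { fresh_ = false; }

  const ToyModule *FindModuleForAddress(std::uintptr_t addr) {
    // Step 1: search the cached list first, even if it is marked stale.
    if (const ToyModule *m = Search(addr)) return m;
    // Step 2: refresh only on a miss, and only if the list is stale.
    if (!fresh_) {
      modules_ = loader_();
      fresh_ = true;
      ++refresh_count_;
      return Search(addr);
    }
    return nullptr;
  }

  int refresh_count() const { return refresh_count_; }

 private:
  const ToyModule *Search(std::uintptr_t addr) const {
    for (const ToyModule &m : modules_)
      if (m.Contains(addr)) return &m;
    return nullptr;
  }

  std::function<std::vector<ToyModule>()> loader_;
  std::vector<ToyModule> modules_;
  bool fresh_ = false;
  int refresh_count_ = 0;
};
```

In this sketch an invalidation from another thread no longer forces a refresh for addresses that are already covered by the cached list; only an address that misses the cache pays for one.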
---
Note that although reversing the lock order in LSan could also resolve the deadlock issue, it is not feasible because it would introduce new deadlock problems: other threads might perform thread operations or memory allocations (which use lock B) inside `dl_iterate_phdr` (which holds lock A).
https://github.com/llvm/llvm-project/pull/131756