[compiler-rt] [ASan] Prevent ASan/LSan deadlock by preloading modules before error reporting (PR #131756)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 2 02:03:39 PDT 2025
Camsyn wrote:
### Proposed Fixes for ASan-LSan Deadlock Issue
After analyzing the ASan/LSan code and commit history, I confirm that concurrent error reporting from different threads should not be considered (it has already been handled by `ScopedErrorReportLock`). Below are the potential solutions (to my knowledge) :
---
#### **1. Maintain Current Patch (Incomplete)**
- **Approach**: Keep the existing patch to preload modules in ASan, avoiding holding Lock A (libdl) under Lock B (ASan thread lock).
- **Risk**: If other threads modify the module list via `dlopen`/`dlclose` during this window, the patch may regress to the original deadlock scenario. While rare (due to the low probability of overlapping LSan checks and ASan reports + module updates), this is not a complete fix.
- **Supplemental Idea**: Add synchronization to make `dlopen`/`dlclose` wait until error reports complete before updating module lists.
- **Rationale**: Error reports depend on module list, and the "buggy" module triggering the report should logically exist stably before the report. Should the module list be frozen during error reporting?
---
#### **2. Revert LSan Lock Order (Incomplete)**
- **Approach**: Move LSan's acquisition of Lock B (thread lock) *before* `dl_iterate_phdr` (Lock A). Historically, LSan acquired Lock B before `dl_iterate_phdr`, but commit f1b6bd4 moved it inside `dl_iterate_phdr` to handle potential `malloc` calls during iteration.
- **Key Question**: Since ASan does not hold the allocator lock during error reporting, could reverting Lock B acquisition to pre-`dl_iterate_phdr` resolve the deadlock?
- **Risk**: Thread creation within `dl_iterate_phdr` (now holding Lock B) might reintroduce deadlocks similar to those addressed in `f1b6bd4`.
---
#### **3. Align ASan with LSan (Complete)**
- **Approach**: Make ASan acquire Lock A (via `dl_iterate_phdr`) before error reporting, mirroring LSan's lock order.
- **Challenge**: Requires modifying numerous `ReportXXXX` functions to explicitly handle Lock A, which may introduce code bloat and maintenance overhead.
- **Tradeoff**: Ensures consistent lock hierarchy (A→B) but complicates ASan's reporting logic.
---
#### **4. Leverage `ScopedErrorReportLock` for LSan (Complete)**
- **Approach**: Have LSan perform `dl_iterate_phdr` while holding `ScopedErrorReportLock`, preventing concurrent ASan error reports.
- **Consideration**: While LSan leak checks do not inherently require error reporting locks, this would enforce mutual exclusion between leak scans and ASan reports.
- **Open Question**: Is it appropriate to tie leak-check synchronization to error reporting locks?
---
### Key Tradeoffs
| Solution | Completeness | Risk | Code Impact |
| ------------------- | ------------ | -------------------------------- | ----------- |
| 1 | Partial | Regression risk | Low |
| 1 (with supplement) | Full | Code changes in `sanitizer_common` | Moderate |
| 2 | Partial | New deadlock scenarios | Moderate |
| 3 | Full | High maintenance cost | High |
| 4 | Full | Logical coupling concerns | Low |
---
Which fix should we choose or are there any other suggestions?
https://github.com/llvm/llvm-project/pull/131756
More information about the llvm-commits
mailing list