<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/83561>83561</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
TSan: async signals are never being delivered when the target thread is blocked waiting for a mutex lock
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
canova
</td>
</tr>
</table>
<pre>
Here's my test case: [Compiler explorer](https://godbolt.org/z/8jszn545v)
It looks like LLVM delays dispatching signals until the target thread reaches a blocking function. In Firefox, we have a sampling profiler that sends a `SIGPROF` signal at every sampling interval, and we use semaphores to wait for the `SIGPROF` handler to finish its work. The test case above reproduces our situation.
There are 2 threads, let's call them "thread 1" and "thread 2". "Thread 2" sends a `SIGPROF` signal to "thread 1" to get some information. At the end of the `SIGPROF` handler, we post a semaphore with `sem_post` to signal that the work is done, and "thread 2" waits on that semaphore.
It's visualized like this:
| | Thread 1 | Thread 2 |
|---|-------------------------------------------|--------------------------------------|
| 1 | | `lock_guard profiler_mutex` |
| 2 | `pthread_mutex_lock` for `profiler_mutex` | `pthread_kill(*thread_1, SIGPROF)` |
| 3 | `pthread_mutex_lock` for `profiler_mutex` | `sem_wait(&message)` from "thread 1" |
| 4 | (deadlock) | (deadlock) |
Without TSan, thread 1 gets the `SIGPROF` signal, captures a sample, and then posts the semaphore. Thread 2 then unlocks the mutex and thread 1 continues by acquiring it, so execution proceeds without any deadlock. With TSan builds, however, this deadlock happens frequently.
Also, as you can see in the [Compiler explorer](https://godbolt.org/z/8jszn545v) example, it works without a problem when TSan is not enabled, but hangs and times out when it's enabled.
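For readers who cannot open the link, the scenario in the table can be sketched roughly as follows. This is a minimal illustration and not the linked test case: the function name `sample_blocked_thread`, the 100 ms sleep, and the 5 second timeout are assumptions added here so that the sketch returns `false` instead of hanging when the signal is not delivered.

```cpp
#include <cerrno>
#include <ctime>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <unistd.h>

static pthread_mutex_t profiler_mutex = PTHREAD_MUTEX_INITIALIZER;
static sem_t message;

// SIGPROF handler running on thread 1: tell thread 2 the "sample" is done.
// sem_post is async-signal-safe.
static void sigprof_handler(int) { sem_post(&message); }

// Thread 1: blocks in pthread_mutex_lock while thread 2 holds the mutex.
static void* thread_1_main(void*) {
  pthread_mutex_lock(&profiler_mutex);
  pthread_mutex_unlock(&profiler_mutex);
  return nullptr;
}

// Hypothetical helper playing the role of thread 2. Returns true if the
// handler ran on thread 1 before the (assumed) timeout expired.
bool sample_blocked_thread(int timeout_seconds = 5) {
  sem_init(&message, 0, 0);

  struct sigaction sa = {};
  sa.sa_handler = sigprof_handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  sigaction(SIGPROF, &sa, nullptr);

  pthread_mutex_lock(&profiler_mutex);            // step 1 in the table
  pthread_t thread_1;
  pthread_create(&thread_1, nullptr, thread_1_main, nullptr);
  usleep(100 * 1000);                             // let thread 1 block on the mutex
  pthread_kill(thread_1, SIGPROF);                // step 2

  // Step 3: wait for the handler, but with a deadline instead of forever.
  timespec deadline;
  clock_gettime(CLOCK_REALTIME, &deadline);
  deadline.tv_sec += timeout_seconds;
  int rc;
  while ((rc = sem_timedwait(&message, &deadline)) == -1 && errno == EINTR) {
  }

  pthread_mutex_unlock(&profiler_mutex);
  pthread_join(thread_1, nullptr);
  sem_destroy(&message);
  return rc == 0;
}
```

Calling `sample_blocked_thread()` from `main` is expected to return `true` in a regular build (compile with `-pthread`). Based on the behavior described above, a `-fsanitize=thread` build would be expected to hit the timeout instead, because the handler does not run while thread 1 is blocked on the mutex.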
While investigating, I came across this [GitHub comment](https://github.com/google/sanitizers/issues/1179#issuecomment-571513480) that recommends changing `pthread_mutex_lock` to use `BLOCK_REAL` instead of `REAL`, which fixes this test case.
Would you be open to accepting this change? I'm happy to send a PR if so.
I also have another test case, where "thread 1" doesn't do anything (it just runs an empty infinite while loop): [Compiler explorer](https://godbolt.org/z/ej4zaabzP)
This is even harder to fix as there is no blocking function in thread 1, and `SIGPROF` never arrives because of it. I think they can be fixed separately though.
</pre>