<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/83844>83844</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
TSan: async signals are never being delivered when the target thread is blocked waiting for a `FUTEX_WAIT` syscall
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
canova
</td>
</tr>
</table>
<pre>
This issue is very similar to #83561, but the underlying problem involves different API calls, so I think it makes sense to create a separate issue for tracking.
Here are my test cases:
- C++ test case: https://godbolt.org/z/sofbK5bKq
- Rust test case: https://github.com/canova/rustc-tsan-testcase (compiler explorer doesn't let you use libraries in nightly)
This was initially found in Firefox in the Rust codebase. That's why I wanted to create a test case for both Rust and C++.
It looks like TSan delays dispatching signals until it finds a blocking function. In Firefox, we have a sampling profiler that sends `SIGPROF` signals every interval, and we have a mechanism, built on semaphores, to wait for the `SIGPROF` signal handler to finish some work. The example test cases above describe our situation.
This time it hangs during the `FUTEX_WAIT` `syscall` because TSan doesn't know that this is a blocking call. Again there are 2 threads involved: the "main thread" and "thread 1". The main thread sets up the profiler signal handler, locks the futex, and creates "thread 1". "thread 1" starts to wait for the futex with `Mutex.lock()` and `syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE...)`. Then we send the signal from the main thread using `pthread_kill`. Normally, without TSan, `SigprofHandler` gets executed and wakes the futex with `syscall(SYS_futex, uaddr, FUTEX_WAKE_PRIVATE...)`. After this, "thread 1" gets unblocked and exits. But with TSan, `SigprofHandler` never gets executed because TSan never considers "thread 1" to be executing a blocking call.
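For readers who don't want to open the links, here is a minimal Linux-only sketch of the sequence described above. It is not the exact test case from the links: the names `g_futex`, `g_waiting`, `g_handler_ran`, `futex_op`, `sigprof_handler`, `thread1` and `run_futex_signal_demo` are made up for illustration, and the `Mutex` wrapper is replaced by a bare futex word. Without TSan this should finish and return `true`. I have not checked whether this exact sketch hangs under `-fsanitize=thread`; the hang described in this issue was observed with the linked test cases.

```cpp
#include <atomic>
#include <cerrno>
#include <linux/futex.h>
#include <pthread.h>
#include <signal.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical names throughout; this only mirrors the steps in the text.
static_assert(sizeof(std::atomic<int>) == sizeof(int), "futex word must be an int");

static std::atomic<int> g_futex{1};           // 1 = locked, 0 = unlocked
static std::atomic<bool> g_waiting{false};    // set when thread 1 is about to wait
static std::atomic<bool> g_handler_ran{false};

// Thin wrapper around the raw futex syscall (glibc has no futex() function).
static long futex_op(std::atomic<int>* addr, int op, int val) {
  return syscall(SYS_futex, reinterpret_cast<int*>(addr), op, val,
                 nullptr, nullptr, 0);
}

// Runs on thread 1 when SIGPROF arrives: unlocks the futex and wakes the waiter.
static void sigprof_handler(int) {
  int saved_errno = errno;  // syscall() may clobber errno
  g_handler_ran.store(true);
  g_futex.store(0);
  futex_op(&g_futex, FUTEX_WAKE_PRIVATE, 1);
  errno = saved_errno;
}

// "thread 1": blocks in FUTEX_WAIT for as long as the futex word is still 1.
static void* thread1(void*) {
  g_waiting.store(true);
  while (g_futex.load() != 0) {
    // Returns on wake-up, on EINTR, or on EAGAIN if the word is no longer 1.
    futex_op(&g_futex, FUTEX_WAIT_PRIVATE, 1);
  }
  return nullptr;
}

// "main thread": installs the handler, starts thread 1, then signals it.
bool run_futex_signal_demo() {
  g_futex.store(1);  // the futex starts out locked
  g_waiting.store(false);
  g_handler_ran.store(false);

  struct sigaction sa = {};
  sa.sa_handler = sigprof_handler;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGPROF, &sa, nullptr) != 0) return false;

  pthread_t t;
  if (pthread_create(&t, nullptr, thread1, nullptr) != 0) return false;

  // Give thread 1 time to reach the futex wait (best effort, not a guarantee).
  while (!g_waiting.load()) usleep(1000);
  usleep(50 * 1000);

  if (pthread_kill(t, SIGPROF) != 0) return false;
  pthread_join(t, nullptr);  // this is where the reported hang would show up
  return g_handler_ran.load() && g_futex.load() == 0;
}
```

Calling `run_futex_signal_demo()` from `main` and building with `-pthread` is enough to try it. The wait loop re-checks the futex word so that an early signal or a spurious wake-up does not change the outcome.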
### Normal execution:
| | Main thread | Thread 1 |
|---|--------------------------------------------|--------------------------------------------|
| 1 | `Mutex.lock()` (with syscall `FUTEX_WAIT`) | |
| 2 | | `Mutex.lock()` (with syscall `FUTEX_WAIT`) |
| 3 | `pthread_kill(...SIGPROF)` | |
| 4 | | `mutex.unlock()` inside `SigprofHandler` |
| 5 | | `mutex.lock()` inside `Thread1` |
| 6 | | Exits |
| 7 | `pthread_join` | |
| 8 | (done)                                     |                                            |
### TSan execution:
| | Main thread | Thread 1 |
|---|--------------------------------------------|------------------------------------------------------------------------------------------|
| 1 | `Mutex.lock()` (with syscall `FUTEX_WAIT`) | |
| 2 | | `Mutex.lock()` (with syscall `FUTEX_WAIT`) |
| 3 | `pthread_kill(...SIGPROF)` | |
| 4 | | (SigprofHandler never gets executed because <br>TSan thinks we are not in a blocking call) |
| 5 | (deadlock) | (deadlock since futex never unlocks) |
I believe this can be solved similarly to #83561, but I'm not sure how yet, as this is a syscall. I assume there should be a way to intercept syscalls?
</pre>