<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/66537>66537</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            TSAN false negative on second `thread::join` if first `thread::join` throws
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            compiler-rt:tsan
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          huixie90
      </td>
    </tr>
</table>

<pre>
    Given the following program:

```cpp
  {
    std::function<void()> f;
    std::atomic_bool start = false;
    std::atomic_bool done  = false;

    std::jthread jt{[&] {
      start.wait(false);
      f();
      done = true;
      done.notify_all();
    }};

    f = [&] {
      try {
        jt.join();
        assert(false);
      } catch (const std::system_error& err) {
        assert(err.code() == std::errc::resource_deadlock_would_occur);
      }
    };
    start = true;
    start.notify_all();
    done.wait(false);
  }
```

The `jt.join()` call throws, and the exception is caught. Later, at the end of the scope, the destructor of `jthread` calls `join` a second time. This is valid code, but TSAN complains:

```
ThreadSanitizer: CHECK failed: sanitizer_thread_registry.cpp:348 "((t)) != (0)" (0x0, 0x0) (tid=3411214)
    #0 __tsan::CheckUnwind() <null> (t.tmp.exe+0xcbd1b) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) <null> (t.tmp.exe+0x45042) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #2 __sanitizer::ThreadRegistry::ConsumeThreadUserId(unsigned long) <null> (t.tmp.exe+0x43e94) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #3 pthread_join <null> (t.tmp.exe+0x604bb) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #4 std::__1::__libcpp_thread_join[abi:v170000](unsigned long*) /home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-69f521df8409-1/llvm-project/libcxx-ci/build/generic-tsan/include/c++/v1/__threading_support:398:10 (libc++.so.1+0x73358) (BuildId: 71b8f06279b5e8117756a82c1e642e312c2a0e30)
    #5 std::__1::thread::join() /home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-69f521df8409-1/llvm-project/libcxx-ci/libcxx/src/thread.cpp:51:14 (libc++.so.1+0x73358)
    #6 std::__1::jthread::join[abi:v170000]() /home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-69f521df8409-1/llvm-project/libcxx-ci/build/generic-tsan/include/c++/v1/__thread/jthread.h:91:49 (t.tmp.exe+0xed5f9) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #7 std::__1::jthread::~jthread[abi:v170000]() /home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-69f521df8409-1/llvm-project/libcxx-ci/build/generic-tsan/include/c++/v1/__thread/jthread.h:60:7 (t.tmp.exe+0xed708) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #8 main /home/libcxx-builder/.buildkite-agent/builds/google-libcxx-builder-69f521df8409-1/llvm-project/libcxx-ci/libcxx/test/std/thread/thread.jthread/join.pass.cpp:108:3 (t.tmp.exe+0xe836d) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
    #9 __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 (libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #10 __libc_start_main csu/../csu/libc-start.c:392:3 (libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #11 _start <null> (t.tmp.exe+0x30484) (BuildId: d7fd9285e8961410872ba6334aecc0115d0ff192)
```
What might be happening is that TSAN intercepts `pthread_join` and thinks that the thread has already been joined when `join` is called the second time in the destructor. In reality, the thread wasn't joined the first time around, because that call failed with a system error.
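
The pthread-level behaviour can be checked without libc++. Below is a minimal sketch, assuming a platform whose `pthread_join` detects a self-join and returns `EDEADLK` (glibc does; POSIX only lists this as an error that may be reported). `Shared`, `JoinResults`, `worker`, and `join_twice` are names made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sched.h>

// Hypothetical helper types for this sketch; not part of TSAN or libc++.
struct Shared {
  pthread_t handle{};
  std::atomic<bool> start{false};  // set once `handle` is valid
  std::atomic<bool> done{false};   // set once the self-join has returned
  int self_join_result = -1;
};

struct JoinResults {
  int self_join;   // pthread_join called by the thread on itself
  int outer_join;  // pthread_join called afterwards by the creating thread
};

static void* worker(void* arg) {
  Shared* s = static_cast<Shared*>(arg);
  while (!s->start.load()) sched_yield();  // wait until the handle is published
  // Joining ourselves cannot succeed; the assumption is that this
  // returns EDEADLK instead of blocking forever.
  s->self_join_result = pthread_join(s->handle, nullptr);
  s->done.store(true);
  return nullptr;
}

JoinResults join_twice() {
  Shared s;
  int rc = pthread_create(&s.handle, nullptr, worker, &s);
  assert(rc == 0);
  (void)rc;
  s.start.store(true);
  while (!s.done.load()) sched_yield();  // let the failed self-join finish first
  // The failed self-join did not consume the thread, so joining it now is valid.
  int outer = pthread_join(s.handle, nullptr);
  return {s.self_join_result, outer};
}
```

Under the stated assumption, `join_twice()` should report `EDEADLK` for the self-join and `0` for the later join, which is the same sequence the `jthread` example goes through.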

I had a look at TSAN's code:

```cpp
TSAN_INTERCEPTOR(int, pthread_join, void *th, void **ret) {
  SCOPED_INTERCEPTOR_RAW(pthread_join, th, ret);
  Tid tid = ThreadConsumeTid(thr, pc, (uptr)th);
  ThreadIgnoreBegin(thr, pc);
  int res = BLOCK_REAL(pthread_join)(th, ret);
  ThreadIgnoreEnd(thr);
  if (res == 0) {
    ThreadJoin(thr, pc, tid);
  }
  return res;
}
```

`ThreadConsumeTid` calls `ConsumeThreadUserId`, which

- tries to find the thread by its user id and asserts that it is found
- removes the entry that was found

From what I can see, the first time `join` is called, the user id is removed, and then `join` throws.
The second time `join` is called, the user id is no longer there, hence the assertion failure.
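
To make that ordering concrete, here is a toy model of the bookkeeping as I understand it. This is only a sketch: `ToyRegistry`, `toy_join`, and `kNotFound` are hypothetical names, the real `ThreadRegistry` is more involved, and the return value of the real `pthread_join` is passed in as a plain parameter.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <map>

constexpr int kNotFound = -1;  // hypothetical sentinel, stands in for the CHECK

// Toy stand-in for the user-id -> tid bookkeeping. NOT TSAN's real ThreadRegistry.
class ToyRegistry {
 public:
  void add(std::uintptr_t user_id, int tid) { live_[user_id] = tid; }

  // Look the id up and erase it unconditionally, as described above.
  int consume(std::uintptr_t user_id) {
    auto it = live_.find(user_id);
    if (it == live_.end()) return kNotFound;
    int tid = it->second;
    live_.erase(it);
    return tid;
  }

  void mark_joined(int /*tid*/) { ++joined_; }
  int joined() const { return joined_; }

 private:
  std::map<std::uintptr_t, int> live_;
  int joined_ = 0;
};

// Same shape as the interceptor: consume first, then look at the join result.
// Returns false where the real code would hit the CHECK failure.
bool toy_join(ToyRegistry& reg, std::uintptr_t user_id, int real_join_result) {
  int tid = reg.consume(user_id);  // the id is gone from here on
  if (tid == kNotFound) return false;
  if (real_join_result == 0) reg.mark_joined(tid);
  return true;
}
```

With one registered thread, a first `toy_join` whose real join fails with `EDEADLK` still erases the id, so a second `toy_join` for the same id fails the lookup even though the real join would now succeed.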




</pre>