[PATCH] D8802: [libc++] Fix PR22606 - Leak pthread_key with static storage duration to ensure all of thread-local destructors are called.

Mon Jul 6 16:47:10 PDT 2015

In http://reviews.llvm.org/D8802#187303, @EricWF wrote:

> OK. I have some new thoughts. The `__thread_struct` destructor serves three purposes when it is invoked at thread exit.
>
> 1. Notifying registered condition variables.
> 2. Making registered shared state ready.
> 3. Deleting the tuple of arguments used to start the thread.

I was incorrect in this comment. #3 is unrelated to this problem. They are always freed. Only #1 and #2 are any concern.

The problem is this:

Libc++ uses a single pthread key for thread local storage. This TLS is used by `<condition_variable>` and `<future>` to notify/ready consumers on thread exit. The notification is done in the pthread key's destructor which is executed during `pthread_exit(...)`. 
When the main thread begins program termination and a detached child thread, `t0`, has yet to terminate there is a race condition between `t0` executing the TLS destructor and the main thread destroying the static pthread key. Once the main thread destroys the pthread key no more TLS destructors are run.

There are two solutions to this race condition:

1. Leak the pthread key. If we never destroy the during program termination then all TLS destructors on detached threads will run.
2. Reference count the key and only destroy it once all TLS have been run. This change would require a MAJOR ABI break that we could not ship for some time.

Because #2 requires an ABI break I believe that #1 is the better option. Because we only leak one key, and this key is leaked during program termination, it is very unlikely (read: almost impossible) for a user to notice this change.

However, I question if is wise to attempt to run the TLS destructors at all once program termination has begun. The shared state referenced by the destructors must have static storage duration and this static storage may or may not have been destroyed already. It seems very unlikely that a well-formed program will attempt to use the results of `std::notify_at_thread_exit(...)` or `std::promise::set_value_at_thread_exit(...)` during program termination. This is likely also undefined behavior as noted in `[basic.start.term]C++17 3.6.3 p4`:

> If there is a use of a standard library object or function not permitted within signal handlers (18.10) that

> does not happen before (1.10) completion of destruction of objects with static storage duration and execution

> of std::atexit registered functions (18.5), the program has undefined behavior. [ Note: If there is a use

> of an object with static storage duration that does not happen before the object’s destruction, the program

> has undefined behavior. Terminating every thread before a call to std::exit or the exit from main is

> sufficient, but not necessary, to satisfy these requirements. These requirements permit thread managers as

> static-storage-duration objects. — end note ]

If we choose not to run the TLS destructor's once program termination has begun we should still make an effort to free all memory allocated by the internal TLS structures to placate ASAN.

In summation I believe we should take this patch as a partial fix to the above problem and continue to investigate if we should disable thread exit notifications once program termination has begun.

http://reviews.llvm.org/D8802