[libcxx-commits] [PATCH] D145183: [libc++] Implement `stop_token`

Sun Mar 12 14:25:37 PDT 2023

huixie90 added inline comments.

================
Comment at: libcxx/include/__stop_token/stop_state.h:40
+
+  static constexpr uint32_t __stop_requested_bit        = 1;
+  static constexpr uint32_t __locked_bit                = 1 << 1;
----------------
Mordante wrote:
> Can you add more documentation regarding what these fields mean.
> 
> Looking at the loop in 
> `LIBCPP_HIDE_FROM_ABI bool __add_callback(__stop_callback_base* __cb) noexcept`
> I see several flag tested and the lock is the latest to be tested.
> 
> I would expect the lock the first to be tested.
> 
yeah I will refactor/rename the code and add more comments. the `lock` bit is for locking the linked-list. The reason why it was not locked first is that the spec specified that if the `requested` bit was set, we won't go through the linked-list and instead directly invoke the callback on the current thread.

================
Comment at: libcxx/include/__stop_token/stop_state.h:184
+  _LIBCPP_HIDE_FROM_ABI bool __add_callback(__stop_callback_base* __cb) noexcept {
+    auto __current_state = __state_.load(std::memory_order_relaxed);
+    do {
----------------
Mordante wrote:
> Can you explain why this relaxed it the right order?
will add comments. all the atomic operations here won't need any memory ordering except the one actually does the lock.

================
Comment at: libcxx/include/__stop_token/stop_state.h:189
+          // already stop requested, synchronously run the callback
+          __cb->__invoke();
+          return false;
----------------
Mordante wrote:
> How do you prevent calling the callback multiple times.
once it is called, the next line would `return`. 

================
Comment at: libcxx/include/__stop_token/stop_state.h:311
+  _LIBCPP_HIDE_FROM_ABI void __increment_ref_count(__stop_state* __state) {
+    __state->__ref_count_.fetch_add(1, std::memory_order_relaxed);
+  }
----------------
Mordante wrote:
> Note my atomics knowledge is a bit rusty.
> Why is this `memory_order_relaxed` instead of `memory_order_release`?
> 
> AFAIK the current increment may not be visible in a different thread doing a decrement.
> Which may cause use-after-free and double-free issues of the __state.
Note that this is the same case for `shared_ptr` where increment is `relaxed` and decrement is `acq_rel`
https://github.com/llvm/llvm-project/blob/main/libcxx/include/__memory/shared_ptr.h#L105

The memory ordering controls the visibility of the write operations (including non-atomic operations) prio to current atomic operation, and read operations after the current atomic operation.

For increment, we don't need to do anything before and afterwards, so `memory_order_relaxed` is enough.
For decrement, we are potentially evaluating the destructor of underlying type `T`. We need to make sure that there are no other threads that are still poking around the underlying object through a different `shared_ptr` instance. i.e. any operations on the underlying object on any thread should be "finished" before evaluating the destructor.

We need `release` to tell other threads that we are done with the underlying object and all the writes to the underlying object should be visible now. (suppose you are writing to the underlying object before the shared_ptr goes out of scope, so that in case a different thread which has the last shared_ptr calls the destructor, it needs to see this write)
In case we are calling the destructor, we need `acquire` to see other threads write (see above).
So decrement needs both.

The way I understand is that:
`acquire` is `git pull`. All the read operations after it would see the latest.
`release` is `git push`. All the write operations before it would be visible by other threads which pulls.

please also note that `shared_ptr` (similarly here) does not protect you from incorrect use. When used properly, the copy operations would either copy from a nullptr or a shared_ptr which ref count is at least 1.. if you invoke a copy constructor and spawn a thread to assign the copied-from instance to nullptr. it is plain UB because you are writing and reading to the same shared_ptr object (data race) 

================
Comment at: libcxx/include/stop_token:34-37
+#include <__config>
+#include <__stop_token/stop_callback.h>
+#include <__stop_token/stop_source.h>
+#include <__stop_token/stop_token.h>
----------------
Mordante wrote:
> I assume the feature will get a FTM once done.
yes there is only one FTM for `stop_token` and `jthread`. will get the FTM once `jthread` is done

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145183/new/

https://reviews.llvm.org/D145183