[all-commits] [llvm/llvm-project] 9b78dd: [mlir] Optimize ThreadLocalCache by removing atomi...

Jeff Niu via All-commits all-commits at lists.llvm.org
Fri Jun 21 15:03:58 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9b78ddf3b2abfb3e2063e3dad2a326f5eabc1618
      https://github.com/llvm/llvm-project/commit/9b78ddf3b2abfb3e2063e3dad2a326f5eabc1618
  Author: Jeff Niu <jeff at modular.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M mlir/include/mlir/Support/ThreadLocalCache.h

  Log Message:
  -----------
  [mlir] Optimize ThreadLocalCache by removing atomic bottleneck (attempt #3) (#93315)

The ThreadLocalCache implementation is used by the MLIRContext (among
other things) to try to manage thread contention in the StorageUniquers.
There is a bunch of fancy shared pointer/weak pointer setups that
basically keeps everything alive across threads at the right time, but a
huge bottleneck is the `weak_ptr::lock` call inside the `::get` method.

This is because the `lock` method has to hit the atomic refcount several
times, and this is bottlenecking performance across many threads.
However, all this is doing is checking whether the storage is
initialized. Importantly, when the `PerThreadInstance` goes out of
scope, it does not remove all of its associated entries from the
thread-local hash map (it contains dangling `PerThreadInstance *` keys).
The `weak_ptr` also allows the thread local cache to synchronize with
the `PerThreadInstance`'s destruction:

1. if `ThreadLocalCache` destructs, the `weak_ptr`s that reference its
contained values are immediately invalidated
2. if `CacheType` destructs within a thread, any entries still live are
removed from the owning `PerThreadInstance`, and it locks the `weak_ptr`
first to ensure it's kept alive long enough for the removal.

This PR changes the TLC entries to contain a `shared_ptr<ValueT*>` and a
`weak_ptr<PerInstanceState>`. It gives the `PerInstanceState` entries a
`weak_ptr<ValueT*>` on top of the `unique_ptr<ValueT>`. This enables
`ThreadLocalCache::get` to check if the value is initialized by
dereferencing the `shared_ptr<ValueT*>` and check if the contained
pointer is null. When `PerInstanceState` destructs, the values inside
the TLC are written to nullptr. The TLC uses the
`weak_ptr<PerInstanceState>` to satisfy (2).

(1) is no longer the case. When `ThreadLocalCache` begins destruction,
the `weak_ptr<PerInstanceState>` are invalidated, but not the
`shared_ptr<ValueT*>`. This is OK: because the overall object is being
destroyed, `::get` cannot get called and because the
`shared_ptr<PerInstanceState>` finishes destruction before freeing the
pointer, it cannot get reallocated to another `ThreadLocalCache` during
destruction. I.e. the values inside the TLC associated with a
`PerInstanceState` cannot be read during destruction. The most important
thing is to make sure destruction of the TLC doesn't race with the
destructor of `PerInstanceState`. Because `PerInstanceState` carries
`weak_ptr` references into the TLC, we guarantee to not have any
use-after-frees.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list