[Mlir-commits] [mlir] [mlir] Optimize ThreadLocalCache by removing atomic bottleneck (attempt #3) (PR #93315)

Jacques Pienaar llvmlistbot at llvm.org
Sun Jun 9 15:37:20 PDT 2024


================
@@ -25,28 +25,80 @@ namespace mlir {
 /// cache has very large lock contention.
 template <typename ValueT>
 class ThreadLocalCache {
+  struct PerInstanceState;
+
+  /// The "observer" is owned by a thread-local cache instance. It is
+  /// constructed the first time a `ThreadLocalCache` instance is accessed by a
+  /// thread, unless `perInstanceState` happens to get re-allocated to the same
+  /// address as a previous one. A `thread_local` instance of this class is
+  /// destructed when the thread in which it lives is destroyed.
+  ///
+  /// This class is called the "observer" because while values cached in
+  /// thread-local caches are owned by `PerInstanceState`, a reference is stored
+  /// via this class in the TLC. With a double pointer, it knows when the
+  /// referenced value has been destroyed.
+  struct Observer {
+    /// This is the double pointer, explicitly allocated because we need to keep
+    /// the address stable if the TLC map re-allocates. It is owned by the
+    /// observer and shared with the value owner.
+    std::shared_ptr<ValueT *> ptr = std::make_shared<ValueT *>(nullptr);
+    /// Because `Owner` living inside `PerInstanceState` contains a reference to
+    /// the double pointer, and livkewise this class contains a reference to the
----------------
jpienaar wrote:

likewise

https://github.com/llvm/llvm-project/pull/93315
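
For readers following the design described in the diff comments above, here is a minimal
standalone sketch of the double-pointer idea: the value owner shares a heap-allocated
ValueT* with the observer and nulls it out on destruction, so the observer can tell when
the referenced value has gone away. The Owner/Observer names and the main() driver below
are purely illustrative assumptions, not the actual ThreadLocalCache implementation.

    #include <iostream>
    #include <memory>
    #include <string>

    // Illustrative owner: holds the value and a shared double pointer that
    // is set to null when the value is destroyed.
    template <typename ValueT>
    struct Owner {
      std::unique_ptr<ValueT> value;
      // Shared with the observer; points at the owned value while it lives.
      std::shared_ptr<ValueT *> ptr;

      explicit Owner(ValueT v)
          : value(std::make_unique<ValueT>(std::move(v))),
            ptr(std::make_shared<ValueT *>(value.get())) {}

      ~Owner() {
        // Signal destruction to any observer still holding the double pointer.
        *ptr = nullptr;
      }
    };

    // Illustrative observer: reads through the double pointer to learn
    // whether the cached value is still alive.
    template <typename ValueT>
    struct Observer {
      std::shared_ptr<ValueT *> ptr;

      ValueT *get() const { return ptr ? *ptr : nullptr; }
    };

    int main() {
      Observer<std::string> obs;
      {
        Owner<std::string> owner("cached value");
        obs.ptr = owner.ptr;
        std::cout << "alive: " << (obs.get() ? *obs.get() : "<dead>") << "\n";
      }
      // The owner has been destroyed; the double pointer now reads null.
      std::cout << "after:  " << (obs.get() ? *obs.get() : "<dead>") << "\n";
    }

Because the ValueT* storage is owned by a shared_ptr, its address stays stable even if
the surrounding maps re-allocate, which is the property the comment in the diff relies on.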

