[libcxx-commits] [PATCH] D140779: Use hash value checks optimizations consistently

Dmitry Ilvokhin via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Fri Dec 30 06:07:24 PST 2022


ilvokhin created this revision.
ilvokhin added reviewers: ldionne, EricWF, libc++.
Herald added a project: All.
ilvokhin requested review of this revision.
Herald added a project: libc++.
Herald added a subscriber: libcxx-commits.
Herald added 1 blocking reviewer(s): libc++.

There are couple of optimizations of `__hash_table::find` which are applicable
to other places like `__hash_table::__node_insert_unique_prepare` and
`__hash_table::__emplace_unique_key_args`.

  for (__nd = __nd->__next_; __nd != nullptr &&
      (__nd->__hash() == __hash
      // ^^^^^^^^^^^^^^^^^^^^^^
      //         (1)
        || std::__constrain_hash(__nd->__hash(), __bc) == __chash);
                                                 __nd = __nd->__next_)
  {   
      if ((__nd->__hash() == __hash)
      // ^^^^^^^^^^^^^^^^^^^^^^^^^^
      //           (2)
          && key_eq()(__nd->__upcast()->__value_, __k))
          return iterator(__nd, this);
  }

(1): We can avoid expensive modulo operations from `std::__constrain_hash` if
hashes matched. This one is from commit 6a411472e3c4.
(2): We can avoid `key_eq` calls if hashes didn't match. Commit:
318d35a7bca6c4e5.

Both of them are applicable for inser and emplace methods.

Results of unordered_set_operations benchmark:

  Before:
  --------------------------------------------------------------------------------------------------
  Benchmark                                                        Time             CPU   Iterations
  --------------------------------------------------------------------------------------------------
  BM_InsertValue/unordered_set_uint32/1024                     38521 ns        38520 ns        18174
  BM_InsertValue/unordered_set_uint32_sorted/1024              29297 ns        29296 ns        23804
  BM_InsertValue/unordered_set_top_bits_uint32/1024            32547 ns        32546 ns        21655
  BM_InsertValueRehash/unordered_set_top_bits_uint32/1024      56896 ns        56895 ns        12334
  BM_InsertValue/unordered_set_string/1024                    277956 ns       277952 ns         2487
  BM_InsertValueRehash/unordered_set_string/1024              578982 ns       578912 ns         1224
  BM_InsertDuplicate/unordered_set_int/1024                    11793 ns        11793 ns        61427
  BM_InsertDuplicate/unordered_set_string/1024                164009 ns       164005 ns         4339
  BM_EmplaceDuplicate/unordered_set_int/1024                   12728 ns        12728 ns        57487
  BM_EmplaceDuplicate/unordered_set_string/1024               164516 ns       164512 ns         4251
  BM_InsertDuplicate/unordered_set_int_insert_arg/1024         11366 ns        11365 ns        62265
  BM_InsertDuplicate/unordered_set_string_insert_arg/1024     165776 ns       165772 ns         4225
  BM_EmplaceDuplicate/unordered_set_int_insert_arg/1024        42768 ns        42765 ns        16301
  BM_EmplaceDuplicate/unordered_set_string_arg/1024           371531 ns       371524 ns         1885
  
  After:
  --------------------------------------------------------------------------------------------------
  Benchmark                                                        Time             CPU   Iterations
  --------------------------------------------------------------------------------------------------
  BM_InsertValue/unordered_set_uint32/1024                     38988 ns        38985 ns        17789
  BM_InsertValue/unordered_set_uint32_sorted/1024              28707 ns        28707 ns        24285
  BM_InsertValue/unordered_set_top_bits_uint32/1024            32790 ns        32788 ns        21272
  BM_InsertValueRehash/unordered_set_top_bits_uint32/1024      58293 ns        58281 ns        12329
  BM_InsertValue/unordered_set_string/1024                    303982 ns       303894 ns         2352
  BM_InsertValueRehash/unordered_set_string/1024              608259 ns       608121 ns         1133
  BM_InsertDuplicate/unordered_set_int/1024                    12377 ns        12375 ns        56721
  BM_InsertDuplicate/unordered_set_string/1024                167777 ns       167752 ns         4167
  BM_EmplaceDuplicate/unordered_set_int/1024                   11469 ns        11468 ns        61094
  BM_EmplaceDuplicate/unordered_set_string/1024               165482 ns       165450 ns         4245
  BM_InsertDuplicate/unordered_set_int_insert_arg/1024         11945 ns        11945 ns        57164
  BM_InsertDuplicate/unordered_set_string_insert_arg/1024     164507 ns       164482 ns         4294
  BM_EmplaceDuplicate/unordered_set_int_insert_arg/1024        41152 ns        41151 ns        17134
  BM_EmplaceDuplicate/unordered_set_string_arg/1024           392580 ns       392529 ns         1808

Time differences looks more like noise to me. But if we want to have this
optimizations in `find`, we probably want them in `insert` and `emplace` as
well.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D140779

Files:
  libcxx/include/__hash_table


Index: libcxx/include/__hash_table
===================================================================
--- libcxx/include/__hash_table
+++ libcxx/include/__hash_table
@@ -1794,10 +1794,12 @@
         if (__ndptr != nullptr)
         {
             for (__ndptr = __ndptr->__next_; __ndptr != nullptr &&
-                                             std::__constrain_hash(__ndptr->__hash(), __bc) == __chash;
+                    (__ndptr->__hash() == __hash ||
+                     std::__constrain_hash(__ndptr->__hash(), __bc) == __chash);
                                                      __ndptr = __ndptr->__next_)
             {
-                if (key_eq()(__ndptr->__upcast()->__value_, __value))
+                if ((__ndptr->__hash() == __hash) &&
+                    key_eq()(__ndptr->__upcast()->__value_, __value))
                     return __ndptr;
             }
         }
@@ -2009,7 +2011,8 @@
                 (__nd->__hash() == __hash || std::__constrain_hash(__nd->__hash(), __bc) == __chash);
                                                            __nd = __nd->__next_)
             {
-                if (key_eq()(__nd->__upcast()->__value_, __k))
+                if ((__nd->__hash() == __hash) &&
+                    key_eq()(__nd->__upcast()->__value_, __k))
                     goto __done;
             }
         }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D140779.485690.patch
Type: text/x-patch
Size: 1360 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libcxx-commits/attachments/20221230/1f9dd1e8/attachment.bin>


More information about the libcxx-commits mailing list