[libcxx-commits] [PATCH] D144394: [libc++] Forward to std::{, w}memcmp in std::find

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Mar 16 09:27:23 PDT 2023


philnik created this revision.
Herald added a project: All.
philnik added a comment.
philnik updated this revision to Diff 504664.
philnik updated this revision to Diff 505808.
Herald added a subscriber: mikhail.ramalho.
ldionne published this revision for review.
Herald added a project: libc++.
Herald added a subscriber: libcxx-commits.
Herald added a reviewer: libc++.

Benchmarks:

  ---------------------------------------------------------------
  Benchmark                                   old             new
  ---------------------------------------------------------------
  bm_find<char>/1                         10.5 ns         10.5 ns
  bm_find<char>/2                         18.3 ns         10.4 ns
  bm_find<char>/3                         21.1 ns         10.4 ns
  bm_find<char>/4                         22.4 ns         10.3 ns
  bm_find<char>/5                         23.3 ns         10.3 ns
  bm_find<char>/6                         23.7 ns         10.4 ns
  bm_find<char>/7                         24.0 ns         10.3 ns
  bm_find<char>/8                         24.5 ns         10.2 ns
  bm_find<char>/16                        26.2 ns         10.2 ns
  bm_find<char>/64                        29.9 ns         20.8 ns
  bm_find<char>/512                       80.2 ns         36.0 ns
  bm_find<char>/4096                       523 ns         43.0 ns
  bm_find<char>/32768                     3838 ns          105 ns
  bm_find<char>/262144                   30832 ns          988 ns
  bm_find<char>/1048576                 122541 ns         4854 ns
  bm_find<short>/1                        10.5 ns         10.4 ns
  bm_find<short>/2                        18.3 ns         18.5 ns
  bm_find<short>/3                        21.2 ns         21.4 ns
  bm_find<short>/4                        22.6 ns         22.7 ns
  bm_find<short>/5                        23.6 ns         23.6 ns
  bm_find<short>/6                        24.1 ns         24.2 ns
  bm_find<short>/7                        24.5 ns         24.6 ns
  bm_find<short>/8                        24.6 ns         24.8 ns
  bm_find<short>/16                       26.1 ns         26.6 ns
  bm_find<short>/64                       33.3 ns         31.3 ns
  bm_find<short>/512                       134 ns         80.4 ns
  bm_find<short>/4096                      967 ns          499 ns
  bm_find<short>/32768                    7670 ns         3842 ns
  bm_find<short>/262144                  61106 ns        31371 ns
  bm_find<short>/1048576                244510 ns       122120 ns
  bm_find<int>/1                          10.4 ns         12.1 ns
  bm_find<int>/2                          18.4 ns         12.0 ns
  bm_find<int>/3                          21.9 ns         12.0 ns
  bm_find<int>/4                          23.3 ns         11.9 ns
  bm_find<int>/5                          24.2 ns         12.1 ns
  bm_find<int>/6                          24.7 ns         12.0 ns
  bm_find<int>/7                          24.8 ns         11.9 ns
  bm_find<int>/8                          25.2 ns         11.9 ns
  bm_find<int>/16                         26.2 ns         21.9 ns
  bm_find<int>/64                         30.5 ns         33.8 ns
  bm_find<int>/512                        80.4 ns         40.4 ns
  bm_find<int>/4096                        497 ns         71.7 ns
  bm_find<int>/32768                      3844 ns          498 ns
  bm_find<int>/262144                    30722 ns         4901 ns
  bm_find<int>/1048576                  123303 ns        22864 ns
  bm_ranges_find<char>/1                  10.4 ns         10.4 ns
  bm_ranges_find<char>/2                  18.3 ns         10.4 ns
  bm_ranges_find<char>/3                  21.1 ns         10.4 ns
  bm_ranges_find<char>/4                  22.3 ns         10.3 ns
  bm_ranges_find<char>/5                  23.2 ns         10.3 ns
  bm_ranges_find<char>/6                  23.6 ns         10.3 ns
  bm_ranges_find<char>/7                  24.0 ns         10.4 ns
  bm_ranges_find<char>/8                  24.4 ns         10.5 ns
  bm_ranges_find<char>/16                 26.0 ns         10.3 ns
  bm_ranges_find<char>/64                 30.6 ns         20.9 ns
  bm_ranges_find<char>/512                80.2 ns         36.2 ns
  bm_ranges_find<char>/4096                496 ns         43.1 ns
  bm_ranges_find<char>/32768              3844 ns          106 ns
  bm_ranges_find<char>/262144            30632 ns          994 ns
  bm_ranges_find<char>/1048576          123127 ns         5099 ns
  bm_ranges_find<short>/1                 10.5 ns         10.5 ns
  bm_ranges_find<short>/2                 18.4 ns         18.3 ns
  bm_ranges_find<short>/3                 21.2 ns         21.2 ns
  bm_ranges_find<short>/4                 22.7 ns         22.5 ns
  bm_ranges_find<short>/5                 23.6 ns         23.4 ns
  bm_ranges_find<short>/6                 24.1 ns         23.8 ns
  bm_ranges_find<short>/7                 24.4 ns         24.1 ns
  bm_ranges_find<short>/8                 24.7 ns         24.5 ns
  bm_ranges_find<short>/16                26.2 ns         26.1 ns
  bm_ranges_find<short>/64                33.2 ns         30.4 ns
  bm_ranges_find<short>/512                134 ns         80.4 ns
  bm_ranges_find<short>/4096               969 ns          506 ns
  bm_ranges_find<short>/32768             7648 ns         3853 ns
  bm_ranges_find<short>/262144           61198 ns        30507 ns
  bm_ranges_find<short>/1048576         248057 ns       122080 ns
  bm_ranges_find<int>/1                   10.5 ns         11.8 ns
  bm_ranges_find<int>/2                   18.5 ns         11.9 ns
  bm_ranges_find<int>/3                   22.0 ns         11.9 ns
  bm_ranges_find<int>/4                   23.4 ns         11.8 ns
  bm_ranges_find<int>/5                   24.3 ns         11.8 ns
  bm_ranges_find<int>/6                   24.7 ns         12.0 ns
  bm_ranges_find<int>/7                   24.8 ns         11.9 ns
  bm_ranges_find<int>/8                   25.1 ns         11.8 ns
  bm_ranges_find<int>/16                  26.2 ns         21.9 ns
  bm_ranges_find<int>/64                  30.7 ns         33.8 ns
  bm_ranges_find<int>/512                 80.4 ns         39.7 ns
  bm_ranges_find<int>/4096                 498 ns         72.6 ns
  bm_ranges_find<int>/32768               3842 ns          506 ns
  bm_ranges_find<int>/262144             30566 ns         5311 ns
  bm_ranges_find<int>/1048576           123103 ns        22743 ns


philnik added a comment.

Rebased


philnik added a comment.

Fix formatting



================
Comment at: libcxx/include/__algorithm/ranges_find.h:42
   _Ip operator()(_Ip __first, _Sp __last, const _Tp& __value, _Proj __proj = {}) const {
-    auto __pred = [&](auto&& __e) { return std::forward<decltype(__e)>(__e) == __value; };
-    return ranges::__find_if_impl(std::move(__first), std::move(__last), __pred, __proj);
+    if constexpr (is_copy_constructible_v<_Ip>) {
+      auto [__first_un, __last_un] = std::__unwrap_range(__first, std::move(__last));
----------------
Maybe this conveys the intent a bit better?


================
Comment at: libcxx/include/__string/constexpr_c_functions.h:68
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp* __constexpr_memchr(_Tp* __str, int __char, size_t __count) {
+  static_assert(sizeof(_Tp) == 1 && __is_trivially_equality_comparable<_Tp, _Tp>::value, "");
+
----------------



================
Comment at: libcxx/include/__string/constexpr_c_functions.h:70-85
 #if __has_builtin(__builtin_char_memchr)
-  return __builtin_char_memchr(__str, __char, __count);
-#else
-  if (!__libcpp_is_constant_evaluated())
-    return static_cast<const char*>(__builtin_memchr(__str, __char, __count));
-  for (; __count; --__count) {
-    if (*__str == __char)
-      return __str;
-    ++__str;
-  }
-  return nullptr;
+  if (is_same<_Tp, char>::value)
+    return __builtin_char_memchr(__str, __char, __count);
 #endif
+
+  if (__libcpp_is_constant_evaluated()) {
+    for (; __count; --__count) {
----------------
I would suggest reorganizing like this, to make it more obvious what the intent of using `__builtin_char_memchr` is.

```
if (__libcpp_is_constant_evaluated()) {
  // use __builtin_char_memchr to optimize constexpr evaluation if we can
#if __has_builtin(__builtin_char_memchr) 
  if (is_same<_Tp, char>::value)
    return __builtin_char_memchr(__str, __char, __count);
#endif

  for (; __count; --__count) {
    if (*__str == __char)
      return __str;
    ++__str;
  }
  return nullptr;
} else {
  return static_cast<_Tp*>(__builtin_memchr(__str, __char, __count));
}
```


================
Comment at: libcxx/include/cwchar:229
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp* __constexpr_wmemchr(_Tp* __str, _Up __char, size_t __count) {
+  static_assert(sizeof(_Tp) == sizeof(wchar_t) && __is_trivially_equality_comparable<_Tp, _Up>::value, "");
+  if (!__libcpp_is_constant_evaluated() || is_same<_Tp, wchar_t>::value) {
----------------
`static_assert("diagnostic")`

You also seem to be missing an important requirement on `_Up`? You need `_Up` to be trivially copyable to `wchar_t` or else your `std::wmemcpy` may not be valid? I would actually just replace `_Up` by `wchar_t` since we don't need the additional generalization, at least for now.


================
Comment at: libcxx/include/cwchar:230
+  static_assert(sizeof(_Tp) == sizeof(wchar_t) && __is_trivially_equality_comparable<_Tp, _Up>::value, "");
+  if (!__libcpp_is_constant_evaluated() || is_same<_Tp, wchar_t>::value) {
+    wchar_t __c;
----------------
You probably want to `remove_cv_t<_Tp>` here to also catch the case where `_Tp* == wchar_t const*`.


================
Comment at: libcxx/include/cwchar:231-232
+  if (!__libcpp_is_constant_evaluated() || is_same<_Tp, wchar_t>::value) {
+    wchar_t __c;
+    std::wmemcpy(&__c, reinterpret_cast<const wchar_t*>(&__char), 1);
+    return reinterpret_cast<_Tp*>(std::wmemchr(reinterpret_cast<const wchar_t*>(__str), __c, __count));
----------------
Now you don't need this anymore if you take `wchar_t __char` as input.


================
Comment at: libcxx/include/cwchar:233
+    std::wmemcpy(&__c, reinterpret_cast<const wchar_t*>(&__char), 1);
+    return reinterpret_cast<_Tp*>(std::wmemchr(reinterpret_cast<const wchar_t*>(__str), __c, __count));
+  }
----------------
Does `reintepret_cast<T>(object-of-type-T)` work inside `constexpr`? If not then this doesn't compile when constant-evaluated, and it means your test coverage is also insufficient in that area. Either way please make sure that is tested.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144394

Files:
  libcxx/benchmarks/CMakeLists.txt
  libcxx/benchmarks/algorithms/find.bench.cpp
  libcxx/include/__algorithm/find.h
  libcxx/include/__algorithm/ranges_find.h
  libcxx/include/__string/char_traits.h
  libcxx/include/__string/constexpr_c_functions.h
  libcxx/include/cwchar
  libcxx/test/libcxx/strings/c.strings/constexpr.cstring.compile.pass.cpp
  libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find.pass.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D144394.505808.patch
Type: text/x-patch
Size: 14414 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libcxx-commits/attachments/20230316/937a6362/attachment-0001.bin>


More information about the libcxx-commits mailing list