[libcxx-commits] [PATCH] D139554: [libc++] Forward to std::memcmp for trivially comparable types in equal

Thu Feb 2 09:52:35 PST 2023

ldionne requested changes to this revision.
ldionne added a comment.
This revision now requires changes to proceed.

I like this change a lot, we seem to be the only implementation that doesn't do that yet.

================
Comment at: libcxx/benchmarks/CMakeLists.txt:110-111
 target_compile_options(cxx-benchmarks-flags-libcxx INTERFACE ${SANITIZER_FLAGS} -Wno-user-defined-literals -Wno-suggest-override)
 target_link_options(   cxx-benchmarks-flags-libcxx INTERFACE -nodefaultlibs "-L${BENCHMARK_LIBCXX_INSTALL}/lib" ${SANITIZER_FLAGS})
+target_link_options(   cxx-benchmarks-flags-libcxx INTERFACE -nodefaultlibs "-L${BENCHMARK_LIBCXX_INSTALL}/lib64" ${SANITIZER_FLAGS})

----------------
Otherwise you are also duplicating the other flags (sanitizer flags and `-nodefaultlibs`).

================
Comment at: libcxx/benchmarks/algorithms/equal.bench.cpp:11
+#include <benchmark/benchmark.h>
+#include <cstring>
+
----------------
`#include <vector>`

================
Comment at: libcxx/include/__algorithm/equal.h:1
 // -*- C++ -*-
 //===----------------------------------------------------------------------===//
----------------
We should release-note this change.

================
Comment at: libcxx/include/__algorithm/equal.h:36
 template <class _InputIterator1, class _InputIterator2, class _BinaryPredicate>
-_LIBCPP_NODISCARD_EXT inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_SINCE_CXX20 bool
-equal(_InputIterator1 __first1, _InputIterator1 __last1, _InputIterator2 __first2, _BinaryPredicate __pred) {
+_LIBCPP_NODISCARD inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_SINCE_CXX20 bool __equal_iter_trivial_impl(
+    _InputIterator1 __first1, _InputIterator1 __last1, _InputIterator2 __first2, _BinaryPredicate& __pred) {
----------------
`__equal_iter_trivial_impl` is confusing -- it's the function that is used for non-trivially comparable types.

================
Comment at: libcxx/include/__algorithm/equal.h:47
+    _InputIterator1 __first1, _InputIterator1 __last1, _InputIterator2 __first2, _BinaryPredicate& __pred) {
+  return std::__equal_iter_trivial_impl(__first1, __last1, __first2, __pred);
+}
----------------
With the second suggestion below, you can actually inline `__equal_iter_loop` into this function directly.

================
Comment at: libcxx/include/__algorithm/equal.h:58-62
+__equal_iter_impl(_Tp* __first1, _Tp* __last1, _Up* __first2, _BinaryPredicate& __pred) {
+  if (!__libcpp_is_constant_evaluated() || (sizeof(_Tp) == 1 && !is_same<__remove_cv_t<_Tp>, bool>::value))
+    return std::__constexpr_memcmp(__first1, __first2, (__last1 - __first1) * sizeof(_Tp)) == 0;
+  return std::__equal_iter_trivial_impl(__first1, __last1, __first2, __pred);
+}
----------------
I would simplify it to this. The current code has the following problems iMO:
- It's checking that we're not in a constant expression and then it's calling `__constexpr_memcmp`, which is confusing.
- It's checking for `sizeof(_Tp) == 1 && !is_same<__remove_cv_t<_Tp>, bool>::value` which seems rather clang specific (clang's `__builtin_memcmp` apparently works at compile-time with types that satisfy that). Since this is only a compile-time optimization, I would keep the code simple instead.

Otherwise, the other option I can see is:

```
template <class _Tp>
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 int
__constexpr_memcmp(const _Tp* __lhs, const _Tp* __rhs, size_t __count) {
  ... additional logic to be fast at compile-time when we're on Clang ...
}

__equal_iter_impl(_Tp* __first1, _Tp* __last1, _Up* __first2, _BinaryPredicate& /* unused */) {
  return std::__constexpr_memcmp(__first1, __first2, (__last1 - __first1) * sizeof(_Tp)) == 0;
}
```

I actually think I like this second option best.

================
Comment at: libcxx/include/__algorithm/equal.h:89
+template <class _Iter1, class _Sent1, class _Iter2, class _Sent2, class _Pred, class _Proj1, class _Proj2>
+_LIBCPP_NODISCARD inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 bool __equal_trivial_impl(
+    _Iter1 __first1, _Sent1 __last1, _Iter2 __first2, _Sent2 __last2, _Pred& __comp, _Proj1& __proj1, _Proj2& __proj2) {
----------------
Since we're about to make clang-format mandatory, I think this should be formatted as

```
_LIBCPP_NODISCARD inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
bool __equal_trivial_impl(args...) {
  ...
}
```

We have a lot of long lists of attributes in libc++ so this is going to be a common occurrence. Leaving the tiny name of the function alone at the end of the line IMO makes this really hard to read. If there's no way to make this work with clang-format built-in, I would do this:

```
_LIBCPP_NODISCARD inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 //
bool __equal_trivial_impl(args...) {
  ...
}
```

If clang-format had an option to prefer breaking after the attributes (or after the return type), that would be better than the status quo.

================
Comment at: libcxx/include/__algorithm/equal.h:117-120
+    _Tp* __first1, _Tp* __last1, _Up* __first2, _Up* __last2, _Pred& __comp, _Proj1& __proj1, _Proj2& __proj2) {
+  if (!__libcpp_is_constant_evaluated() || sizeof(_Tp) == 1)
+    return std::__constexpr_memcmp(__first1, __first2, (__last1 - __first1) * sizeof(_Tp)) == 0;
+  return std::__equal_trivial_impl(__first1, __last1, __first2, __last2, __comp, __proj1, __proj2);
----------------
Same suggestion here.

================
Comment at: libcxx/include/__algorithm/make_projected.h:58-65
 template <class _Pred, class _Proj, class = void>
 struct __can_use_pristine_comp : false_type {};

 template <class _Pred, class _Proj>
-struct __can_use_pristine_comp<_Pred, _Proj, __enable_if_t<
-    !is_member_pointer<typename decay<_Pred>::type>::value && (
-#if _LIBCPP_STD_VER > 17
-      is_same<typename decay<_Proj>::type, identity>::value ||
-#endif
-      is_same<typename decay<_Proj>::type, __identity>::value
-    )
-> > : true_type {};
+struct __can_use_pristine_comp<_Pred,
+                               _Proj,
+                               __enable_if_t<!is_member_pointer<typename decay<_Pred>::type>::value &&
----------------
I would considering inlining this below. I don't think it pulls its weigh anymore after this simplification.

================
Comment at: libcxx/include/__algorithm/make_projected.h:101-102
 decltype(auto) __make_projected_comp(_Comp& __comp, _Proj1& __proj1, _Proj2& __proj2) {
   if constexpr (same_as<decay_t<_Proj1>, identity> && same_as<decay_t<_Proj2>, identity> &&
                 !is_member_pointer_v<decay_t<_Comp>>) {
     // Avoid creating the lambda and just use the pristine comparator -- for certain algorithms, this would enable
----------------

================
Comment at: libcxx/include/__type_traits/is_comparable.h:9
+
+#ifndef _LIBCPP___TYPE_TRAITS_IS_COMPARABLE_H
+#define _LIBCPP___TYPE_TRAITS_IS_COMPARABLE_H
----------------
Let's name this `is_equality_comparable.h` since this is about equality.

================
Comment at: libcxx/include/__type_traits/is_comparable.h:34-35
+
+template <class _Tp, class _Up>
+struct __is_trivially_equality_comparable
+    : integral_constant<bool,
----------------
Please write a comment explaining which types are considered trivially comparable and what's the reasoning behind that. In particular, please cover why e.g. `float` is not included, why it's OK to disregard cv-qualifiers, etc. This is a pretty fundamental type trait, it's worth putting some effort into it.

================
Comment at: libcxx/include/__type_traits/is_comparable.h:41-42
+// TODO: Use is_pointer_inverconvertible_base_of
+template <class _Tp, class _Up>
+struct __is_trivially_equality_comparable<_Tp*, _Up*>
+    : integral_constant<
----------------
Can you please explain in a comment why this specialization is the way it is. What are the pitfalls, etc.

================
Comment at: libcxx/include/__type_traits/is_predicate_kind.h:1
+//===----------------------------------------------------------------------===//
+//
----------------
I would suggest naming this file `predicate_traits.h`. We can then have

```
__is_trivial_equality_predicate<Pred, T1, T2>
__is_trivial_weak_order_predicate<Pred, T1, T2> // to be added in a separate patch
```

Also `__is_simple_comparator` from `sort.h` should move here instead.

================
Comment at: libcxx/include/cstring:117

-template <class _Tp>
+template <class _Tp, class _Up>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 int
----------------
I think we should catch the case where we're calling this function with types for which `memcmp` doesn't do the right thing. I'd be happy with even a `static_assert`, I just think it would be silly to have a bug where we e.g. compare virtual-base/derived pointers and do the wrong thing at runtime when it could be avoided so easily.

================
Comment at: libcxx/test/support/test_iterators.h:1448-1449

+template <class T>
+using cv_qualified_ptr = type_list<T*, const T*, volatile T*, const volatile T*>;
 } // namespace meta
----------------
I think it would be better to instead introduce `cv_qualified_versions<T>`. Then you call it as `cv_qualified_versions<T>` and apply the pointer outside of that -- that'll be more reusable if we need this  facility without a pointer type.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139554/new/

https://reviews.llvm.org/D139554