[libcxx-commits] [libcxx] [libc++] Vectorize std::mismatch with trivially equality comparable types (PR #87716)

Louis Dionne via libcxx-commits libcxx-commits at lists.llvm.org
Fri Apr 5 10:25:07 PDT 2024


================
@@ -119,6 +120,31 @@ __mismatch(_Tp* __first1, _Tp* __last1, _Tp* __first2, _Pred& __pred, _Proj1& __
   return std::__mismatch_loop(__first1, __last1, __first2, __pred, __proj1, __proj2);
 }
 
+template <class _Tp,
+          class _Pred,
+          class _Proj1,
+          class _Proj2,
+          __enable_if_t<!is_integral<_Tp>::value && __desugars_to_v<__equal_tag, _Pred, _Tp, _Tp> &&
+                            __is_identity<_Proj1>::value && __is_identity<_Proj2>::value &&
+                            __can_map_to_integer_v<_Tp> && __libcpp_is_trivially_equality_comparable<_Tp, _Tp>::value,
+                        int> = 0>
+_LIBCPP_NODISCARD _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 pair<_Tp*, _Tp*>
+__mismatch(_Tp* __first1, _Tp* __last1, _Tp* __first2, _Pred& __pred, _Proj1& __proj1, _Proj2& __proj2) {
+  if (__libcpp_is_constant_evaluated()) {
+    return std::__mismatch_loop(__first1, __last1, __first2, __pred, __proj1, __proj2);
+  } else {
+    using __integer_t = __copy_cv_t<_Tp, __get_as_integer_type<_Tp>>;
+    // This is valid because we disable TBAA when loading vectors. Alignment requirements still have to be fulfilled.
+    auto __ret = std::__mismatch(
+        reinterpret_cast<__integer_t*>(__first1),
----------------
ldionne wrote:

I think this is really nice, but I am not super comfortable with the level of "playing with fire" we're doing here. This is technically UB, and in fact during this review you even discovered a case where this is actually UB (when we call `__mismatch_loop`).

If we want to move forward with this patch, I'd like to have a more principled way of ensuring that the compiler understands what we're doing. It would also be nice not to have to use `__may_alias__` unconditionally in `__load_vector`, since that's a really low-level function. It seems wrong to couple that function with this higher-level `mismatch` optimization.
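
To make the aliasing concern concrete, here is a minimal sketch of one well-defined way to compare objects through their integer representation: copy the bytes out with `std::memcpy` instead of dereferencing a `reinterpret_cast`ed pointer. This is an illustration only, not what the patch does and not necessarily the approach I'd want in libc++. `Rgba`, `load_as_u32` and `mismatch_bytes` are hypothetical names made up for the example, and it assumes a 4-byte type with no padding whose `operator==` is equivalent to a bytewise comparison.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical example type: four bytes, no padding, equality is bytewise.
struct Rgba {
  std::uint8_t r, g, b, a;
  friend bool operator==(const Rgba& x, const Rgba& y) {
    return x.r == y.r && x.g == y.g && x.b == y.b && x.a == y.a;
  }
};
static_assert(sizeof(Rgba) == sizeof(std::uint32_t), "sketch assumes no padding");

// Hypothetical helper: read the object representation as an integer.
// std::memcpy is well-defined here, whereas reading through
// reinterpret_cast<const std::uint32_t*>(p) would violate strict aliasing.
inline std::uint32_t load_as_u32(const Rgba* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Scalar mismatch over the integer representation (no vectorization here).
inline std::pair<const Rgba*, const Rgba*>
mismatch_bytes(const Rgba* first1, const Rgba* last1, const Rgba* first2) {
  while (first1 != last1 && load_as_u32(first1) == load_as_u32(first2)) {
    ++first1;
    ++first2;
  }
  return {first1, first2};
}
```

Compilers typically fold a fixed-size `memcpy` like this into a plain load, so the point of the sketch is only that the access is expressed in a form the compiler is allowed to reason about.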

https://github.com/llvm/llvm-project/pull/87716

