[libcxx-commits] [PATCH] D151521: [libc++] Optimize transform_reduce for floating point types

Fri May 26 11:43:07 PDT 2023

ldionne added inline comments.

================
Comment at: libcxx/include/__numeric/reduce.h:38
+_LIBCPP_HIDE_FROM_ABI constexpr _Tp reduce(_InputIterator __first, _InputIterator __last, _Tp __init, _BinaryOp __b) {
+  _LIBCPP_CLANG_PRAGMA("clang loop vectorize(enable)")
+  for (; __first != __last; ++__first)
----------------
Same here.

================
Comment at: libcxx/include/__numeric/transform_reduce.h:51
+           __is_trivial_operation<_ReductionOp, _Tp, _Tp>::value &&
+           is_same_v<__iter_value_type<_InputIterator1>, _Tp> && is_same_v<__iter_value_type<_InputIterator2>, _Tp>)
+_LIBCPP_HIDE_FROM_ABI constexpr _Tp transform_reduce(
----------------
We shouldn't be optimizing for user-defined types even if the operation is "trivial" (also applies to the other algo).

This makes me think that we need to revisit our definition of a trivial operation. `std::plus<>` is trivial but only for a specific set of types. And in fact I am not certain what we mean by "is trivial" anymore.

================
Comment at: libcxx/include/__numeric/transform_reduce.h:59-63
+  _LIBCPP_CLANG_PRAGMA("clang loop vectorize(enable)")
+  for (; __first1 != __last1; ++__first1, (void)++__first2)
+    __init = __b1(__init, __b2(*__first1, *__first2));
+  return __init;
+}
----------------
I would call the `unseq` version in the PSTL instead, that's more general and we'll end up using the appropriate backend.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151521/new/

https://reviews.llvm.org/D151521