[libcxx-commits] [PATCH] D130695: [libc++][ranges]Refactor `copy{, _backward}` and `move{, _backward}`

Fri Sep 9 06:56:07 PDT 2022

ldionne added inline comments.

================
Comment at: libcxx/include/__algorithm/copy_move_common.h:54-87
+// Wrapper functors with SFINAE.
+
+struct __copy_trivial {
+  // At this point, the iterators have been unwrapped so any `contiguous_iterator` has been unwrapped to a pointer.
+  template <class _In, class _Out, __enable_if_t< is_trivially_assignable<_Out&, _In&>::value, int > = 0>
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 pair<_In*, _Out*>
+  operator()(_In* __first, _In* __last, _Out* __result) const {
----------------
IMO those should be defined in `copy.h`, `move.h`, `copy_backward.h` and `move_backward.h` instead.

================
Comment at: libcxx/include/__algorithm/copy_move_common.h:34
+  const size_t __n = static_cast<size_t>(__last - __first);
+  std::memmove(__result, __first, __n * sizeof(_Out));
+
----------------
var-const wrote:
> philnik wrote:
> > var-const wrote:
> > > philnik wrote:
> > > > ldionne wrote:
> > > > > philnik wrote:
> > > > > > Why did you get rid of calling `__builtin_memmove` during constant evaluation?
> > > > > GCC didn't support it, which added a bunch of complexity. So now what we do instead is unwrap the iterators and use the loop version when we're inside constant evaluation. The performance during constant evaluation is less of a concern than performance at runtime.
> > > > AFAICT the only complexity added by GCC not supporting it is an extra `#ifndef _LIBCPP_COMPILER_GCC`. How does calling `__builtin_memmove` result in worse runtime performance? IMO we shouldn't massively pessimize constant evaluation performance for removing a few lines of code.
> > > > 
> > > > For example:
> > > > ```
> > > > #include <cstddef>
> > > > 
> > > > constexpr void copy_arrays(int* begin, int* end, int* out) {
> > > > #ifdef USE_MEMMOVE
> > > >   ::__builtin_memmove(out, begin, static_cast<std::size_t>(end - begin));
> > > > #else
> > > >   while (begin != end) {
> > > >     *out = *begin;
> > > >     ++begin;
> > > >     ++out;
> > > >   }
> > > > #endif
> > > > }
> > > > 
> > > > constexpr bool use_data() {
> > > >   int a[100000] = {};
> > > >   int b[100000] = {};
> > > >   ::copy_arrays(a, a + 100000, b);
> > > > 
> > > >   return true;
> > > > }
> > > > 
> > > > static_assert(use_data());
> > > > ```
> > > > Without using `__builtin_memmove` takes 200ms to compile, with `__builtin_memmove` is takes 30ms.
> > > The original form of this code was (with slight NFC reformatting):
> > > ```
> > > if (__libcpp_is_constant_evaluated()
> > > #ifndef _LIBCPP_COMPILER_GCC
> > >       && !is_trivially_copyable<_InValueT>::value
> > > #endif
> > >      ) {
> > >     return std::__copy_impl<_InValueT*, _InValueT*, _OutValueT*>(__first, __last, __result);
> > >   }
> > >   const size_t __n = static_cast<size_t>(__last - __first);
> > >   if (__n > 0)
> > >     ::__builtin_memmove(__result, __first, __n * sizeof(_OutValueT));
> > >   return std::make_pair(__first + __n, __result + __n);
> > > ```
> > > Unless I'm missing something obvious, `__builtin_memmove` was never called during constant evaluation.
> > To me it looks like it was called when the type `is_trivially_copyable`.
> Okay, I restored using `__builtin_memmove` during constant evaluation, with the necessary additional conditions. I also added a test to make sure the `is_trivially_copyable` condition is honored.
> 
> I'm still not sure if it's worth the extra complexity (although now the complexity is all in one place, making this a little more manageable). @ldionne What do you think? I'm somewhat on the fence and can be convinced either way.
Yeah, I think restoring the `__builtin_memmove` behavior is probably worth it given the surprising (to me) compile-time speedup.

================
Comment at: libcxx/include/__algorithm/move.h:32
+    while (__first != __last) {
+      *__result = _IterOps<_AlgPolicy>::__iter_move(__first);
+      ++__first;
----------------
var-const wrote:
> huixie90 wrote:
> > question: if the input is  a contiguous iterator pointing to a trivially copyable/moveable type, but the iterator has specialised `iter_move` ( yes it is very contrived), would the `move` dispatch to the overload that unwraps the  contiguous iterator to pointer and call `memmove` without going through the specialised `iter_move`?
> I think this is a great point, thanks. I'm inclined to leave it as is for this patch because it's a preexistent thing, but I will address it in a follow-up.
Is this really something we should fix? Aren't we allowed to bypass the iterator's `iter_move`?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130695/new/

https://reviews.llvm.org/D130695