[libcxx-commits] [PATCH] D151717: [libc++][PSTL] Add a GCD backend
Louis Dionne via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Jul 10 14:46:54 PDT 2023
ldionne added inline comments.
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:47-48
+
+// Preliminary size of each chunk: requires further discussion
+constexpr ptrdiff_t __default_chunk_size = 2048;
+
----------------
Not used anymore.
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:120-132
+  auto __compute_chunk = [&__last1, &__last2, __iterate_first_range, &__comp](
+                             auto& __iter1, auto& __iter2, auto& __out, size_t __increment_count) -> __merge_range_t {
+    auto [__mid1, __mid2] = [&] {
+      if (__iterate_first_range) {
+        auto __m1 = __iter1 + __increment_count;
+        auto __m2 = std::lower_bound(__iter2, __last2, __m1[-1], __comp);
+        return std::make_pair(__m1, __m2);
----------------
As discussed, this can be simplified a bit:
```
auto __compute_chunk = [&](size_t __chunk_size) -> __merge_range_t {
  auto [__mid1, __mid2] = [&] {
    if (__iterate_first_range) {
      auto __m1 = __first1 + __chunk_size;
      auto __m2 = std::lower_bound(__first2, __last2, __m1[-1], __comp);
      return std::make_pair(__m1, __m2);
    } else {
      auto __m2 = __first2 + __chunk_size;
      auto __m1 = std::lower_bound(__first1, __last1, __m2[-1], __comp);
      return std::make_pair(__m1, __m2);
    }
  }();
  __merge_range_t __ret{__mid1, __mid2, __result};
  __result += (__mid1 - __first1) + (__mid2 - __first2);
  __first1 = __mid1;
  __first2 = __mid2;
  return __ret;
};
```
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:150
+
+  __libdispatch::__dispatch_apply(__ranges.size(), [&](size_t __index) {
+    auto __last_iters = __ranges[__index];
----------------
As discussed live, you could manually `emplace_back` the first iterators of the entire range into the vector, which would let you remove the special casing for `__index == 0`: you'd always look at `__ranges[__index]` and `__ranges[__index + 1]` here.
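A rough sketch of that idea (variable names are illustrative, the member order of `__merge_range_t` is taken from the snippet above, and `std::merge` stands in for whatever per-chunk merge is used):
```
// Seed the vector with the very first iterators so that chunk __index is
// always delimited by __ranges[__index] and __ranges[__index + 1].
__ranges.emplace_back(__merge_range_t{__first1, __first2, __result});
// ... append the remaining chunk boundaries via __compute_chunk ...

__libdispatch::__dispatch_apply(__ranges.size() - 1, [&](size_t __index) {
  auto [__chunk_first1, __chunk_first2, __ignored] = __ranges[__index];
  auto [__chunk_last1, __chunk_last2, __chunk_out] = __ranges[__index + 1];
  std::merge(__chunk_first1, __chunk_last1, __chunk_first2, __chunk_last2, __chunk_out, __comp);
});
```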
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:198-200
+        __reduction(__first + __index + 2,
+                    __first + __index + __this_chunk_size,
+                    __combiner(__transform((__first + __index)[0], (__first + __index)[1]))));
----------------
This doesn't work if a chunk contains only 1 element (or if that's the case for `__first_chunk_size`). We should have a test that exercises that. I guess the expected behavior in that case is that we'd just use `std::construct_at(__values + __index, __transform(*__first));`.
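Something along these lines (a sketch only; names follow the quoted snippet, and the offset math may not match the actual patch):
```
if (__this_chunk_size == 1) {
  // Single-element chunk: there is nothing to reduce, just transform the one element.
  std::construct_at(__values + __index, __transform(*(__first + __index)));
} else {
  std::construct_at(__values + __index,
                    __reduction(__first + __index + 2,
                                __first + __index + __this_chunk_size,
                                __combiner(__transform((__first + __index)[0], (__first + __index)[1]))));
}
```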
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:204
+ return std::__terminate_on_exception([&] {
+ return std::transform_reduce(__values, __values + __partitions.__chunk_count_, __init, __transform, __reduction);
+ });
----------------
This should only be a `reduce`: `__values` already contains the transformed per-chunk results. The mistake is catchable with a test where the transform returns a type different from the input's value type, so test coverage should be added for that.
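Roughly (a sketch reusing the names from the snippet above):
```
return std::__terminate_on_exception([&] {
  // __values was already filled with __transform applied per element,
  // so the final combination across chunks only needs the reduction.
  return std::reduce(__values, __values + __partitions.__chunk_count_, __init, __reduction);
});
```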
================
Comment at: libcxx/include/__algorithm/pstl_backends/cpu_backends/libdispatch.h:212
+    _RandomAccessIterator __first, _RandomAccessIterator __last, _Comp __comp, _LeafSort __leaf_sort) {
+  __leaf_sort(__first, __last, __comp);
+}
----------------
Can we instead run the `sort` in parallel by chunking and then run a serial merge? That's not too hard to implement and it's much better than doing everything serially. We can leave a TODO for parallelizing the `merge`, since we don't have a clear way to do that yet.
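Something like this, for example (a rough sketch only; it reuses `__libdispatch::__dispatch_apply` as quoted above, and the chunk count is a placeholder for whatever the chunk partitioning logic decides):
```
ptrdiff_t __size        = __last - __first;
ptrdiff_t __chunk_count = 8; // placeholder: should come from the chunk partitioning logic
ptrdiff_t __chunk_size  = (__size + __chunk_count - 1) / __chunk_count;
auto __chunk_begin      = [&](ptrdiff_t __i) { return __first + std::min(__size, __i * __chunk_size); };

// Sort every chunk in parallel.
__libdispatch::__dispatch_apply(__chunk_count, [&](size_t __chunk) {
  __leaf_sort(__chunk_begin(__chunk), __chunk_begin(__chunk + 1), __comp);
});

// TODO: parallelize the merge once we have a good strategy for it.
// For now, serially fold each sorted chunk into the already-sorted prefix.
for (ptrdiff_t __chunk = 1; __chunk < __chunk_count; ++__chunk)
  std::inplace_merge(__first, __chunk_begin(__chunk), __chunk_begin(__chunk + 1), __comp);
```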
================
Comment at: libcxx/test/std/algorithms/alg.modifying.operations/alg.copy/fuzz.pstl.copy.sh.cpp:1
+//===----------------------------------------------------------------------===//
+//
----------------
Those tests should be moved to the `fuzz` patch.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D151717/new/
https://reviews.llvm.org/D151717