[libcxx-commits] [PATCH] D122780: Modify std::sort: add BlockQuickSort partitioning algorithm for arithmetic types
Nikolas Klauser via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Apr 22 11:51:50 PDT 2022
philnik added a comment.
I haven't checked the logic yet, because I'm still not convinced that it's a good idea. While the numbers for random data look great, every other pattern is pessimized a bit to a lot with trivial arithmetic types. The numbers for the non-arithmetic types look good across the board.
================
Comment at: libcxx/include/__algorithm/sort.h:234
template <class _Compare, class _RandomAccessIterator>
void __insertion_sort_3(_RandomAccessIterator __first, _RandomAccessIterator __last, _Compare __comp) {
typedef typename iterator_traits<_RandomAccessIterator>::difference_type difference_type;
----------------
Why is this named `_3`? That's quite confusing.
================
Comment at: libcxx/include/__algorithm/sort.h:268
+ if (__comp(*__i, *__j)) {
+ value_type __t(_VSTD::move(*__i));
+ _RandomAccessIterator __k = __j;
----------------
You can use `std::` in new code. Applies throughout the patch.
================
Comment at: libcxx/include/__algorithm/sort.h:358-359
+ static __storage_t __blsr(__storage_t x) {
+ // _blsr_u64 can be used here but it did not make any performance
+ // difference in practice.
+ return x ^ (x & -x);
----------------
I don't think the comment is necessary here. Could you add a `__libcpp_blsr` in the same style as the `__libcpp_popcount` functions?
================
Comment at: libcxx/include/__algorithm/sort.h:362-364
+ static int __clz(__storage_t x) { return __builtin_clzll(x); }
+ static int __ctz(__storage_t x) { return __builtin_ctzll(x); }
+ static int __popcount(__storage_t x) { return __builtin_popcountll(x); }
----------------
That would allow you to make this a template instead of having the same thing twice.
Do you even use `__32bit_set` anywhere? I couldn't find it.
Could you just use `uint64_t` and `uint32_t` directly and just call the `__libcpp` functions instead of having the extra indirection?
================
Comment at: libcxx/include/__algorithm/sort.h:399-400
+// pivot. Returns the iterator for the pivot and a bool value which is true if
+// the provided range is already sorted, false otherwise. We assume that the
+// length of the range is at least three elements.
+//
----------------
Could you add a `_LIBCPP_ASSERT` for the precondition?
================
Comment at: libcxx/include/__algorithm/sort.h:402-404
+// We term the partitioning algorithm as bitset partitioning since the
+// outcomes of the comparisons between the pivot and other elements are stored
+// in the fixed size bitsets.
----------------
Could you shorten this to something like `// __bitset_partition uses bitsets for storing the outcomes of the comparisons`? I think that makes it clear.
================
Comment at: libcxx/include/__algorithm/sort.h:407
+_LIBCPP_HIDE_FROM_ABI _VSTD::pair<_RandomAccessIterator, bool>
+__bitset_partition(_RandomAccessIterator __first, _RandomAccessIterator __last, _Compare __comp) {
+ typedef typename _VSTD::iterator_traits<_RandomAccessIterator>::value_type value_type;
----------------
Could you split this up into a few more functions? There are very clear sections to divide this functions.
================
Comment at: libcxx/include/__algorithm/sort.h:559-560
+// pivot. Returns the iterator for the pivot and a bool value which is true if
+// the provided range is already sorted, false otherwise. We assume that the
+// length of the range is at least three elements.
+template <class _RandomAccessIterator, class _Compare>
----------------
Again, please add a `_LIBCPP_ASSERT`.
================
Comment at: libcxx/include/__algorithm/sort.h:660-662
+ _LIBCPP_CONSTEXPR_AFTER_CXX11 difference_type __limit = 24;
+ // Lower bound for using Tuckey's ninther technique for median computation.
+ _LIBCPP_CONSTEXPR_AFTER_CXX11 difference_type __ninther_threshold = 128;
----------------
Why can't these be `_LIBCPP_CONSTEXPR`?
================
Comment at: libcxx/include/__algorithm/sort.h:685-686
}
- if (__len <= __limit) {
- _VSTD::__insertion_sort_3<_Compare>(__first, __last, __comp);
+ // Use insertion sort if the length of the range is below the specified
+ // limit.
+ if (__len < __limit) {
----------------
================
Comment at: libcxx/include/__algorithm/sort.h:750-751
}
- // sort smaller range with recursive call and larger with tail recursion elimination
- if (__i - __first < __last - __i) {
- _VSTD::__introsort<_Compare>(__first, __i, __comp, __depth);
- __first = ++__i;
- } else {
- _VSTD::__introsort<_Compare>(__i + difference_type(1), __last, __comp, __depth);
- __last = __i;
- }
+ // Sort the left partiton recursively and the right partition with tail
+ // recursion elimination.
+ _VSTD::__introsort<_Compare, _RandomAccessIterator, _UseBitSetPartition>(__first, __i, __comp, __depth, __leftmost);
----------------
================
Comment at: libcxx/include/__algorithm/sort.h:772-774
+ // Only use bitset partitioning for arithmetic types. We should also check
+ // that the default comparator is in use so that we are sure that there are no
+ // branches in the comparator.
----------------
Where do you check that the default comparator is used?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122780/new/
https://reviews.llvm.org/D122780
More information about the libcxx-commits
mailing list