[libcxx-commits] [PATCH] D118029: Introduce branchless sorting functions for sort3, sort4 and sort5.
Louis Dionne via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Feb 7 12:58:57 PST 2022
ldionne requested changes to this revision.
ldionne added a comment.
This revision now requires changes to proceed.
Thanks for the patch. So, to summarize, these ternary conditionals generate `cmov` instead of normal branches when the types involved are arithmetic, right? And that's basically where the improvements come from.
This looks sensible to me, the benchmarks show a fairly consistent improvement for the sizes that we care most about.
================
Comment at: libcxx/include/__algorithm/sort.h:128
+template <typename _Compare, typename _RandomAccessIterator>
+inline _LIBCPP_HIDDEN void __cond_swap(_RandomAccessIterator __x, _RandomAccessIterator __y, _Compare __c) {
+ using value_type = typename iterator_traits<_RandomAccessIterator>::value_type;
----------------
Also, does the `inline` make a difference in your benchmarks? If not, I'd drop it, because they are semantically equivalent (templates have inline semantics by default).
This comment applies below in several places too.
================
Comment at: libcxx/include/__algorithm/sort.h:201
+ __enable_if_t<!is_arithmetic<typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+ __sort5_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3,
+ _RandomAccessIterator __x4, _RandomAccessIterator __x5, _Compare __c) {
----------------
Instead, would it be possible to put the `enable_if_t` on `__sort5` directly, and then only add one overload of `__sort5` for when `is_arithmetic` is true? This would avoid introducing a new layer of complexity like `__sort5_internal`. This applies elsewhere too.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118029/new/
https://reviews.llvm.org/D118029
More information about the libcxx-commits
mailing list