[libcxx-commits] [PATCH] D118029: Introduce branchless sorting functions for sort3, sort4 and sort5.

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Feb 10 07:26:03 PST 2022


ldionne added inline comments.


================
Comment at: libcxx/include/__algorithm/sort.h:128
+template <typename _Compare, typename _RandomAccessIterator>
+inline _LIBCPP_HIDDEN void __cond_swap(_RandomAccessIterator __x, _RandomAccessIterator __y, _Compare __c) {
+  using value_type = typename iterator_traits<_RandomAccessIterator>::value_type;
----------------
marcogelmi wrote:
> ldionne wrote:
> > Also, does the `inline` make a difference in your benchmarks? If not, I'd drop it, because they are semantically equivalent (templates have inline semantics by default).
> > 
> > This comment applies below in several places too.
> Removing the inline seems to negatively affect the performance for some combinations of tests and architectures. Would it be ok to keep the inline for extra safety?
Can you share those numbers? IMO that's a LLVM bug, since we like to say that `inline` doesn't have any impact on the codegen. But yeah, at the end of the day, if it makes a positive difference we should keep it.


================
Comment at: libcxx/include/__algorithm/sort.h:201
+    __enable_if_t<!is_arithmetic<typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+    __sort5_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3,
+                     _RandomAccessIterator __x4, _RandomAccessIterator __x5, _Compare __c) {
----------------
marcogelmi wrote:
> ldionne wrote:
> > Instead, would it be possible to put the `enable_if_t` on `__sort5` directly, and then only add one overload of `__sort5` for when `is_arithmetic` is true? This would avoid introducing a new layer of complexity like `__sort5_internal`. This applies elsewhere too.
> unfortunately the original version of __sortN also returns the number of swaps, while the new one doesn't, so we need both versions
Right -- that actually highlights the problem I have with the name chosen (`__sort5_internal`). There's nothing "internal" about that version of `__sort5`. Instead, we could rename it to `__sort5_maybe_branchless`? It's not great, but at least it explains what it does.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118029/new/

https://reviews.llvm.org/D118029



More information about the libcxx-commits mailing list