[libcxx-commits] [PATCH] D118029: Introduce branchless sorting functions for sort3, sort4 and sort5.

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Feb 1 06:45:57 PST 2022


philnik added a comment.

> Random 4 is actually very good. It's bad for deterministic orderings because the benchmark just keeps on calling sort4 with the same ordering over and over again, so the branch predictor has a very easy job. I suspect this type of data pattern is not very realistic.

Ok, that makes sense.
I checked the sorting logic now and that part LGTM. Now the only question is which types should be allowed.



================
Comment at: libcxx/include/__algorithm/sort.h:160
+inline _LIBCPP_HIDDEN
+    __enable_if_t<!_VSTD::is_integral< typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+    __sort3_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3, _Compare __c) {
----------------
marcogelmi wrote:
> marcogelmi wrote:
> > Quuxplusone wrote:
> > > philnik wrote:
> > > > Is there a reason you restricted this to integrals? Does it pessimize some code? I guessing at least all primitive types should be allowed. I haven't benchmarked, but looking at code generation it seems floating points should also benefit from this change.
> > > `is_scalar` might be relevant.
> > this relies on cmovs, on larger data types it could pessimize the code if it compiles to extra copies instead, or not compile at all if it's not trivially copyable.
> switched to is_arithmetic and added benchmarks here: https://bit.ly/3rgv2r9
Could we generalize it to something like `sizeof(typename iterator_traits<_RandomAccessIterator>::value_type) <= sizeof(size_t) && is_trivially_copyable_v<typename iterator_traits<_RandomAccessIterator>::value_type>`? I'm assuming here that `sizeof(size_t)` is less than or equal to the word size. This would allow this optimization to be performed on something like `std::optional<int>`. Could you check if that still yields a performance improvement? `std::optional` doesn't have a trivial comparison operator, so it might be interesting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118029/new/

https://reviews.llvm.org/D118029



More information about the libcxx-commits mailing list