[libcxx-commits] [PATCH] D118029: Introduce branchless sorting functions for sort3, sort4 and sort5.

Arthur O'Dwyer via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Jan 26 11:18:33 PST 2022


Quuxplusone added a comment.

> I noticed that the benchmarks for size 4 are generally pretty bad. Do you know if anything can be done there?

I noticed this also, but assume that we don't really care about the performance of `std::sort` on arrays of size 4 (or less). ;) If we're trading some perf on unrealistically small arrays, for better perf on large arrays, that sounds like a good tradeoff to me.
But I, also, have not particularly reviewed this code. Just glanced briefly.



================
Comment at: libcxx/include/__algorithm/sort.h:152
+inline _LIBCPP_HIDDEN
+    __enable_if_t<_VSTD::is_integral< typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+    __sort3_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3, _Compare __c) {
----------------
philnik wrote:
> Ditto below
And vice versa, `_VSTD::__sort3_internal`. All function calls (that take arguments of user-defined types, anyway) need qualification specifically so that you don't get ADL. Type/variable/concept/... names don't need qualification because they never do ADL.
Also, almost certainly `_VSTD::__sort3_internal<_Compare>`, right? We should have a test for "no unnecessary copies of predicates" that catches this, and if it's not failing on this PR then it needs a bit of investigation why not.


================
Comment at: libcxx/include/__algorithm/sort.h:160
+inline _LIBCPP_HIDDEN
+    __enable_if_t<!_VSTD::is_integral< typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+    __sort3_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3, _Compare __c) {
----------------
philnik wrote:
> Is there a reason you restricted this to integrals? Does it pessimize some code? I guessing at least all primitive types should be allowed. I haven't benchmarked, but looking at code generation it seems floating points should also benefit from this change.
`is_scalar` might be relevant.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118029/new/

https://reviews.llvm.org/D118029



More information about the libcxx-commits mailing list