[libcxx-commits] [PATCH] D118029: Introduce branchless sorting functions for sort3, sort4 and sort5.
Arthur O'Dwyer via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Wed Jan 26 11:18:33 PST 2022
Quuxplusone added a comment.
> I noticed that the benchmarks for size 4 are generally pretty bad. Do you know if anything can be done there?
I noticed this also, but assume that we don't really care about the performance of `std::sort` on arrays of size 4 (or less). ;) If we're trading some perf on unrealistically small arrays, for better perf on large arrays, that sounds like a good tradeoff to me.
But I, also, have not particularly reviewed this code. Just glanced briefly.
================
Comment at: libcxx/include/__algorithm/sort.h:152
+inline _LIBCPP_HIDDEN
+ __enable_if_t<_VSTD::is_integral< typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+ __sort3_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3, _Compare __c) {
----------------
philnik wrote:
> Ditto below
And vice versa, `_VSTD::__sort3_internal`. All function calls (that take arguments of user-defined types, anyway) need qualification specifically so that you don't get ADL. Type/variable/concept/... names don't need qualification because they never do ADL.
Also, almost certainly `_VSTD::__sort3_internal<_Compare>`, right? We should have a test for "no unnecessary copies of predicates" that catches this, and if it's not failing on this PR then it needs a bit of investigation why not.
================
Comment at: libcxx/include/__algorithm/sort.h:160
+inline _LIBCPP_HIDDEN
+ __enable_if_t<!_VSTD::is_integral< typename iterator_traits<_RandomAccessIterator>::value_type>::value, void>
+ __sort3_internal(_RandomAccessIterator __x1, _RandomAccessIterator __x2, _RandomAccessIterator __x3, _Compare __c) {
----------------
philnik wrote:
> Is there a reason you restricted this to integrals? Does it pessimize some code? I guessing at least all primitive types should be allowed. I haven't benchmarked, but looking at code generation it seems floating points should also benefit from this change.
`is_scalar` might be relevant.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118029/new/
https://reviews.llvm.org/D118029
More information about the libcxx-commits
mailing list