[libcxx-commits] [libcxx] [libc++] Optimize ranges::{for_each, for_each_n} for segmented iterators (PR #132896)
Peng Liu via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Mar 28 07:00:15 PDT 2025
================
----------------
winner245 wrote:
Here are the new benchmark results for `join_view` of `vector<vector<int>>`. The benchmark size is the total number of `int` elements, and the segment size is 256 (so there are roughly `n / 256` segments).
```
-----------------------------------------------------------------------------------------
Benchmark                                                  Before       After   slow-down
-----------------------------------------------------------------------------------------
std::for_each_n(join_view(vector<vector<int>>))/8         5.44 ns     6.31 ns
std::for_each_n(join_view(vector<vector<int>>))/32        20.2 ns     20.1 ns
std::for_each_n(join_view(vector<vector<int>>))/50        39.0 ns     32.1 ns
std::for_each_n(join_view(vector<vector<int>>))/1024       618 ns      627 ns          1%
std::for_each_n(join_view(vector<vector<int>>))/8192      3769 ns     5016 ns         33%
std::for_each_n(join_view(vector<vector<int>>))/16384     7567 ns    10052 ns         32%
std::for_each_n(join_view(vector<vector<int>>))/32768    15071 ns    20122 ns         34%
std::for_each_n(join_view(vector<vector<int>>))/65536    30099 ns    40203 ns         34%
std::for_each_n(join_view(vector<vector<int>>))/131072   59801 ns    80324 ns         34%
std::for_each_n(join_view(vector<vector<int>>))/262144  120641 ns   174557 ns         45%
std::for_each_n(join_view(vector<vector<int>>))/524288  255354 ns   325490 ns         27%
std::for_each_n(join_view(vector<vector<int>>))/1048576 523049 ns   665009 ns         27%
```
The results show a slow-down of roughly 27% to 45% (about 33% in most cases) for sizes of 8192 and above, caused by the additional traversal of the whole sequence in `std::next(__first, __n)`. WDYT? Should we simply drop the `join_view` optimization?
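To make the cost of the extra traversal concrete, here is a minimal sketch that counts iterator increments for a non-random-access iterator. It is not the code in this patch: `two_pass_for_each_n` and `one_pass_for_each_n` are hypothetical helpers, and the first loop in `two_pass_for_each_n` only stands in for what `std::next(__first, __n)` has to do when the iterator cannot jump.

```cpp
#include <cstddef>
#include <list>

// Hypothetical helper: first compute the end iterator by stepping n times
// (a stand-in for std::next(first, n) on a non-random-access iterator),
// then visit [first, last). Returns the total number of increments.
template <class Iter, class Func>
std::size_t two_pass_for_each_n(Iter first, std::size_t n, Func f) {
  std::size_t steps = 0;
  Iter last = first;
  for (std::size_t i = 0; i < n; ++i) {
    ++last;  // first traversal: only locates the end
    ++steps;
  }
  for (; first != last; ++first) {
    f(*first);  // second traversal: does the actual work
    ++steps;
  }
  return steps;
}

// Hypothetical helper: a single counted loop, visiting each element once.
template <class Iter, class Func>
std::size_t one_pass_for_each_n(Iter first, std::size_t n, Func f) {
  std::size_t steps = 0;
  for (std::size_t i = 0; i < n; ++i, ++first) {
    f(*first);
    ++steps;
  }
  return steps;
}
```

For a `std::list<int>` with 5 elements, the two-pass version performs 10 increments and the one-pass version performs 5, while both visit the same elements.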
Another possibility I can try is to speed up `std::next(__first, __n)` for segmented iterators. For a non-random-access segmented range whose segments are themselves random-access ranges (e.g., `join_view` of `vector<vector<T>>`, `vector<deque<T>>`, or `deque<vector<T>>`), we can advance the iterator in O(1) time within each segment, which effectively speeds up `std::next(__first, __n)` by a factor of about `segment_size`. I think this would need a separate patch.
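A rough sketch of the segment-wise advance idea, written against a plain `vector<vector<int>>` rather than libc++'s segmented-iterator machinery. `SegPos` and `advance_by_segments` are hypothetical names used only for illustration, and the sketch assumes at least `n` elements remain after the starting position.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical position inside a vector<vector<int>>:
// which segment we are in, and the offset within that segment.
struct SegPos {
  std::size_t segment;
  std::size_t offset;
};

// Advance `pos` by `n` elements. Each segment is random access, so a whole
// segment is skipped with one subtraction instead of element-by-element steps.
// Assumes at least `n` elements remain; the end position is
// {segs.size(), 0}.
inline SegPos advance_by_segments(const std::vector<std::vector<int>>& segs,
                                  SegPos pos, std::size_t n) {
  while (pos.segment < segs.size()) {
    std::size_t remaining = segs[pos.segment].size() - pos.offset;
    if (n < remaining) {
      pos.offset += n;  // target lies inside the current segment
      return pos;
    }
    n -= remaining;  // skip the rest of this segment in O(1)
    ++pos.segment;
    pos.offset = 0;
  }
  return pos;  // ran off the last segment: end position
}
```

The loop runs once per segment crossed rather than once per element, so with 256-element segments it takes roughly `n / 256` iterations instead of `n`.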
https://github.com/llvm/llvm-project/pull/132896