[libcxx-commits] [PATCH] D128611: [libc++] Implement `std::ranges::merge`

Sun Jun 26 15:46:16 PDT 2022

huixie90 added inline comments.

================
Comment at: libcxx/include/__algorithm/ranges_merge.h:42
+    template <
+        std::input_iterator _InIt1,
+        std::sentinel_for<_InIt1> _Sent1,
----------------
philnik wrote:
> philnik wrote:
> > Please don't annotate these unnecessarily.
> Please use `_InIter` and `_OutIter`. We discussed this on discord a while back and decided on these.
Could you please clarify what do you mean by "annotate"? If it means the `std::`, I am happy to remove it. Or if I miss anything?

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:37-38
+
+template <class... Args>
+concept canMerge = requires(Args&&... args) { std::ranges::merge(std::forward<Args>(args)...); };
+
----------------
philnik wrote:
> Could you list the arguments and default them to something sensible? The current test is very hard to read. Also, please rename this to `HasMerge` to go with the naming style in the other algorhtm tests.
regarding "default them to something sensible", I disagree. I actually had a read on the "range_move" test and I saw this 
```
static_assert(HasMoveIt<int*>);
```
I was very confused about what it means. I thought it means it is ok to do `std::ranges::move(int*)`, which doesn't make sense to me.

I think for all the arguments that can have sensible defaults, the `std::ranges::merge` already have the defaults, e.g. comparator, projection. for all other arguments, two input iterators, two sentinels, one output iterator, they don't have sensible defaults otherwise the stl would give them. creating some defaults which changes the order of the arguments than the std::ranges::merge's arguments would make the tests even harder to read. 

Part of the problem is that this algorithm takes 5 iterator arguments, which makes reading the code really hard. But it is similar to read the actual caller's code (with the same arguments order). I would rather keep it as it is than creating new confusion.

regarding `HasMerge`, I am happy to change if it is the libcxx's convention. Personally I found `HasMerge` not very suitable for algorithms as it sounds like " type T has a member function `merge` " rather than `the algorithm merge can be called with these types"

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:100
+
+// Negative tests: !std::sentinel_for<S2, I2>
+static_assert(!canMerge<int*, int*, int*, long*, int*>);
----------------
philnik wrote:
> We have `NotSentinelForX` for this.
Could you please point me where I can find it? I couldn't find it through grep

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:108-111
+// Negative tests: !std::indirectly_copyable<I1, Out>
+static_assert(!canMerge<int*, int*, int*, int*, const int*>);
+static_assert(!canMerge<Bar*, Bar*, Foo*, Foo*, Foo*, CompareOverloadSet>);
+static_assert(!canMerge<MoveOnly*, MoveOnly*, MoveOnly*, MoveOnly*, MoveOnly*>);
----------------
philnik wrote:
> I don't think we need three tests for the same diagnostic. This is only to ensure that we have constrained the algorithm, not to test the concepts.
but we could have constrained the algorithm in different wrong ways. 

for example, if we constrained the algorithm `same_as<In1Iter, OutIter> && same_as<In2Iter, OutIter>`, the first two tests would still pass as the constraint is not met and only the 3rd test would catch it.

If we constrained the algorithm to just `std::indirectly_copyable<In2Iter, OutIter>` and forget about `In1Iter`, the 1st and 3rd tests would still pass as the constraint is not met and only the 2nd test would catch it.

If we constrain the algorithm as `same_as<iter_value_t<In1Iter>, iter_value_t<OutIter>>`, only the 1st test would catch it.

All these hypothetical constraints look reasonable at the first glance without really thinking about a bit more. so i think it would be good to have a bit more negative tests

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:207-209
+  std::array i1{0, 1, 5, 6, 9, 10};
+  std::array i2{3, 6, 7, 9, 13, 15, 100};
+  std::array expected{0, 1, 3, 5, 6, 6, 7, 9, 9, 10, 13, 15, 100};
----------------
philnik wrote:
> This should be tested with different inputs.
will add more tests for different inputs

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:263-307
+struct TracedCopyMove {
+  int copy_ctor   = 0;
+  int move_ctor   = 0;
+  int copy_assign = 0;
+  int move_assign = 0;
+
+  int data = 0;
----------------
philnik wrote:
> This class seems to be a little over-engineered. 
> ```
>     struct CopyOnce {
>       bool copied = false;
>       constexpr CopyOnce() = default;
>       constexpr CopyOnce(const CopyOnce& other) = delete;
>       constexpr CopyOnce& operator=(const CopyOnce& other) {
>         assert(!other.copied);
>         copied = true;
>         return *this;
>       }
>     };
> ```
> should do the trick.
I think this is insufficient.  If the algorithm is mistakenly moving the elements instead of copying the elements, with this type it would still pass because the move fallback to copy.

also if I write the test with `back_inserter` (which I think it is worth adding), I think it won't call the assignment operator and it should call the copy constructor instead

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:324-347
+    using In1s = type_list<
+        cpp20_input_iterator<int*>,
+        forward_iterator<int*>,
+        bidirectional_iterator<int*>,
+        random_access_iterator<int*>,
+        contiguous_iterator<int*> >;
+
----------------
philnik wrote:
> This seems overly complicated. It would be a lot easier to just have a few functions to make the cartesian product.
I don't think writing few functions can lead less code (as a caller) as you have to list these types anyway.

I wish I could use boost::hana but I think if we can lift this cartesian_product as a general test utilities it would be really useful. writing several concrete test functions to simulate cartesian_product wouldn't be reusable 

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:352-360
+    std::array r1{3, 6, 7, 9};
+    std::array<int, 7> out;
+    std::same_as<
+        std::ranges::
+            merge_result<std::ranges::iterator_t<std::array<int, 4>&>, std::ranges::dangling, int*>> decltype(auto)
+        result = std::ranges::merge(r1, std::array{2, 3, 4}, out.data());
+    assert(result.in1 == r1.end());
----------------
philnik wrote:
> Could you make the cartesian product of this? (Just copy-paste it and change the types.)
again, "just copy-paste" doesn't sound right to me even it is in the test. I really think we should have tested cartesian_product (for types) utility 

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:356
+        std::ranges::
+            merge_result<std::ranges::iterator_t<std::array<int, 4>&>, std::ranges::dangling, int*>> decltype(auto)
+        result = std::ranges::merge(r1, std::array{2, 3, 4}, out.data());
----------------
philnik wrote:
> Please try to avoid using meta-programming to check meta-programming. This should simply be `std::array<int, 4>::iterator`, or (IMO the better option) use `a.data() + a.size()` or a C-style array and make it `int*`.
hmm. I don't think I would call this meta-programming. I always use `std::ranges::iterator_t` because this is the "interface" function that stl gives us. We don't have to remember how to spell the exact type of the iterators of all the ranges.

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:385
+          std::ranges::merge(r1.begin(), r1.end(), r2.begin(), r2.end(), out.data());
+      verify(result, out);
+    }
----------------
philnik wrote:
> Could you just copy-paste the asserts? That makes it a lot easier to verify at a glance.
I am happy to change it if this is the libcxx convention. I don't feel very comfortable to copy-paste a block of code where I can write a simple local lambda

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:399
+  {
+    std::array r1{IntAndID{0, 0}, IntAndID{0, 1}, IntAndID{0, 2}};
+    std::array r2{IntAndID{1, 0}, IntAndID{1, 1}, IntAndID{1, 2}};
----------------
philnik wrote:
> Could you put the structs locally? `IntAndID` isn't the most helpful name. (When you do that you might as well rename it to `S`, but I don't care much)
making it local struct would make defining `operator<=>` really inconvenient (I don't think you can have hidden friend for local class). 

regarding name. it is just an int for the `==` purposes and another identifier for me to track their original order is respected. what about `IntAndOrder`?

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:400
+    std::array r1{IntAndID{0, 0}, IntAndID{0, 1}, IntAndID{0, 2}};
+    std::array r2{IntAndID{1, 0}, IntAndID{1, 1}, IntAndID{1, 2}};
+
----------------
philnik wrote:
> I also like CTAD, but here I think it would be easier to use `std::array<IntAndID, 3>`.
how can you avoid writing the `IntAndID` when you init these elements?

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:424
+
+  // Equal elements in R1 should be merged before equal elements in R2
+  {
----------------
philnik wrote:
> This seems to be a copy-paste of the last test.
Could you please clarify what exactly means?

The previous test point is to test the algorithm is stable. equal elements within the same range are kept in the same order.

This test is to test that for equal elements across two different input ranges , the element in first range should be inserted first.

The key difference is that the previous test the equal elements are within same range, but for this test, the equal elements are across two ranges.

Anyway, I think part of the confusion was caused by the bad name of the class `IntAndID`. I think I am going to create two classes for the purpose of this test

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.merge/ranges_merge.pass.cpp:517
+  }
+
+  return true;
----------------
philnik wrote:
> You are missing test cases for
> - using `std::invoke`
> - complexity
> - that the return type of the comparator is convertible to bool
> - value edge-cases (like copying zero-size ranges)
will add these tests. thanks.

for `std::invoke`, do you mean by using `std::invoke(std::ranges::merge, x, y, z)` 

================
Comment at: libcxx/test/std/library/description/conventions/customization.point.object/niebloid.compile.pass.cpp:101
 static_assert(test(std::ranges::max_element, a));
-//static_assert(test(std::ranges::merge, a, a, a));
+static_assert(test(std::ranges::merge, a, a, a));
 static_assert(test(std::ranges::min, a));
----------------
philnik wrote:
> There should also be tests in `ranges_robust_against_copying_comparators.pass.cpp` and `ranges_robust_against_copying_projections.pass.cpp`.
thanks. I wasn't aware of that

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128611/new/

https://reviews.llvm.org/D128611