[libcxx-commits] [PATCH] D129823: [libc++][ranges] Make range algorithms support proxy iterators

Hui via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sat Jul 16 14:42:31 PDT 2022


huixie90 accepted this revision.
huixie90 added a comment.

LGTM with green ci
we could follow up with more testing for `iter_move`



================
Comment at: libcxx/include/__algorithm/make_heap.h:28
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_AFTER_CXX11
 void __make_heap(_RandomAccessIterator __first, _RandomAccessIterator __last, _Compare& __comp) {
   using _CompRef = typename __comp_ref_type<_Compare>::type;
----------------
var-const wrote:
> huixie90 wrote:
> > we have two different patterns for these `__` functions. 
> > 1. the `__last` has a different template parameter type. and the `__` calls `next` to get the last iterator. the ranges counter part directly call it.
> > 2. the `__last` has the same type and the ranges counter-part calls `next` then call into `__` function.
> > 
> > The problem is that when these two patterns co-exists, let's say `__a` takes two different types, and `__b` takes the same type. If `__a` needs to call `__b`, it could fail because `__b` expecting the same type.
> > 
> > I think we should make all algorithms consistent.
> This should be caught by testing, so I don't think it should cause any runtime issues, but it can definitely be confusing when reading the code. However, sometimes it reflects the nature of the algorithm -- for some algorithms it is essential that `__last` is a random-access or bidirectional iterator, and for those it makes sense to reflect that in the signature. For others, `__last` is only ever used to serve as a guard, so they can trivially support sentinels. Perhaps using different names rather than `__last` having two meanings would make this a little more readable?
 it won't trigger runtime issues , just compile time issues. It is not a matter of whether the `Opts::next` is called on the caller or here. I am not sure what is the best approach, I prefer to making all `__` algorithms to take an iterator and a sentinel of different types (obviously it would still work if you pass two iterators of same type), so that both classic and ranges algorithms can call it and won't worry about calling `opts::next`. It is the `__` algorithm to decide whether to call `opts::next` or not. But in any case, we could revisit later if we are getting confused


================
Comment at: libcxx/include/__algorithm/nth_element.h:76
         {
-            _VSTD::__selection_sort<_Compare>(__first, __last, __comp);
+            std::__selection_sort<_AlgPolicy, _Compare>(__first, __last, __comp);
             return;
----------------
question: does this algorithm have the "robust against copy comparator test"? right now the comparator is copied several times but I guess this is fine


================
Comment at: libcxx/include/__algorithm/pop_heap.h:39
   if (__len > 1) {
     value_type __top = std::move(*__first);  // create a hole at __first
+    _RandomAccessIterator __hole = std::__floyd_sift_down<_AlgPolicy, _CompRef>(__first, __comp_ref, __len);
----------------
should this call `iter_move`?


================
Comment at: libcxx/include/__algorithm/sort.h:44
+  using _Comp = _CompT;
+  _Comp& __comp;
+
----------------
I am slightly concerned about the reference member because if we are going to copy the comparator around we have risk of having dangling reference. but I guess the trick is that we never copy the comparators?


================
Comment at: libcxx/include/__algorithm/sort.h:54
+template <class _PolicyT, class _CompT>
+struct _WrapAlgPolicy<_PolicyT, _CompT, std::enable_if_t<std::is_same<_PolicyT, _ClassicAlgPolicy>::value>> {
+  using type = _CompT;
----------------
`> >` in the end to support c++03?


================
Comment at: libcxx/include/__algorithm/sort.h:71
+template <class _Wrapped>
+struct _UnwrapAlgPolicy<_Wrapped, std::enable_if_t<!std::is_same<typename _Wrapped::_AlgPolicy, int>::value>> {
+  using _AlgPolicy = typename _Wrapped::_AlgPolicy;
----------------
where does the `int` come from? How can a policy be an `int`?


================
Comment at: libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp:14
+//
+// Range algorithms should work with proxy iterators. For example, the implementations should use `iter_swap` (which is
+// a customization point) rather than plain `swap` (which might not work with certain valid iterators).
----------------
var-const wrote:
> huixie90 wrote:
> > It is great to have a test for `iter_swap`. But this test won't cover the `iter_move`. If an algorithm doesn't use `iter_move` but use `std::move(*it)`, it would still compile (for copyable types) because `value_type x = std::move(*it)` would probably ends with copies.
> > 
> > For algorithms that do move stuff, we should probably test them with proxy iterator of move only types so that it will catch incorrect use of `std::move`
> This is a great point and actually something I thought about -- we should probably test _all_ customization points. I'd prefer to do this in a follow-up, though -- I think this patch is already quite big and it makes the coverage strictly better even if it doesn't achieve one hundred percent coverage. Does that sound reasonable to you? To be clear, I plan to do the follow-up immediately after landing this. I want to land this patch ASAP because the changes to `sort.h` and friends are going to cause a lot of merge conflicts for the not-yet-implemented algorithms (e.g. `inplace_merge`).
> 
> By the way, do you think we should try to add all checks to the same `ProxyIterator` class or create a separate test iterator (and sentinel and range) for each customization point?
not all algorithms support move only types. and I think it is sufficient to use `Proxy<MoveOnly>` for the test. 


================
Comment at: libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp:28-45
+
+template <class Func, std::ranges::range Input, class ...Args>
+void test(Func&& func, Input& in, Args&& ...args) {
+  func(in.begin(), in.end(), std::forward<Args>(args)...);
+  func(in, std::forward<Args>(args)...);
+}
+
----------------
var-const wrote:
> huixie90 wrote:
> > these tests are great and they can catch some problems immediately. and thanks for adding them.
> > 
> > However, since these tests don't actually verify the results of the algorithms, I think we should add additional tests in individual tests in the case-by-case manner.
> > 
> > For example, for sorting, we can test given a vector<int>, we can create a ProxyRange, sort the ProxyRange and verify the original vector is sorted
> > 
> Please see my other comment above -- in short, I think it would be great to add `ProxyIterator` as one of the iterator types that instantiate `test_iterators` across our algorithms. I wouldn't add specific test cases, though, because I don't think those can be maintained well across all the ~100 algorithms. In either case, this is something that I feel should be done in a follow-up.
we don't need to test all 100 algorithms as most algorithms won't swap elements. I do agree that this test covers 80%+ problems. I just feel more confident if we actually checking that the result is correct. I think the "sorting" algorithms are worth checking the result


================
Comment at: libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp:152
+  //test(std::ranges::prev_permutation, in, binary_pred);
+  //test(std::ranges::next_permutation, in, binary_pred);
+}
----------------
var-const wrote:
> huixie90 wrote:
> > var-const wrote:
> > > Uninitialized memory algorithms don't work with `ProxyIterator` because it doesn't satisfy `nothrow-input-iterator` that requires the result of dereferencing the iterator to be the same as `iter_reference_t`. I'm not sure if it's possible to write a different valid proxy iterator that would satisfy that constraint but still require special handling in the implementation. @huixie90 What do you think?
> > The `nothrow-xxx-iterator` explicitly requires `std::same_as<std::remove_cvref_t<std::iter_reference_t<I>>, std::iter_value_t<I>>`, which basically means `value_type& == reference`, which rules out all the proxy iterators. It makes sense for writing to uninitialized memory because we do need the address of `*it` to be something that we can write to (not the proxies). So for most uninitialized algorithms, they are designed not to work with proxies. So we can't test them.
> > 
> > However, there are two exceptions. for `uninitialized_copy` and `uninitialized_move`, the inputs and outputs are separate. So in theory the inputs can be `ProxyRange` and the outputs can be a simple array which does model  `nothrow-xxx-iterator` 
> Thanks a lot for looking into this! It's very helpful. It seems that the following constraint:
> ```
>     requires constructible_from<iter_value_t<_OutputIterator>, iter_reference_t<_InputIterator>>
> ```
> also prevents proxy iterators from being used with `uninitialized_{copy,move}`?
hmm. that is strange. 
let's say the input range is
```
std::array a{1,2,3};
ProxyRange inputs{a};
```
and output is 
```
std::array<Proxy<int>, 3> out;
```
`iter_value_t<_OutputIterator>` should be `Proxy<int>` and `iter_reference_t<_InputIterator>` should be `Proxy<int&>`
and `Proxy<int>` should be constructible from `Proxy<int&>`.
did I miss anything?


================
Comment at: libcxx/test/support/test_iterators.h:881-882
 
+  // If `T` is a reference type, the default assignment operator will be deleted.
+  constexpr Proxy& operator=(const Proxy& rhs) {
+    data = rhs.data;
----------------
var-const wrote:
> huixie90 wrote:
> > hmm.  the two overloads above should already cover this. see the test
> > https://github.com/llvm/llvm-project/blob/8922adf646eead207fb367dace80bba2abfab2ad/libcxx/test/support/test.support/test_proxy.pass.cpp#L63
> > 
> > or did I missing something?
> I don't want to spend too much time investigating this just because of the deadline coming soon (although it's a pretty interesting issue). The test fails when the assigned type is `const` (which currently isn't tested).
> 
> My (unconfirmed) suspicion is that the implicitly-generated default `operator=` is a better match when the type on the right side is `const Proxy&`. Choosing between a template and a non-template function, overload resolution prefers a non-template function if both are an equally good match; `const Proxy&` is an exact match for the signature of the default `operator=`.
> 
> Confusingly, however, this works for the move assignment operator, even though the same logic about it being a better match seems to apply. cppreference says that "A deleted implicitly-declared move assignment operator is ignored by overload resolution". Presumably this doesn't apply to a deleted implicitly-declared _copy_ assignment operator, which would explain the difference in behavior.
> 
hmm. I suspect the tests are relying on the implicit conversion from T to Proxy<T>. (my original assignment explicitly check if `rhs` is `Proxy` and your new one allows implicit conversions. The class wasn't designed for `Proxy<T> =  T` but if that makes the test easier it would be fine. I will take a look later and it shouldn't block your patch. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129823/new/

https://reviews.llvm.org/D129823



More information about the libcxx-commits mailing list