[libcxx-commits] [PATCH] D131395: [libc++] Implement `lexicographical_compare_three_way`

Tue Aug 23 16:40:16 PDT 2022

avogelsgesang added inline comments.

================
Comment at: libcxx/include/__algorithm/lexicographical_compare_three_way.h:44-45
+                  "A non-integral difference_type is undefined behavior");
+    static_assert(is_integral_v<typename iterator_traits<_InputIterator2>::difference_type>,
+                  "A non-integral difference_type is undefined behavior");
+    auto __len1    = __last1 - __first1;
----------------
philnik wrote:
> Mordante wrote:
> > At Discord we discovered concepts give nicer error messages. Can you test whether that's true here too, by using `integral` instead of `is_integral_v`?
> > 
> > Since you use two asserts, lets give additional information; alternatively merge them in one assert.
> Instead of having a lot of `static_assert`s here, maybe we want to add the exposition only concepts for `Cpp17Iterator`s from the standard somewhere and `static_assert` that here?
> Can you test whether that's true here too, by using integral instead of is_integral_v?

https://godbolt.org/z/rKdPq9joo

Given that we implemented the `integral ` as
```
concept integral = is_integral_v<_Tp>;
```

using a concept here doesn't improve the error message. It only adds a more complicated backtrace to the error message which does not really provide much value.

================
Comment at: libcxx/include/__algorithm/lexicographical_compare_three_way.h:44-45
+                  "A non-integral difference_type is undefined behavior");
+    static_assert(is_integral_v<typename iterator_traits<_InputIterator2>::difference_type>,
+                  "A non-integral difference_type is undefined behavior");
+    auto __len1    = __last1 - __first1;
----------------
avogelsgesang wrote:
> philnik wrote:
> > Mordante wrote:
> > > At Discord we discovered concepts give nicer error messages. Can you test whether that's true here too, by using `integral` instead of `is_integral_v`?
> > > 
> > > Since you use two asserts, lets give additional information; alternatively merge them in one assert.
> > Instead of having a lot of `static_assert`s here, maybe we want to add the exposition only concepts for `Cpp17Iterator`s from the standard somewhere and `static_assert` that here?
> > Can you test whether that's true here too, by using integral instead of is_integral_v?
> 
> https://godbolt.org/z/rKdPq9joo
> 
> Given that we implemented the `integral ` as
> ```
> concept integral = is_integral_v<_Tp>;
> ```
> 
> using a concept here doesn't improve the error message. It only adds a more complicated backtrace to the error message which does not really provide much value.
> [...] maybe we want to add the exposition only concepts for Cpp17Iterators from the standard somewhere and static_assert that here?

You mean

```
static_assert(__cpp17_iterator<_InputIterator1>, "calling lexicographical_compare_three_way with a non-standard-compliant iterators is undefined behavior");
static_assert(__cpp17_iterator<_InputIterator2>, "calling lexicographical_compare_three_way with a non-standard-compliant iterators is undefined behavior");
```

?

================
Comment at: libcxx/include/__algorithm/lexicographical_compare_three_way.h:48
+    auto __len2    = __last2 - __first2;
+    auto __min_len = __len1 < __len2 ? __len1 : __len2;
+    for (decltype(__min_len) __i = 0; __i < __min_len;) {
----------------
mumbleskates wrote:
> philnik wrote:
> > Mordante wrote:
> > > philnik wrote:
> > > > Mordante wrote:
> > > > > Why not using `std::min` here? I assume it is because the types of `__len1` and `__len2` can differ.
> > > > > 
> > > > > I'm not happy with all the `auto`s in this code; it's hard to understand what the intended types are and whether these are the correct intended types. Please remove the `auto`s so it's clear what the types are.
> > > > > 
> > > > > I really would like something along the lines of
> > > > > ```
> > > > > __iter_diff_t<_InputIterator1> __len1 = __last1 - __first1;
> > > > > __iter_diff_t<_InputIterator2> __len2 = __last2 - __first2;
> > > > > auto __min = std::min<common_type_t<decltype(__len1), decltype(__len2)>(__len1, __len2));
> > > > > ```
> > > > > This is more verbose, but at least I can understand how the types are defined and whether this is as intended.
> > > > > 
> > > > I very much disagree with you here. IMO the `__iter_diff_t<_InputIterator1>` and so on just make the code more verbose and give the option to get it wrong. `__last1 - __first1` will always return a difference type. That's not exactly surprising. Also, you want to use `decltype()` for the min for some reason. I guess because otherwise it's //really// long. How does that help with understanding the code?
> > > It helps since the rules of the type of the conditional expression are not simple. I had to verify with the standard it does the right thing. So instead of spending a few seconds to validate these 3 lines I had to spend several minutes.
> > > 
> > > Code should be optimized for understanding by humans, `auto` quite often saves the writer from typing a few characters. (The compiler doesn't care either way it does its thing.)
> > > 
> > > The verbose code helps to communicate what the author of the code intended to happen. Relaying on some (not always well understood) language rules means it's less clear for the reader to understand what the writer intended. Both may have a different understanding of these rules.
> > > 
> > > I also considered to use 3 types
> > > ```
> > > using _Len1 = __iter_diff_t<_InputIterator1>;
> > > using _Len2 = __iter_diff_t<_InputIterator2>;
> > > using _Min = common_type_t<_Len1, _Len2>;
> > > 
> > > _Len1 __len1 = __last1 - __first1;
> > > _Len2 __len2 = __last2 - __first2;
> > > 
> > > _Min __min = std::min<_Min>(__len1, __len2);
> > > ```
> > > 
> > > I don't mind that solution either.
> > > It helps since the rules of the type of the conditional expression are not simple. I had to verify with the standard it does the right thing. So instead of spending a few seconds to validate these 3 lines I had to spend several minutes.
> > 
> > I don't see how `common_type` helps with the type of the conditional not being simple. That's exactly what `common_type` uses to get the type in this case AFAICT (https://eel.is/c++draft/type.traits#meta.trans.other-3). Plus is has a heap of other conditionals that are hard to get through.
> > 
> > > Code should be optimized for understanding by humans, `auto` quite often saves the writer from typing a few characters. (The compiler doesn't care either way it does its thing.)
> > I agree that code should be written to make it easier to read. IMO littering the code with types I don't really care about makes it harder to read. i.e. I don't really care that `a - b` returns a value of type `__iter_diff_t<_InputIterator1>`, but now I have to check that you actually named the correct type. Thinking about it, `integral auto __len1 = __last1 - __first1` would be great. Not sure how much compile time overhead that would incur though. WDYT?
> > 
> > > The verbose code helps to communicate what the author of the code intended to happen. [Relying] on some (not always well understood) language rules means it's less clear for the reader to understand what the writer intended. Both may have a different understanding of these rules.
> > But you are still relying on these rules through `common_type` AFAICT.
> > 
> The point is that, while we are still relying on the exact same semantic under the hood, we are making an effort to spell out what we expect to happen, making it explicit that we thought of this case and fully intend this outcome. We can also ensure that future readers might also understand this happens when they might not have thought of it, whether in understanding what our code is doing or in making their own implementation some time in the future.
> 
> I like @Mordante's most recent version with the `using` declarations.
I personally don't have an opinion here. Please come to an agreement here (or agree to disagree, and nevertheless agree with merging this code).

I am happy to update this patch in whichever way you agree on, but I want to avoid changing it forth and back (an earlier version of this review was actually written in terms of the difference type and changed based on feedback in this review)

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.three.way/lexicographical_compare_three_way.pass.cpp:33
+      std::lexicographical_compare_three_way(Iter1{a.begin()}, Iter1{a.end()}, Iter2{b.begin()}, Iter2{b.end()});
+  assert(expected == result);
+}
----------------
Mordante wrote:
> I don't mind this way of testing, but since the function is expressed in terms of the version with a `compare_three_way` it would have been possible to test this version against that version.
> Something along the lines of:
> 
> ```
> auto expected = std::lexicographical_compare_three_way(Iter1{a.begin()}, Iter1{a.end()}, Iter2{b.begin()}, Iter2{b.end()}, std::compare_three_way());
> std::same_as<decltype(expected)> auto result = std::lexicographical_compare_three_way(Iter1{a.begin()}, Iter1{a.end()}, Iter2{b.begin()}, Iter2{b.end()});
> 
> assert(expected == result);
> ```
> That way we might even remove some tests here and relay on the other test to provide the coverage.
> 
not sure how much that would actually give us. Instead of 
```
test_lexicographical_compare<Iter1, Iter2>(array{0, 1}, array{0, 2}, std::strong_ordering::less);
```
I could write
```
test_lexicographical_compare<Iter1, Iter2>(array{0, 1}, array{0, 2});
```

but I would still have to enumerate all the different input arrays, iterator types etc. Given this would only save few lines, I prefer to keep the test cases as they currently are.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131395/new/

https://reviews.llvm.org/D131395