[libcxx-commits] [PATCH] D131395: [libc++] Implement `lexicographical_compare_three_way`

Mon Aug 29 09:32:16 PDT 2022

Mordante added inline comments.

================
Comment at: libcxx/include/__algorithm/lexicographical_compare_three_way.h:48
+    auto __len2    = __last2 - __first2;
+    auto __min_len = __len1 < __len2 ? __len1 : __len2;
+    for (decltype(__min_len) __i = 0; __i < __min_len;) {
----------------
avogelsgesang wrote:
> mumbleskates wrote:
> > philnik wrote:
> > > Mordante wrote:
> > > > philnik wrote:
> > > > > Mordante wrote:
> > > > > > Why not using `std::min` here? I assume it is because the types of `__len1` and `__len2` can differ.
> > > > > > 
> > > > > > I'm not happy with all the `auto`s in this code; it's hard to understand what the intended types are and whether these are the correct intended types. Please remove the `auto`s so it's clear what the types are.
> > > > > > 
> > > > > > I really would like something along the lines of
> > > > > > ```
> > > > > > __iter_diff_t<_InputIterator1> __len1 = __last1 - __first1;
> > > > > > __iter_diff_t<_InputIterator2> __len2 = __last2 - __first2;
> > > > > > auto __min = std::min<common_type_t<decltype(__len1), decltype(__len2)>(__len1, __len2));
> > > > > > ```
> > > > > > This is more verbose, but at least I can understand how the types are defined and whether this is as intended.
> > > > > > 
> > > > > I very much disagree with you here. IMO the `__iter_diff_t<_InputIterator1>` and so on just make the code more verbose and give the option to get it wrong. `__last1 - __first1` will always return a difference type. That's not exactly surprising. Also, you want to use `decltype()` for the min for some reason. I guess because otherwise it's //really// long. How does that help with understanding the code?
> > > > It helps since the rules of the type of the conditional expression are not simple. I had to verify with the standard it does the right thing. So instead of spending a few seconds to validate these 3 lines I had to spend several minutes.
> > > > 
> > > > Code should be optimized for understanding by humans, `auto` quite often saves the writer from typing a few characters. (The compiler doesn't care either way it does its thing.)
> > > > 
> > > > The verbose code helps to communicate what the author of the code intended to happen. Relaying on some (not always well understood) language rules means it's less clear for the reader to understand what the writer intended. Both may have a different understanding of these rules.
> > > > 
> > > > I also considered to use 3 types
> > > > ```
> > > > using _Len1 = __iter_diff_t<_InputIterator1>;
> > > > using _Len2 = __iter_diff_t<_InputIterator2>;
> > > > using _Min = common_type_t<_Len1, _Len2>;
> > > > 
> > > > _Len1 __len1 = __last1 - __first1;
> > > > _Len2 __len2 = __last2 - __first2;
> > > > 
> > > > _Min __min = std::min<_Min>(__len1, __len2);
> > > > ```
> > > > 
> > > > I don't mind that solution either.
> > > > It helps since the rules of the type of the conditional expression are not simple. I had to verify with the standard it does the right thing. So instead of spending a few seconds to validate these 3 lines I had to spend several minutes.
> > > 
> > > I don't see how `common_type` helps with the type of the conditional not being simple. That's exactly what `common_type` uses to get the type in this case AFAICT (https://eel.is/c++draft/type.traits#meta.trans.other-3). Plus is has a heap of other conditionals that are hard to get through.
> > > 
> > > > Code should be optimized for understanding by humans, `auto` quite often saves the writer from typing a few characters. (The compiler doesn't care either way it does its thing.)
> > > I agree that code should be written to make it easier to read. IMO littering the code with types I don't really care about makes it harder to read. i.e. I don't really care that `a - b` returns a value of type `__iter_diff_t<_InputIterator1>`, but now I have to check that you actually named the correct type. Thinking about it, `integral auto __len1 = __last1 - __first1` would be great. Not sure how much compile time overhead that would incur though. WDYT?
> > > 
> > > > The verbose code helps to communicate what the author of the code intended to happen. [Relying] on some (not always well understood) language rules means it's less clear for the reader to understand what the writer intended. Both may have a different understanding of these rules.
> > > But you are still relying on these rules through `common_type` AFAICT.
> > > 
> > The point is that, while we are still relying on the exact same semantic under the hood, we are making an effort to spell out what we expect to happen, making it explicit that we thought of this case and fully intend this outcome. We can also ensure that future readers might also understand this happens when they might not have thought of it, whether in understanding what our code is doing or in making their own implementation some time in the future.
> > 
> > I like @Mordante's most recent version with the `using` declarations.
> I personally don't have an opinion here. Please come to an agreement here (or agree to disagree, and nevertheless agree with merging this code).
> 
> I am happy to update this patch in whichever way you agree on, but I want to avoid changing it forth and back (an earlier version of this review was actually written in terms of the difference type and changed based on feedback in this review)
> > It helps since the rules of the type of the conditional expression are not simple. I had to verify with the standard it does the right thing. So instead of spending a few seconds to validate these 3 lines I had to spend several minutes.
> 
> I don't see how `common_type` helps with the type of the conditional not being simple. That's exactly what `common_type` uses to get the type in this case AFAICT (https://eel.is/c++draft/type.traits#meta.trans.other-3). Plus is has a heap of other conditionals that are hard to get through.

It's at least clear that it's intended to use `common_type` and not one of the other rules of the conditional expression. IMO auto should never be used with the conditional expression.

> 
> > Code should be optimized for understanding by humans, `auto` quite often saves the writer from typing a few characters. (The compiler doesn't care either way it does its thing.)
> I agree that code should be written to make it easier to read. IMO littering the code with types I don't really care about makes it harder to read. i.e. I don't really care that `a - b` returns a value of type `__iter_diff_t<_InputIterator1>`, but now I have to check that you actually named the correct type. Thinking about it, `integral auto __len1 = __last1 - __first1` would be great. Not sure how much compile time overhead that would incur though. WDYT?

I care about these types when I try to understand the code and validate whether the author wrote the code correctly. 
It takes me as reader a lot longer to validate the code. Auto has its uses but it shouldn't be used everywhere just to make it easy for the writer. I still like the explicit type better either directly or by using a typedef.
For the `operator-` I don't dislike this too strongly; but as said above for the conditional expression I do.

> > The verbose code helps to communicate what the author of the code intended to happen. [Relying] on some (not always well understood) language rules means it's less clear for the reader to understand what the writer intended. Both may have a different understanding of these rules.
> But you are still relying on these rules through `common_type` AFAICT.

Yes but as said above it makes it clear that `common_type` is intended to be used. (I agree `common_type` has it's complexity too.)

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.three.way/lexicographical_compare_three_way.pass.cpp:27
+
+using std::array;
+
----------------
avogelsgesang wrote:
> Mordante wrote:
> > Please use the qualified name in the tests.
> any particular reason? Would it be fine if I used a function-local `using std::array`?
To make it clear which array is used. Sometimes we use similar names as standard names. In general we don't use using globally, sometimes in a function. There usually the nested namespace being tested like `std::chrono` or the `literal`s namespaces.

================
Comment at: libcxx/test/std/algorithms/alg.sorting/alg.three.way/lexicographical_compare_three_way_comp.verify.cpp:18
+//       -> decltype(comp(*b1, *b2));
+
+#include <array>
----------------
avogelsgesang wrote:
> Mordante wrote:
> > Can you also test with iterators with invalid difference types?
> do we already have such a test utility somewhere? I couldn't find anything useful in `test_iterators.h`
I think we don't. `almost_satisfies_types.h` seems to be the better place for such an iterator.

================
Comment at: libcxx/test/support/test_comparisons.h:253
+  friend std::partial_ordering operator<=>(PartialComp, PartialComp) = default;
+};
+#endif
----------------
avogelsgesang wrote:
> mumbleskates wrote:
> > avogelsgesang wrote:
> > > Mordante wrote:
> > > > Why not complete the set with `std::strong_ordering`?
> > > because for `strong_ordering` you can simply use a plain `int`. Or should I still add it for completeness?
> > honestly you could probably just write `using StrongComp = int;`.
> > 
> > in the tests i've written so far i have used integral types for strong, floating points for partial, and only used structs for weak orderings.
> > 
> > to that end, it would be useful if PartialComp had an avenue to actually return `partial_ordering::unordered`. you could keep its member typed as an `int` and use `INT_MIN` as a sentinel for the unordered value, which could even allow us to test heterogenous orderable/unorderable values `constexpr` in gcc (which currently(?) does not allow comparing infinities and NaNs against different values in constant evaluation).
> > 
> > For additional completeness here we would add a `UserComp` struct whose `operator<=>` returns a `UserOrdering` typed value that implements the appropriate operators against literal zero; such types are useful for SFINAE testing and types that utilize `synth-three-way`.
> > complete the set with `std::strong_ordering`
> 
> done
> 
> > `PartialComp` had an avenue to actually return `partial_ordering::unordered`
> 
> done
> 
> > `UserComp` struct whose `operator<=>` returns a `UserOrdering`
> 
> Added. But now gcc-11 crashes
GCC-12 too? We don't officially support GCC-11 anymore.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131395/new/

https://reviews.llvm.org/D131395