[libcxx-commits] [PATCH] D131395: [libc++] Implement `lexicographical_compare_three_way`
Kent Ross via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Sat Aug 20 14:19:30 PDT 2022
mumbleskates added inline comments.
================
Comment at: libcxx/include/__algorithm/lexicographical_compare_three_way.h:46
+ }
+ return __len1 <=> __len2;
+ } else {
----------------
philnik wrote:
> JohelEGP wrote:
> > mumbleskates wrote:
> > > avogelsgesang wrote:
> > > > JohelEGP wrote:
> > > > > philnik wrote:
> > > > > > JohelEGP wrote:
> > > > > > > philnik wrote:
> > > > > > > > avogelsgesang wrote:
> > > > > > > > > mumbleskates wrote:
> > > > > > > > > > philnik wrote:
> > > > > > > > > > > JohelEGP wrote:
> > > > > > > > > > > > huixie90 wrote:
> > > > > > > > > > > > > agree with @philnik that providing benchmark to show the difference.
> > > > > > > > > > > > >
> > > > > > > > > > > > > question: the standard specifies `return (last1 - first1) <=> (last2 - first2);` but here you are using `size_t` instead of iterator's `difference_type`. so effectively
> > > > > > > > > > > > > `return static_cast<size_t>(last1 - first1) <=> static_cast<size_t>(last2 - first2);`. Does this have an observable difference?
> > > > > > > > > > > > It should if the difference type is an integer-class type.
> > > > > > > > > > > I don't think it has any observable effects, but we shouldn't cast to different types, since that can prohibit some optimizations.
> > > > > > > > > > i'm not even 100% sure what is allowed to be assumed about `difference_type`; i would definitely avoid using `size_t` here and stick entirely to the iterator's difference type.
> > > > > > > > > Using `difference_type` now, and also only using this optimization if both iterator types have the same, integral `difference_type`
> > > > > > > > `difference_type` is guaranteed to be a signed integral (http://eel.is/c++draft/iterator.traits#2). Otherwise your program if ill-formed. I'd just use `ptrdiff_t` as the type here. No need to not optimize just because the `difference_type`s are different.
> > > > > > > > difference_type is guaranteed to be a signed integral (http://eel.is/c++draft/iterator.traits#2).
> > > > > > >
> > > > > > > Not guaranteed, but it's a requirement:
> > > > > > > > (4.1) If an algorithm's template parameter is named InputIterator, InputIterator1, or InputIterator2, the template argument shall meet the Cpp17InputIterator requirements ([input.iterators]).
> > > > > > > > -- https://eel.is/c++draft/algorithms.requirements#4.1
> > > > > > > > 1# A class or pointer type X meets the requirements of an input iterator for the value type T if X meets the Cpp17Iterator ([iterator.iterators]) and Cpp17EqualityComparable (Table 29) requirements and the expressions in Table 85 are valid and have the indicated semantics.
> > > > > > > > -- https://eel.is/c++draft/input.iterators#1
> > > > > > > > (2.2) iterator_traits<X>::difference_type is a signed integer type or void, and
> > > > > > > > -- https://eel.is/c++draft/iterator.iterators#2.2
> > > > > > >
> > > > > > > > Otherwise your program if ill-formed.
> > > > > > >
> > > > > > > Not really. It's this:
> > > > > > > > 1# In certain cases (replacement functions, handler functions, operations on types used to instantiate standard library template components), the C++ standard library depends on components supplied by a C++ program. If these components do not meet their requirements, this document places no requirements on the implementation.
> > > > > > > > 2# In particular, the behavior is undefined in the following cases:
> > > > > > > > [...]
> > > > > > > > (2.3) For types used as template arguments when instantiating a template component, if the operations on the type do not implement the semantics of the applicable Requirements subclause ([allocator.requirements], [container.requirements], [iterator.requirements], [ algorithms.requirements], [numeric.requirements]). Operations on such types can report a failure by throwing an exception unless otherwise specified.
> > > > > > > > [...]
> > > > > > > > -- https://eel.is/c++draft/res.on.functions
> > > > > > >
> > > > > > > An example for `std::lexicographical_compare_three_way` would be the iterators of an object of type `std::ranges::iota_view<std::intmax_t, std::intmax_t>` (https://eel.is/c++draft/ranges#range.iota.view-1.3).
> > > > > > > > difference_type is guaranteed to be a signed integral (http://eel.is/c++draft/iterator.traits#2).
> > > > > > >
> > > > > > > Not guaranteed, but it's a requirement:
> > > > > > What exactly is the difference between "guaranteed" and "requirement"?
> > > > > > > > Otherwise your program if ill-formed.
> > > > > > >
> > > > > > > Not really. It's this:
> > > > > > > > 1# In certain cases (replacement functions, handler functions, operations on types used to instantiate standard library template components), the C++ standard library depends on components supplied by a C++ program. If these components do not meet their requirements, this document places no requirements on the implementation.
> > > > > > In other words IFNDR.
> > > > > > > An example for `std::lexicographical_compare_three_way` would be the iterators of an object of type `std::ranges::iota_view<std::intmax_t, std::intmax_t>` (https://eel.is/c++draft/ranges#range.iota.view-1.3).
> > > > > > I'm not sure what you are trying to say here.
> > > > > > In other words IFNDR.
> > > > >
> > > > > Undefined, by p2.
> > > > >
> > > > > > What exactly is the difference between "guaranteed" and "requirement"?
> > > > >
> > > > > The latter is more formal. The former does not imply that it's definitely going to be an integral type. The library gives you license to treat the difference type as an integral type by means of it being anything else being UB. However, I didn't want to confuse the PR submitter by being hand-wavy.
> > > > >
> > > > > > I'm not sure what you are trying to say here.
> > > > >
> > > > > Just illustrating an unfortunate example for which you can't call `std::lexicographical_compare_three_way` without it being UB.
> > > > thanks for those additional details!
> > > >
> > > > In the updated version, I now use `ptrdiff_t` as suggested by @philnik. Furthermore, I changed the `is_integral` check into a static_assert. Instead of exploiting undefined behavior and (potentially silently) generating, I think we should rather diagnose it
> > > (the below comments accidentally left un-sent in draft on 8/17)
> > >
> > > sorry if my comments have been incompletely informed! i didn't remember that the function can operate on heterogeneous iterator types.
> > >
> > > if we're confident that `ptrdiff_t` is always the widest signed int we can see in any situation I'm good with that; otherwise it might be needed to choose a type that's the widest of `ptrdiff_t` and the two difference types, as a safeguard. it might de-optimize when a different type is needed but it seems like a less than optimal situation in the first place, and the alternative may simply be wrong.
> > >
> > > (end late comments)
> > >
> > > great! glad we're using `auto` here now, i do believe that natural promotion should be the most reliable way to get the widest type.
> > Is that really the right thing to do, though? You used to be able to call an unconstrained algorithm with non-conforming iterators so long as all operations actually used by the algorithm were well-behaved. Has that changed? In this case, you'd also get the correct answer with iterators with a difference type that is an integer-class type so long as the actual difference doesn't exceed the max. value of `ptrdiff_t`.
> Calling algorithms with non-conforming iterators was always and will always be UB. Just fix your iterators, it's not that hard.
i actually had a thought about testing here: would it be possible to create a regression test case that is able to demonstrate this capability?
For example, if it is possible to specify that the type of `ptrdiff_t` is `int32_t`, `ranges::iota_view<int64_t>(0, 1)` should lexically compare as less than `ranges::iota_view<int64_t>(0, 0x100000000)`. If the a narrower type were used for the lengths it would determine that the latter view was the shorter one.
if i'm actually looking at it though i'm not sure there could be any good way to actually swap the definition of `ptrdiff_t`. however, we could at least safeguard against this on architectures where `ptrdiff_t` is narrower than `intmax_t`, by comparing `ranges::iota_view(0, 1)` against `ranges::iota_view<intmax_t>(0, ((intmax_t)numeric_limits<ptrdiff_t>::max) + 1)`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131395/new/
https://reviews.llvm.org/D131395
More information about the libcxx-commits
mailing list