[libcxx-commits] [PATCH] D107671: [libcxx][ranges] Add `ranges::join_view`.

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Aug 12 12:05:44 PDT 2021


ldionne accepted this revision.
ldionne added a comment.
This revision is now accepted and ready to land.

LGTM with my comments applied and caching restored.



================
Comment at: libcxx/include/__ranges/join_view.h:109
+            *__outer_,
+            __parent_->__inner_.__emplace_deref(__outer_));
+          __inner_ = ranges::begin(__inner);
----------------
tcanens wrote:
> ldionne wrote:
> > tcanens wrote:
> > > ldionne wrote:
> > > > I don't understand why the spec allows modifying a member of the parent `join_view` from the iterator. That's very sneaky and can lead to e.g. race conditions if someone does:
> > > > 
> > > > ```
> > > > join_view v = ...;
> > > > auto it1 = v.begin();
> > > > auto it2 = v.begin();
> > > > 
> > > > std::thread([=]{
> > > >   ++it1; // calls satisfy() under the hood, which modifies `v`
> > > > });
> > > > 
> > > > std::thread([=]{
> > > >   ++it2; // calls satisfy() under the hood, which modifies `v`
> > > > });
> > > > 
> > > > // sneaky race condition since you had a copy of different iterators in each thread
> > > > ```
> > > > 
> > > > Basically, I think it's quite vexing that incrementing an iterator modifies the underlying view in this case, and I don't fully understand why that is even required. What use case does that enable? Is the criteria `is_reference_v<range_reference_t<V>>` meant to mean something like "the outer range creates its inner ranges on-the-fly and they are temporary objects"? I'm having trouble following here.
> > > > 
> > > > CC @tcanens in case you can share some insight here.
> > > It's an input range for this case, so everything from the second `begin()` onwards is undefined.
> > > 
> > > Yes, this is the case where the inner ranges are created on the fly and we need to store the range //somewhere// while it is being iterated over, and the only reasonable place to store it is the view.
> > Is there a reason why we don't use the criteria `input_range<V> && !forward_range<V>` to denote "it's only an input range and I can only do a single-pass on it" instead of `!is_reference_v<range_reference_t<V>>`?
> > 
> > Also, if everything from the second `begin()` onwards is undefined, then we can only ever have one iterator to that range. So would it be valid to store the temporary inner range in the iterator itself instead of storing it in the parent `join_view`?
> `V`, the underlying range, can be anything up to random access (for instance, `iota(0, 42) | transform(to_string)`), but if it is a range of prvalue ranges, then `join_view<V>` can only ever be input.
> 
> It's not valid to store the inner range in the iterator itself without an indirection. Iterators are movable, `join_view`'s iterator must store an iterator into the inner range that is currently being iterated over, but moving a range can invalidate iterators into it.
> 
> IIRC Eric once said something along the lines of "views are lazy algorithms, and some algorithms have state". This is one of those stateful algorithms.
Okay, I think I understand now.

@zoecarver You need to restore caching, otherwise this kind of code won't work since you'll be re-evaluating `*outer` every time you increment the inner iterator:

```
std::set<int> already_called;

auto rng = std::views::iota(0, 42) | std::views::transform([&](int i) -> std::string {
    assert(!already_called.contains(i));
    already_called.insert(i);
    return std::to_string(i);
});

for (char c : std::views::join(rng)) {
    // fails without caching cause you re-evaluate the lambda every time
}
```

Please add tests for that.

Also, please add tests with a view that invalidates its iterators upon move (it could simply make a copy and hence reallocate the underlying storage to another place). If you cached `*outer` inside the iterator (instead of in the parent view), your test should fail when you move the `join_view::iterator`, because that'll move the cached value of `*outer` (and we just said that invalidates iterators into it). IIUC that's what Tim is saying above.



================
Comment at: libcxx/include/__ranges/join_view.h:340
+  template<class _Range>
+  explicit join_view(_Range&&) -> join_view<views::all_t<_Range>>;
+
----------------
Quuxplusone wrote:
> zoecarver wrote:
> > ldionne wrote:
> > > Please make sure you test the explicit-ness of this deduction guide with something like that:
> > > 
> > > ```
> > > join_view v = some_range; // should not compile
> > > join_view v(some_range); // should compile
> > > ```
> > I can't seem to figure out how to test this. It seems like both compile: https://godbolt.org/z/qefr1vxqz
> > 
> > I'll think some more and see if I can come up with something. 
> @zoecarver: Remember that SFINAE works only on immediate contexts. `requires(T r) { invokesImplicitCTAD(r); }` is invariably true, regardless of what the //body// of `invokesImplicitCTAD` looks like.
> I recall you had some macro in one of these recent reviews that was also trying to SFINAE on the body of a function (return-type-SFINAEing on the decltype of a lambda whose body contained the interesting part); again, that doesn't work. SFINAE only looks at the well-formedness of the exact expression being evaluated, not at the-bodies-of-all-the-things-that-expression-might-call.
> 
> The other problem with your example is that `X(T)` creates an implicit deduction guide. Change it to `X(int)` or `X(std::type_identity_t<T>)` to fix that part.
> 
> I don't think there's any way to test CTAD-ability with SFINAE. With a literal type, you can try something like this https://godbolt.org/z/4GbdT87Ts but right now this works //only// on Clang (not GCC nor MSVC), and it certainly doesn't help with `join_view` specifically. So the "should not compile" test would have to be a `.fail.cpp`, unless I'm mistaken (in which case I smell a blog post in the offing... ;))
You can use a `.verify.pass.cpp` test, which can contain statements.


================
Comment at: libcxx/test/std/ranges/range.adaptors/range.join.view/end.pass.cpp:1
+//===----------------------------------------------------------------------===//
+//
----------------
zoecarver wrote:
> ldionne wrote:
> > This is a general comment not only applicable to this test. We seem to be missing coverage for a lot of range "topologies" (what I mean by that is an empty range, a range with 1 subrange, 2 subranges, an empty subrange, subranges of empty sizes, etc..).
> > 
> > I don't think it is reasonable to replicate each test for all topologies, but we should add tests with something like a random access range (const and non-const) checking all topologies. What I mean here is that I don't care if you don't test a range like `[[x], [], [y, z]]` with all combinations of archetypes, but you should test it at least with a few that are relevant (that probably is restricted to const and non-const).
> Added four tests: children all have different sizes (0-4), parent is empty, parent size is one, parent and child size is one. Let me know what you think.
We also want tests like:
- Only one subrange and it is empty
- All subranges are empty
- First subrange is empty, other are not
- Only last subrange is empty

I think it would be reasonable to test those in a fashion similar to:

```
auto rng = std::vector{std::vector{...}, ..., std::vector{...}} | std::views::join;
std::vector<T> joined;
for (auto x : rng) // since we don't have ranges::copy yet
  joined.push_back(x);
assert(joined == std::vector{...});
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107671/new/

https://reviews.llvm.org/D107671



More information about the libcxx-commits mailing list