[libcxx-commits] [PATCH] D107671: [libcxx][ranges] Add `ranges::join_view`.

Zoe Carver via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Aug 12 17:31:14 PDT 2021


zoecarver added a comment.

I have a few inline comments that I want to resolve tomorrow and I also need to fix the bots. Then I'll land this.



================
Comment at: libcxx/include/__ranges/join_view.h:41
+    requires is_reference_v<_Tp>
+  struct __inner_cache<_Tp> { };
+
----------------
I know you wanted me to do 
```
static constexpr bool _UseCache = !is_reference_v<_InnerRange>;
using _Cache = _If<_UseCache, __non_propagating_cache<remove_cv_t<_InnerRange>>, __empty_cache>;
[[no_unique_address]] _Cache __cache_;
```
which I agree would be better, but I was getting compiler errors, because `__non_propagating_cache<remove_cv_t<_InnerRange>>` is not valid when `_InnerRange` is a reference type (because `__non_propagating_cache ` requires `is_object_v`). 

There may be a creative solution to work around this. 


================
Comment at: libcxx/include/__ranges/join_view.h:76
+
+    _View __base_ = _View(); // TODO: [[no_unique_address]] makes clang crash! File a bug :)
+
----------------
I spent a bit of time trying to find a small reproducer and this is as far as I got. I'll try to file a bug tomorrow.


================
Comment at: libcxx/include/__ranges/join_view.h:195
+      for (; __outer_ != ranges::end(__parent_->__base_); ++__outer_) {
+        auto&& __inner = [&]() -> auto&& {
+          if constexpr (__ref_is_glvalue)
----------------
I cannot use the standard's suggestion here because of https://bugs.llvm.org/show_bug.cgi?id=51465


================
Comment at: libcxx/include/__ranges/join_view.h:109
+            *__outer_,
+            __parent_->__inner_.__emplace_deref(__outer_));
+          __inner_ = ranges::begin(__inner);
----------------
ldionne wrote:
> tcanens wrote:
> > ldionne wrote:
> > > tcanens wrote:
> > > > ldionne wrote:
> > > > > I don't understand why the spec allows modifying a member of the parent `join_view` from the iterator. That's very sneaky and can lead to e.g. race conditions if someone does:
> > > > > 
> > > > > ```
> > > > > join_view v = ...;
> > > > > auto it1 = v.begin();
> > > > > auto it2 = v.begin();
> > > > > 
> > > > > std::thread([=]{
> > > > >   ++it1; // calls satisfy() under the hood, which modifies `v`
> > > > > });
> > > > > 
> > > > > std::thread([=]{
> > > > >   ++it2; // calls satisfy() under the hood, which modifies `v`
> > > > > });
> > > > > 
> > > > > // sneaky race condition since you had a copy of different iterators in each thread
> > > > > ```
> > > > > 
> > > > > Basically, I think it's quite vexing that incrementing an iterator modifies the underlying view in this case, and I don't fully understand why that is even required. What use case does that enable? Is the criteria `is_reference_v<range_reference_t<V>>` meant to mean something like "the outer range creates its inner ranges on-the-fly and they are temporary objects"? I'm having trouble following here.
> > > > > 
> > > > > CC @tcanens in case you can share some insight here.
> > > > It's an input range for this case, so everything from the second `begin()` onwards is undefined.
> > > > 
> > > > Yes, this is the case where the inner ranges are created on the fly and we need to store the range //somewhere// while it is being iterated over, and the only reasonable place to store it is the view.
> > > Is there a reason why we don't use the criteria `input_range<V> && !forward_range<V>` to denote "it's only an input range and I can only do a single-pass on it" instead of `!is_reference_v<range_reference_t<V>>`?
> > > 
> > > Also, if everything from the second `begin()` onwards is undefined, then we can only ever have one iterator to that range. So would it be valid to store the temporary inner range in the iterator itself instead of storing it in the parent `join_view`?
> > `V`, the underlying range, can be anything up to random access (for instance, `iota(0, 42) | transform(to_string)`), but if it is a range of prvalue ranges, then `join_view<V>` can only ever be input.
> > 
> > It's not valid to store the inner range in the iterator itself without an indirection. Iterators are movable, `join_view`'s iterator must store an iterator into the inner range that is currently being iterated over, but moving a range can invalidate iterators into it.
> > 
> > IIRC Eric once said something along the lines of "views are lazy algorithms, and some algorithms have state". This is one of those stateful algorithms.
> Okay, I think I understand now.
> 
> @zoecarver You need to restore caching, otherwise this kind of code won't work since you'll be re-evaluating `*outer` every time you increment the inner iterator:
> 
> ```
> std::set<int> already_called;
> 
> auto rng = std::views::iota(0, 42) | std::views::transform([&](int i) -> std::string {
>     assert(!already_called.contains(i));
>     already_called.insert(i);
>     return std::to_string(i);
> });
> 
> for (char c : std::views::join(rng)) {
>     // fails without caching cause you re-evaluate the lambda every time
> }
> ```
> 
> Please add tests for that.
> 
> Also, please add tests with a view that invalidates its iterators upon move (it could simply make a copy and hence reallocate the underlying storage to another place). If you cached `*outer` inside the iterator (instead of in the parent view), your test should fail when you move the `join_view::iterator`, because that'll move the cached value of `*outer` (and we just said that invalidates iterators into it). IIUC that's what Tim is saying above.
> 
> Also, please add tests with a view that invalidates its iterators upon move (it could simply make a copy and hence reallocate the underlying storage to another place).

Done. I made `ValueView`'s move constructor invalidate the moved-from iterator. This should catches both issues.

(Side note: I made the buffer in increment.pass.cpp into two buffers, because while debugging this, I found that we will just read past end, so it *looks* like there isn't a problem.)


================
Comment at: libcxx/test/std/ranges/range.adaptors/range.join.view/end.pass.cpp:1
+//===----------------------------------------------------------------------===//
+//
----------------
ldionne wrote:
> zoecarver wrote:
> > ldionne wrote:
> > > This is a general comment not only applicable to this test. We seem to be missing coverage for a lot of range "topologies" (what I mean by that is an empty range, a range with 1 subrange, 2 subranges, an empty subrange, subranges of empty sizes, etc..).
> > > 
> > > I don't think it is reasonable to replicate each test for all topologies, but we should add tests with something like a random access range (const and non-const) checking all topologies. What I mean here is that I don't care if you don't test a range like `[[x], [], [y, z]]` with all combinations of archetypes, but you should test it at least with a few that are relevant (that probably is restricted to const and non-const).
> > Added four tests: children all have different sizes (0-4), parent is empty, parent size is one, parent and child size is one. Let me know what you think.
> We also want tests like:
> - Only one subrange and it is empty
> - All subranges are empty
> - First subrange is empty, other are not
> - Only last subrange is empty
> 
> I think it would be reasonable to test those in a fashion similar to:
> 
> ```
> auto rng = std::vector{std::vector{...}, ..., std::vector{...}} | std::views::join;
> std::vector<T> joined;
> for (auto x : rng) // since we don't have ranges::copy yet
>   joined.push_back(x);
> assert(joined == std::vector{...});
> ```
I'm just using the same method as the other tests. If you want vector for some reason, let me know, but I'd rather not pull that in here. 


================
Comment at: libcxx/test/std/ranges/range.adaptors/range.join.view/iterator/increment.pass.cpp:28
+  (void) dummy;
+  int buffer2[2][4] = {{9, 10, 11, 12}, {13, 14, 15, 16}};
+
----------------
You have no idea how fun it was to debug this :P


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107671/new/

https://reviews.llvm.org/D107671



More information about the libcxx-commits mailing list