[libcxx-commits] [PATCH] D117240: [libc++] Fix bug in ranges::advance and refactor the tests

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Jan 26 13:08:44 PST 2022


ldionne accepted this revision as: libc++.
ldionne added a comment.
This revision is now accepted and ready to land.

Will ship once CI passes



================
Comment at: libcxx/include/__iterator/advance.h:171
       // If |n| >= |bound - i|, equivalent to `ranges::advance(i, bound)`.
       if (const auto __M = __bound - __i; __magnitude_geq(__n, __M)) {
         (*this)(__i, __bound);
----------------
jwakely wrote:
> Quuxplusone wrote:
> > ldionne wrote:
> > > CaseyCarter wrote:
> > > > Quuxplusone wrote:
> > > > > I would prefer to inline it at its one call site, here, rather than generalize it unnecessarily.
> > > > > 
> > > > > Unfortunately https://eel.is/c++draft/range.iter.op.advance#6.1.1 seems to mandate that even when `__n == 0`, we need to evaluate `__bound - __i` and then, if that's also zero, we must `__i = __bound` instead of simply returning immediately with `0`. Is there any blanket-wording loophole that gets us out of that obligation? This code could get much simpler if we were allowed to put `if (__n == 0) return 0;` on line 168.
> > > > > 
> > > > > Also, notice that if `!(bidirectional_iterator<_Ip> && same_as<_Ip, _Sp>)`, then comparing `__n <= __M` is going to be superfluous — we know `__n` can't be negative unless `(bidirectional_iterator<_Ip> && same_as<_Ip, _Sp>)`, so, if `!(bidirectional_iterator<_Ip> && same_as<_Ip, _Sp>)`, it would be really nice not to generate any test for that possibility.
> > > > > 
> > > > > https://godbolt.org/z/jE66PEP4E — I haven't really looked at this code but it sure //looks// like libc++ is doing something suboptimal right now, doesn't it?
> > > > > Unfortunately https://eel.is/c++draft/range.iter.op.advance#6.1.1 seems to mandate that even when `__n == 0`, we need to evaluate `__bound - __i` and then, if that's also zero, we must `__i = __bound` instead of simply returning immediately with `0`. Is there any blanket-wording loophole that gets us out of that obligation? This code could get much simpler if we were allowed to put `if (__n == 0) return 0;` on line 168.
> > > > 
> > > > It is unfortunate that the wording requires the difference to be performed when `n == 0`, and the assignment to be performed even when `n == 0` and `last - first == 0`, which implies `first == last` already. MSVC does compute the distance (I think `n == 0` is a weird enough corner not to be worth the cost of checking) but not the assignment when `n == 0`.  I suggest you implement what you think is most performant, and file an LWG issue to allow your implementation and ours.
> > > > 
> > > > I would prefer to inline it at its one call site, here, rather than generalize it unnecessarily.
> > > 
> > > After looking at the code generation and seeing that we are indeed paying for the generality, I agree that we should special case it: https://godbolt.org/z/Ez6W9Yfej. I'll still use a lambda just to avoid having a super complicated inlined `if` condition. I tried various implementations and yours was basically the most efficient and correct one. Even using `if-else` instead of a ternary somehow produces slightly worse code.
> > > 
> > > > [about evaluating `__bound - __i` when `__n == 0`]
> > > 
> > > I still think it's better not to generate a special check for `__n == 0`, since that should be fairly rare (as Casey mentioned).
> > > 
> > > > https://godbolt.org/z/jE66PEP4E — I haven't really looked at this code but it sure looks like libc++ is doing something suboptimal right now, doesn't it?
> > > 
> > > I looked, and I think it boils down to libstdc++ not protecting against signed integer overflow like what we fixed back in D106735. @jwakely , you might be interested in taking a look, here's a distilled version of libc++ vs libstdc++: https://godbolt.org/z/aYbq1nxhf
> > > 
> > I continue to see no reason to name `__magnitude_geq`, nor to define `__M` as `const` ( https://quuxplusone.github.io/blog/2022/01/23/dont-const-all-the-things/ ), nor to define `__M` using C++20 if-syntax instead of making it an ordinary declaration on its own line. But whatevs.
> > Vice versa, I'd prefer to see lines 154-155 either done as four lines, or done as one ternary. It's basically a (Fortran ;)) three-way arithmetic `IF`: if `a<0` do one thing, else if `a==0` do another, else if `a>0` do a third.
> > 
> > It occurs to me that (unless I screwed up the math) it can also be done branchless:
> > ```
> >     auto __M = __bound - __i;
> >     if (((__n < 0) | (__n >= __M)) & ((__n > 0) | (__n <= __M))) {
> >         __i = std::move(__bound);
> >         return __n - __M;
> >     }
> > 
> >     __i += __n;
> >     return 0;
> > ```
> > but this does not seem like better codegen on GCC, and not a clear-cut winner on Clang either, so I'd go with what you've got.
> >  I suggest you implement what you think is most performant, and file an LWG issue to allow your implementation and ours.
> 
> Yes, please do. I don't care about `n==0` but doing a pointless assignment for `bound - i == 0` is pointless.
> 
> >  @jwakely , you might be interested in taking a look
> 
> Thanks, I'd missed the reflector thread about it last July.
@Quuxplusone 

Ah yes, I meant to use a single ternary, that's the best codegen of the two. I'm not going to do it branchless though, it's too obscure.

Regarding using `__magnitude_geq`: it allows me to document what this long condition is doing and arguably makes the code easier to read.


================
Comment at: libcxx/test/std/iterators/iterator.primitives/range.iter.ops/range.iter.ops.advance/iterator_count_sentinel.pass.cpp:154
     }
   }
 
----------------
Quuxplusone wrote:
> ldionne wrote:
> > Quuxplusone wrote:
> > > This is obviously better than it was. But I'd love to see some simple regression tests like we were doing on Discord:
> > > ```
> > > int a[10];
> > > int *p;
> > > const int *pc;
> > > using ConstPtr = const int*;
> > > 
> > > // forward with same type
> > > p = a+5; assert(std::ranges::advance(p, 0, a+7) == 0); assert(p == a+5);
> > > p = a+5; assert(std::ranges::advance(p, 1, a+7) == 0); assert(p == a+6);
> > > p = a+5; assert(std::ranges::advance(p, 2, a+7) == 0); assert(p == a+7);
> > > p = a+5; assert(std::ranges::advance(p, 3, a+7) == 1); assert(p == a+7);
> > > 
> > > // forward with different, non-assignable sentinel
> > > p = a+5; assert(std::ranges::advance(p, 0, ConstPtr(a+7)) == 0); assert(p == a+5);
> > > p = a+5; assert(std::ranges::advance(p, 1, ConstPtr(a+7)) == 0); assert(p == a+6);
> > > p = a+5; assert(std::ranges::advance(p, 2, ConstPtr(a+7)) == 0); assert(p == a+7);
> > > p = a+5; assert(std::ranges::advance(p, 3, ConstPtr(a+7)) == 1); assert(p == a+7);
> > > 
> > > // forward with different, assignable sentinel
> > > pc = a+5; assert(std::ranges::advance(pc, 0, a+7) == 0); assert(pc == a+5);
> > > pc = a+5; assert(std::ranges::advance(pc, 1, a+7) == 0); assert(pc == a+6);
> > > pc = a+5; assert(std::ranges::advance(pc, 2, a+7) == 0); assert(pc == a+7);
> > > pc = a+5; assert(std::ranges::advance(pc, 3, a+7) == 1); assert(pc == a+7);
> > > 
> > > // backward with same type
> > > p = a+5; assert(std::ranges::advance(p, 0, a+3) == 0); assert(p == a+5);
> > > p = a+5; assert(std::ranges::advance(p, -1, a+3) == 0); assert(p == a+4);
> > > p = a+5; assert(std::ranges::advance(p, -2, a+3) == 0); assert(p == a+3);
> > > p = a+5; assert(std::ranges::advance(p, -3, a+3) == -1); assert(p == a+3);
> > > 
> > > // backward with different type is UB so don't test it
> > > ```
> > > This covers all the cases, I think. The only other thing that needs testing is whether we assign from Sent to It in all the right places (and not in the wrong places); that'll require a sentinel type with a user-defined `operator=` so we can see that it's being called.
> > > The only other thing that needs testing is whether we assign from Sent to It in all the right places (and not in the wrong places); that'll require a sentinel type with a user-defined `operator=` so we can see that it's being called.
> > 
> > Actually, in order to test that, I would need to define a whole random access iterator with an `operator=` that takes a sentinel, and count the assignments from the iterator's `operator=`. Otherwise, to only define a sentinel, I have to fall back to some trick like the conversion operator I've used elsewhere, but it wouldn't be clean to count the number of times the sentinel gets converted to the iterator type as a proxy for counting the number of assignments. In other words, defining only the sentinel as a shortcut would be brittle IMO.
> > 
> > Given that the intent is for the Standard not to force us to evaluate those anyway, and given that we are already testing this "from the side" by counting operations with `stride_counting_iterator` to ensure that we don't increment when we should be assigning, I think there is diminishing returns in adding this test. Let me know if you agree -- if not, I'll add it.
> > 
> I agree that the returns aren't worth it. I didn't test an instrumented `operator-` in D117940 either.
> 
> But please compare against my tests in D117940 and see if they trigger anything you want to add over here too.
> Since `ranges::distance` became an LWG-worthy rabbit hole, it occurs to me that you should also test
> ```
> // backward with array as sentinel
> p = a+5; assert(std::ranges::advance(p, 0, a) == 0); assert(p == a+5);
> p = a+5; assert(std::ranges::advance(p, -1, a) == 0); assert(p == a+4);
> p = a+5; assert(std::ranges::advance(p, -5, a) == 0); assert(p == a);
> p = a+5; assert(std::ranges::advance(p, -6, a) == -1); assert(p == a);
> ```
> I expect this to be fine today (because the sentinel is taken by value and that //should// never ever change), but it'll protect against its getting broken tomorrow (when LWG decides to change the thing that shouldn't change ;)).
I've added the "basic" tests you suggested here.

It looks like we could use SFINAE tests in all of `prev`, `next` and the various `advance` overloads, but it doesn't belong in this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117240/new/

https://reviews.llvm.org/D117240



More information about the libcxx-commits mailing list