[libcxx-commits] [PATCH] D112152: [libc++] Unroll loop in std::find_if and define other std::find variants in terms of it

Thu Oct 21 09:16:44 PDT 2021

ldionne added a comment.

Re: all

Thanks a lot for chiming in. Morally, I share everybody's views here -- this is a compiler issue and we shouldn't be working around it in the library. This was reported to me and @fhahn asked me to look into whether it might be possible to implement this in the library. So I did, but I agree I'd rather not check this in if the compiler can be fixed instead. I'll let Florian explain what a compiler fix would involve.

================
Comment at: libcxx/include/__algorithm/find.h:1
 // -*- C++ -*-
 //===----------------------------------------------------------------------===//
----------------
Quuxplusone wrote:
> High-level motivation question: This performance difference that you see, is it //only// at `-O0`? I would hope that the compiler does the obvious loop unrolling at `-O2` at least. I'd also be interested in seeing a Godbolt link of libc++ versus libstdc++, and perhaps even some new benchmark as part of the PR.
This is the godbolt: https://clang.godbolt.org/z/Mccs1v7Tv

================
Comment at: libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find_if.pass.cpp:62

+int main(int, char**) {
+  test();
----------------
Mordante wrote:
> Quuxplusone wrote:
> > Mordante wrote:
> > > Since the algorithm is no longer straight forward I think it would be good to validate the number of times the predicate is evaluated. Just to make sure we didn't accidentally increase the number of evaluations.
> > +1, I think your current implementation has 4 instances of `break` that should be `return __first`, and this causes 1 extra evaluation of `__pred` in most cases.
> I too wondered about why not a `return` instead of a `break`. Initially I also expected one extra evaluation, but upon further studying the algorithm I _think_ it's correct. `switch (__last - __first)` should be larger than 3 when the `for` loop exits in one of the `break`s.
Yeah, I could `return` instead of `break` -- but I'm pretty sure the current algorithm is correct. We'll find out when I add tests counting the number of times the predicate is called -- that's a good suggestion whatever we end up doing.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112152/new/

https://reviews.llvm.org/D112152