[libcxx-commits] [PATCH] D124516: [libc++] Implement `std::expected` P0323R12

Eric Fiselier via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sun Nov 6 15:11:12 PST 2022


EricWF added a comment.

In D124516#3903536 <https://reviews.llvm.org/D124516#3903536>, @huixie90 wrote:

> In D124516#3900680 <https://reviews.llvm.org/D124516#3900680>, @EricWF wrote:
>
>> 
>
>
>
>> Another improvement that can be made is reducing the number of instantiations we perform in a lot of the metaprogramming. `_And`, `_Not`, and `_Or` should only be used when short circuiting is needed. In most cases here it isn't. Every single instantiation performed in expected is carried around by the compiler and in the AST for the entire translation unit. That's costly, lets avoid that cost.
>
> I've removed short circuiting for the `static_assert` cases. but most meta programming in this patch can benefit from short-circuiting as they are used in the constructors' constraints. (the most complicated one in this patch is the `__can_convert` @ldionne had some comments explaining why they are needed. (in the early revisions of this patch, there was no short-circuiting)
>
> That being said, I wrote sine simple tests that benchmark the compilation times.
>
> The first one is the case where short circuiting happens
>
>   // The 3rd requirement of `__can_convert`
>   // _Not<is_constructible<_Tp, expected<_Up, _Gp>&>>
>   // fails and short circuits
>   template <std::size_t N>
>   struct Foo {
>     Foo() {}
>   
>     template <std::size_t M>
>     Foo(Foo<M>) {}
>   
>     template <std::size_t M, class E>
>     Foo(std::expected<Foo<M>, E>) {}
>   };
>   
>   template <std::size_t N>
>   void callCtor() {
>     std::expected<Foo<N>, int> e1;
>     [[maybe_unused]] std::expected<Foo<N + 1>, int> e2(e1);
>   }
>   
>   template <std::size_t... Ns>
>   void generateTest(std::index_sequence<Ns...>) {
>     (callCtor<Ns>(), ...);
>   }
>   
>   int main() { generateTest(std::make_index_sequence<1000>{}); }
>
> - With the `_And`/`_Or` version, it took *8.9s* to compile.
> - With the `_BoolConstant<is_xxx_v && ...>` version, it took *10.0s* to compile
>
> So we see some benefits of short circuiting (out weighs the instantiation cost)
>
> Then I have another test case where all boolean requirements pass so no short circuiting happening
>
>   // All requirements of `__can_convert` pass
>   // short circuiting has no benefit
>   template <std::size_t N>
>   struct Foo {
>     Foo() {}
>   
>     template <std::size_t M>
>     Foo(Foo<M>) {}
>   };
>   
>   template <std::size_t N>
>   void callCtor() {
>     std::expected<Foo<N>, int> e1;
>     [[maybe_unused]] std::expected<Foo<N + 1>, int> e2(e1);
>   }
>   
>   template <std::size_t... Ns>
>   void generateTest(std::index_sequence<Ns...>) {
>     (callCtor<Ns>(), ...);
>   }
>   
>   int main() { generateTest(std::make_index_sequence<1000>{}); }
>
> - With the `_And`/`_Or` version, it took *8.5s* to compile.
> - With the `_BoolConstant<is_xxx_v && ...>` version, it took *7.3s* to compile
>
> So we know that short circuiting doesn't help because all booleans are `true`. And the instantiation costs make it slower.
>
> The first case we gain 1.1s and the second case we lost 1.2s. So I am on the boarder line of whether or not to do short circuiting on these constructor constraints. I think in the real case, it is likely that those `boolean`s are all `true` because if the user calls these converting constructors, it is likely to succeed instead of falling back to the constructor that takes `U&&`. That means in real user code, the short circuiting might never happen.

Thank you for looking into this so thoroughly. I should have better specified that my concerns were with meta-programming in `static_assert`s and other places where all of the conditions were likely to get evaluated. For constraints, I think the opposite is true, and lazy metaprogramming is the way to go.



================
Comment at: libcxx/include/__expected/expected.h:75-89
+  static constexpr bool __can_convert =
+      is_constructible_v<_Tp, _Uf> &&                                     //
+      is_constructible_v<_Err, _Gf> &&                                    //
+      !is_constructible_v<_Tp, expected<_Up, _Gp>&> &&                    //
+      !is_constructible_v<_Tp, expected<_Up, _Gp>> &&                     //
+      !is_constructible_v<_Tp, const expected<_Up, _Gp>&> &&              //
+      !is_constructible_v<_Tp, const expected<_Up, _Gp>> &&               //
----------------
huixie90 wrote:
> huixie90 wrote:
> > huixie90 wrote:
> > > EricWF wrote:
> > > > ldionne wrote:
> > > > > huixie90 wrote:
> > > > > > philnik wrote:
> > > > > > > ldionne wrote:
> > > > > > > > That will evaluate these `is_constructible<...>` expressions in a short-circuiting fashion.
> > > > > > > What exactly is the benefit of short-circuiting here? Is it required to satisfy the spec? If not, are you sure it's worth instantiating types and short-circuiting instead of using variables? I think instantiating types takes quite a bit longer, but I might be wrong on that.
> > > > > > This is a good question. `is_xxx_v<T>` is no longer `is_xxx<T>::value` now. It is directly calling the compiler builtin. @ldionne , what do you think?
> > > > > > 
> > > > > > Btw, I did a simple test with clang to compare the compile times between different approaches
> > > > > > 
> > > > > > Program 1
> > > > > > ```
> > > > > > #include <type_traits>
> > > > > > #include <utility>
> > > > > > 
> > > > > > constexpr std::size_t template_length = 30000;
> > > > > > template <std::size_t N> struct Foo {
> > > > > >   constexpr Foo(int)
> > > > > >     requires(N < template_length / 2)
> > > > > >   {}
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t... Ns> void test(std::index_sequence<Ns...>) {
> > > > > >   static_assert((std::is_constructible_v<Foo<Ns>, int> && ...));
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t N> void generateTest() {
> > > > > >   test(std::make_index_sequence<N>{});
> > > > > > }
> > > > > > 
> > > > > > int main() { generateTest<template_length>(); }
> > > > > > 
> > > > > > ```
> > > > > > `time clang++ a.cpp -std=c++2b -stdlib=libc++ -fbracket-depth=50000` takes about `2.00secs`
> > > > > > 
> > > > > > Program 2
> > > > > > ```
> > > > > > #include <type_traits>
> > > > > > #include <utility>
> > > > > > 
> > > > > > constexpr std::size_t template_length = 30000;
> > > > > > template <std::size_t N> struct Foo {
> > > > > >   constexpr Foo(int)
> > > > > >     requires(N < template_length / 2)
> > > > > >   {}
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t... Ns> void test(std::index_sequence<Ns...>) {
> > > > > >   static_assert((std::is_constructible<Foo<Ns>, int>::value && ...));
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t N> void generateTest() {
> > > > > >   test(std::make_index_sequence<N>{});
> > > > > > }
> > > > > > 
> > > > > > int main() { generateTest<template_length>(); }
> > > > > > 
> > > > > > ```
> > > > > > `time clang++ b.cpp -std=c++2b -stdlib=libc++ -fbracket-depth=50000` takes about `2.04secs`
> > > > > > 
> > > > > > Program 3
> > > > > > ```
> > > > > > #include <type_traits>
> > > > > > #include <utility>
> > > > > > 
> > > > > > constexpr std::size_t template_length = 30000;
> > > > > > template <std::size_t N> struct Foo {
> > > > > >   constexpr Foo(int)
> > > > > >     requires(N < template_length / 2)
> > > > > >   {}
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t... Ns> void test(std::index_sequence<Ns...>) {
> > > > > >   static_assert((std::_And<std::is_constructible<Foo<Ns>, int>...>::value));
> > > > > > };
> > > > > > 
> > > > > > template <std::size_t N> void generateTest() {
> > > > > >   test(std::make_index_sequence<N>{});
> > > > > > }
> > > > > > 
> > > > > > int main() { generateTest<template_length>(); }
> > > > > > 
> > > > > > ```
> > > > > > `time clang++ c.cpp -std=c++2b -stdlib=libc++ -fbracket-depth=50000` takes about `1.44 secs`
> > > > > > 
> > > > > > @philnik So for this simple program, short circuiting (at the half position) is helpful
> > > > > > What exactly is the benefit of short-circuiting here?
> > > > > 
> > > > > The benefit is that we avoid doing work (defined after) if we can. The work I'm thinking about is: instantiating types and/or aliases, variable templates, etc. And that also includes calling the compiler builtins, because those are not free. Sure, that's much cheaper than implementing the same logic using pure C++, however builtins still require the compiler to go perform some non-trivial logic, and that has a cost. For example, under the hood, a builtin like `__is_constructible` still has to go and check whether `T(args...)` is well-formed, and that requires doing overload resolution on `T`'s constructor, potentially SFINAEing, etc. I actually don't *know* that that's how it's implemented, but from the outside it has to in order to give the right answer.
> > > > > 
> > > > > So, TLDR: builtins are much cheaper to evaluate than the equivalent pure C++ functionality, but they still are not trivial, and that's what we save by short-circuiting. In comparison, short-circuiting requires us to implement a small constant number of additional templates (like `_Lazy`), whereas calling a single `__is_constructible` builtin could cause any number of types and functions to be instantiated.
> > > > I agree. The short circuiting in the static assert doesn't provide value.
> > > > 
> > > > Either it evaluates to the end, or it fails early and the program stops compiling.
> > > > 
> > > > So many unnecessary instantiations.
> > > > I agree. The short circuiting in the static assert doesn't provide value.
> > > > 
> > > > Either it evaluates to the end, or it fails early and the program stops compiling.
> > > > 
> > > > So many unnecessary instantiations.
> > > 
> > > 
> > > I agree. The short circuiting in the static assert doesn't provide value.
> > > 
> > > Either it evaluates to the end, or it fails early and the program stops compiling.
> > > 
> > > So many unnecessary instantiations.
> > 
> > 
> > I agree. The short circuiting in the static assert doesn't provide value.
> > 
> > Either it evaluates to the end, or it fails early and the program stops compiling.
> > 
> > So many unnecessary instantiations.
> 
> This particular thread is not about `static_assert`. Sorry for the confusion as the code has been rearranged since the beginning of the thread. This thread was about the `__can_convert`, which has lots of boolean conditions with `&&` and `||`. It is not used in `static_assert`, but used in lots of constructors' constraints.
> I do believe shorting circuiting is worth it (see Louis's comment above)
Sorry, it looked like it was attached to the static assert on my end.

For constraints I'm all for lazy evaluation (and think it may even be needed).
But for anything that's going to get evaluated fully anyway, then just do it eagerly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124516/new/

https://reviews.llvm.org/D124516



More information about the libcxx-commits mailing list