[PATCH] D48342: [libcxx] Optimize vectors construction of trivial types from an iterator range with const-ness mismatch.

Arthur O'Dwyer via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jul 3 21:03:26 PDT 2018


Quuxplusone added inline comments.


================
Comment at: libcxx/include/memory:1479
+struct __has_construct_missing
+    : false_type
+{
----------------
vsapsai wrote:
> Quuxplusone wrote:
> > vsapsai wrote:
> > > vsapsai wrote:
> > > > vsapsai wrote:
> > > > > erik.pilkington wrote:
> > > > > > vsapsai wrote:
> > > > > > > erik.pilkington wrote:
> > > > > > > > Shouldn't this be true_type?
> > > > > > > I see this is confusing and I'm still struggling how to express it. The issue is that in C++03 `__has_construct` should be something like unknown, so that neither `__has_construct` nor `! __has_construct` evaluate to true because we don't really know if allocator has construct. This case is covered by the added test, in C++03 the memcpy specialization was enabled when
> > > > > > > 
> > > > > > > ```
> > > > > > >     is_same<allocator_type, allocator<_Tp> >
> > > > > > >       || !false_type
> > > > > > > ```
> > > > > > > 
> > > > > > > So `is_same` check had no effect and we were using memcpy to convert between int and float.
> > > > > > > 
> > > > > > > I was considering using something like
> > > > > > > 
> > > > > > > ```lang=c++
> > > > > > >         typename enable_if
> > > > > > >         <
> > > > > > >             (is_same
> > > > > > >              <
> > > > > > >                 typename _VSTD::remove_const<typename allocator_type::value_type>::type,
> > > > > > >                 typename _VSTD::remove_const<_SourceTp>::type
> > > > > > >              >::value
> > > > > > > #ifndef _LIBCPP_CXX03_LANG
> > > > > > >                 || !__has_construct<allocator_type, _DestTp*, _SourceTp>::value
> > > > > > > #endif
> > > > > > >             ) &&
> > > > > > >              is_trivially_move_constructible<_DestTp>::value,
> > > > > > >             void
> > > > > > >         >::type
> > > > > > > ```
> > > > > > > 
> > > > > > > But that doesn't look readable to me, so I've introduced ad-hoc ternary logic with `__has_construct_missing`.
> > > > > > Oh I see, yikes! That's a pretty bad bug. I agree that this fix is best then, but can you add a comment explaining this to `__has_construct_missing` for future casual readers? Also, I think we should move the __has_construct_missing bugfix into a different (prerequisite) patch. Seems unrelated to the `const` related optimization below.
> > > > > The bug as I described isn't really present now because function signature
> > > > > 
> > > > >     __construct_range_forward(allocator_type&, _Tp* __begin1, _Tp* __end1, _Tp*& __begin2)
> > > > > 
> > > > > works as implicit `is_same` for `__begin1` and `__begin2` types. I think it is worth fixing separately and there is a bug with C++03 and custom allocators.
> > > > Instead of `__has_construct_missing` I've implemented real `__has_construct` in D48753. But it is stricter in C++03 than in C++11 and later. So it made me think that absence of `construct` with exact signature isn't a good reason to use memcpy.
> > > I was wrong. Now I think the logic for using memcpy should be
> > > 
> > >     if types are the same modulo constness
> > >     and
> > >     (
> > >         using default allocator
> > >         or
> > >         using custom allocator without `construct`
> > >     )
> > >     and
> > >     is_trivially_move_constructible
> > > 
> > > The purpose of the allocator check is to cover cases when `static construct` would end up calling not user's code but libc++ code that we know can be replaced with memcpy.
> > I'd like to request the spelling `__has_trivial_construct_and_destroy<A, T, T&&>` as described here: https://www.youtube.com/watch?v=MWBfmmg8-Yo&t=21m45s
> > I have an implementation in my fork that might be useful for comparison, although even it doesn't add that last `T&&` parameter.
> > https://github.com/Quuxplusone/libcxx/commit/34eb0b5c8f03880b94d53b877cbca384783ad65a
> > 
> > What we're *really* interested in, in most cases, is `__has_trivial_construct<A, T, T&&> && __has_trivial_destroy<A, T>`. We don't care if there happens to exist an obscure overload such as `A::construct(T*, Widget, Gadget)` as long as it is not selected. This is particularly relevant to `scoped_allocator_adaptor`, whose `construct` is trivial for primitive types but non-trivial for allocator-aware types.
> > 
> > Also, when we write out the template type parameters we care about, we can do the decltype stuff really easily, without having to "fudge" the metaprogramming and possibly get it wrong. For example, if `A::construct` is an overload set, in which case the `&_Xp::construct` on this patch's line 1492 will be ill-formed even though a non-trivial `construct` does actually exist.
> I agree with benefits of using `__has_trivial_construct_and_destroy` in general but in this case I don't see a need for `__has_trivial_destroy`. `__construct_range_forward` only copies new elements and it isn't concerned with destruction. Probably for some cases we can meld destroy+construct together but I think that is out of scope for the current patch.
> 
> Arthur, can you please give some examples of possible problems with `scoped_allocator_adaptor`. I didn't work with it closely and don't really understand your concern.
> 
> I wish I could avoid "fudging" the metaprogramming  but we need to support C++03 and I don't see other alternatives. And in C++03 `A::construct` shouldn't be an overload set so my understanding is that it is OK to use `&_Xp::construct`.
> in this case I don't see a need for `__has_trivial_destroy`

Agreed. I'm happy to split up my proposed thing into `__has_trivial_construct<Alloc, T, Args...>` (which you need for the C++11 version of this patch) and `__has_trivial_destroy<Alloc, T>` (which you do not need, so let's not add it).

> And in C++03 `A::construct` shouldn't be an overload set so my understanding is that it is OK to use `&_Xp::construct`.

N1905, Table 34, expresses the syntactic constraint in the usual way: `a.construct(p,t)` must be a well-formed expression. It doesn't say anything about "...and it can't be a template or an overload set."
My understanding is that libc++ allows you to say `decltype` even in C++03; they emulate it using `__typeof` and so on. So I'm thinking it should be fine to SFINAE on `decltype(a.construct(p,t))` directly and avoid the "fudgy" use of `&_Xp::construct`.

> can you please give some examples of possible problems with `scoped_allocator_adaptor`

It's not a "problem" per se, but a missed optimization. Consider this test case:
https://wandbox.org/permlink/YbFfJRWvS4W9OxJ5

    using SAA = std::scoped_allocator_adaptor<std::allocator<T>>;
    SAA alloc;
    std::vector<T, SAA> vec(alloc);
    vec.emplace_back(std::move(t));

When "T" is a uses-allocator type, `__has_trivial_construct<SAA, T>` is false. But when "T" is *not* a uses-allocator type (such as my `class Unaware` in the wandbox), then we would really like `__has_trivial_construct<SAA, T>` to come out as "true", because
- then we could skip `SAA::construct` and directly call T's constructor; and if T was trivially constructible,
- then we could further optimize that.

The `scoped_allocator_adaptor::construct` method always exists. It is sometimes trivial and sometimes not trivial, depending on `T`.
Here's an optimization-minded example, building on the previous example:
https://wandbox.org/permlink/jtByy3fzIOWZAVu2
Both `Aware<true>` and `Aware<false>` are trivially move-constructible. `vector` is allowed to as-if-memcpy `Aware<false>` wherever it wants. `vector` is *definitely not* allowed to as-if-memcpy `Aware<true>` wherever it wants. `vector`'s allocator type `SAA` is the same in both cases; the variable that changes is the type of `T`. So therefore, if we want to distinguish these two cases, we must make our `__has_trivial_construct<Alloc, T>` trait take `T` as a template parameter. If we go with `__has_trivial_construct<Alloc>`, we lose some potential for optimization.

I believe the same argument applies to get us all the way to `__has_trivial_construct<Alloc, T, Args...>`, but I'm too tired to think about that right now. :)


https://reviews.llvm.org/D48342





More information about the cfe-commits mailing list