[PATCH] D55045: Add a version of std::function that includes a few optimizations.

Samuel Benzaquen via Phabricator reviews at reviews.llvm.org
Fri Dec 7 10:11:20 PST 2018

sbenza added inline comments.

Comment at: include/functional:1673
+    mutable typename aligned_storage<2 * sizeof(void*)>::type __small;
+    void* __large;
Yesterday richardsmith was telling us that we should not need or want to use aligned_storage for pretty much anything.
That we should char arrays directly, like
alignas(8) char __small[2 * sizeof(void*)];

btw, I don't think we want the default alignment from aligned_storage here. That can be 16, which is an overkill and will propagate to any other class that contains a `std::function`. We know that some allocators will be slower when requested alignment is larger than 8.

I don't mind if some functors end up in the heap because of our tighter alignment requirements. It will still be a win for all the other functors that have smaller alignment.

Comment at: include/functional:1774
+template <typename _Tp>
+using __fast_forward =
+    typename _VSTD::conditional<_VSTD::is_scalar<_Tp>::value, _Tp, _Tp&&>::type;
jsoyke wrote:
> This optimization wasn't in the last version, sbenza@ mentioned we might want to do it. It helps with the Invoke benchmarks, as much as 10-20%.
> It's probably worth mentioning that this can  hurt performance in cases where the caller is using std::forward, or passing small value types by reference, but these are very uncommon (at least in my experience).
Can you show how this can hurt?
Do you have a benchmark that gets better when this is removed?

This will only trigger if the signature of the function itself takes by value.
`std::function::operator()` will have it by value already, and the user callback will take it by value.
If we pass by ref we will be forced to put it on the stack on a different address even if the caller passed an lvalue because `operator()`'s arguments forced a copy.

We will need to read that data and pass it in by value anyway. We are only moving that lvalue-to-rvalue decay before the vtable instead of in the vtable.


  rCXX libc++



More information about the libcxx-commits mailing list