[libcxx-commits] [PATCH] D130631: [libc++] Implement P0288R9 (move_only_function)

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Sep 29 10:01:36 PDT 2022


ldionne requested changes to this revision.
ldionne added subscribers: jwakely, fhahn, rsmith.
ldionne added a comment.
This revision now requires changes to proceed.

In the interest of making progress on this, let's do the following concrete action items in addition to what I left in the comments:

1. Extract a separate class to handle the SBO.
2. Survey other implementations and also other open source projects to see what approach they've taken.

The tricky thing with this patch is that several design questions are intertwined: support for trivial-relocation, size of the SBO and how to store the "virtual functions" are all interdependent.



================
Comment at: libcxx/include/__config:141-142
 #    define _LIBCPP_ABI_DO_NOT_EXPORT_TO_CHARS_BASE_10
+// Enable clang::trivial_abi for std::move_only_function
+#    define _LIBCPP_ABI_MOVE_ONLY_FUNCTION_TRIVIAL_ABI
 #  elif _LIBCPP_ABI_VERSION == 1
----------------
Since `move_only_function` is new, why do we need an ABI macro to control this?

Edit: @philnik tells me that's because GCC doesn't support the attribute. What a shame. Since we want to keep code ABI compatible between Clang and GCC (when compiled using libc++), I guess we do need to hide that.

@philnik Can you please file a bug against GCC to start supporting the `trivial_abi` attribute and paste a link here? I think this is a pretty serious missed optimization that would be so easy to get. CC @jwakely for awareness.


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:1
+//===----------------------------------------------------------------------===//
+//
----------------
General comment 1:

I think we should extract the logic to implement the small buffer optimization into its own piece instead of implementing it inside `move_only_function`. This isn't the last time we'll have to do this kind of optimization, and in fact it would have been nice to have this when `std::function` was written too. I can help with that, I have a local copy of an implementation of a `__sbo` class I had written when I did my own stab at `any_invocable` (which is now `move_only_function`), a long time ago.


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:5
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
----------------
General comment 2:

Fundamentally, we need to decide whether we're going to implement the function call dispatching using a classic vtable, or generate a "vtable" ourselves (either inlined in the object like you do right now, or not). This has to be thought through, but my initial preference would be to go for a manual vtable with one layer of indirection, since IIRC that's what I had the best results with when I investigated this with Dyno. IIRC, the observation was that you don't really pay for the additional level of indirection if you're calling the function over and over again, since it's hot in the cache anyway. Instead, you're better to just reduce the size of the object. This is a really tough call to make though, because we're operating with a lot of assumptions and we have to make this work well for all possible workloads, which is simply impossible.

Probably relevant things to look at (sorry for the plug): https://github.com/ldionne/dyno and http://ldionne.com/cppnow-2018-runtime-polymorphism. It's been a while but I had done a survey of various techniques in that area and I think the conclusions might still be relevant, even if that's a little bit dated.


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:64
+template <class...>
+class move_only_function;
+
----------------
General comment: Please make sure you have `_LIBCPP_ASSERT`s in place for UB in this class, and tests that these assertions work as intended.


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:122
+
+    __call_ = &__function_wrappers<_Func>::__call;
+    if constexpr (!is_trivially_destructible_v<decay_t<_Func>>) {
----------------
philnik wrote:
> Can the indirection be avoided for function pointers?
The only way I can see that working is to `reinterpret_cast` the function pointer itself to the type that `__call_` expects, and assign it to `__call_` directly. However, that would assume that the `__function_wrappers<_Func>::__call` thunk is truly a no-op, which I think is always the case.

I think this is technically UB though, because you have to cast the function pointer back to its real type before calling it (which is exactly the point of the thunk). This is something that's extremely hard to express within the language, however it's also something I would perhaps expect to be trivial to optimize for the compiler. After all, the compiler will basically see a thunk that does absolutely nothing except jump somewhere else. I would expect that the compiler can just hoist one level of indirection.

Note: As pointed out by @philnik in live review, indeed functions can't have the same address, so I don't think this is an optimizations that compilers can perform. That being said, I think this would be doable if we had a way to tell the compiler that we don't care about the address of that object. I'm just thinking out loud, but should be we allowed to apply `[[no_unique_address]]` to functions? @rsmith @fhahn Any thoughts?


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:243
+
+  static constexpr size_t __buffer_size_ = 4 * sizeof(void*);
+
----------------
IMO this is too large. I believe that types with `sizeof(T) == 32` AND that are trivially move constructible/destructible AND that we want to type erase are going to be quite rare. I think that most types will be either smaller, or will be non-trivial. Shooting from the hip, I would use something like `2*sizeof(void*)` at most, but we definitely need to look at what other implementations are doing, and also what other common utilities like this do (let's do a quick survey in open source libraries, e.g. have Folly or Abseil already shipped something like this, and if so, is there something we can learn from their experience?).


================
Comment at: libcxx/include/__functional/move_only_function_impl.h:251
+  static constexpr bool __fits_in_buffer_impl =
+      is_trivially_move_constructible_v<_Func> && is_trivially_destructible_v<_Func> &&
+      sizeof(_Func) <= __buffer_size_ && alignof(_Func) <= alignof(void*);
----------------
I think we should make it possible to store non-trivially-fooable types in the small buffer. IIUC you only need that constraint in order to implement `swap`, but you should be able to handle that by adding yet another function to the vtable. This would create a size issue right now, but wouldn't be a concern anymore if we move to a remote vtable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130631/new/

https://reviews.llvm.org/D130631



More information about the libcxx-commits mailing list