[libcxx-commits] [PATCH] D121514: [libc++][format] Improve format-arg-store.

Mark de Wever via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Mon Mar 14 14:36:03 PDT 2022


Mordante marked 9 inline comments as done.
Mordante added a comment.

Thanks for your review comments! I'll look at the other comments at another time.



================
Comment at: libcxx/include/__format/format_arg_store.h:264
+template <class _Context, class _Tp, class... _Args>
+consteval size_t __get_storage_size(size_t __size) {
+  auto __s = __make_storage_type<_Context, _Tp>();
----------------
EricWF wrote:
> Why not use a fold expression here rather than recursively instantiating all these function templates.
> Please find a way to make the number of instantiations `N`, not `N**2`
I never realized you could use fold expressions on the template types.
Thanks for your suggestion!


================
Comment at: libcxx/include/__format/format_arg_store.h:287
+
+// Most exposition only constructors in [format.arg] are noexcept.
+// But all can be, so use noexcept for this function.
----------------
EricWF wrote:
> This comment doesn't make sense to me? If the constructors can possibly be non-noexcept, does that mean they can throw? 
They don't throw. The constructor `explicit basic_format_arg(const char_type* s);` isn't no except but it doesn't throw.


================
Comment at: libcxx/include/__format/format_arg_store.h:291
+_LIBCPP_HIDE_FROM_ABI constexpr void __store_data(char* __data, _Tp&& __v, _Args&&... __args) noexcept {
+  using _Sp = typename decltype(__make_storage_type<_Context, remove_cvref_t<_Tp>>())::type;
+  if constexpr (same_as<_Sp, basic_string_view<typename _Context::char_type>>) {
----------------
EricWF wrote:
> I'm very uncomfortable with this function. How does it handle alignment? non-trivial types? Is it opaque to sanitizers?
> Does it leave us open to buffer overflow bugs/attacks? 
It doesn't handle alignment it stores all elements packed in a buffer as suggested in http://eel.is/c++draft/format#args-1

```
An instance of basic_­format_­args provides access to formatting arguments.
Implementations should optimize the representation of basic_­format_­args for a small number of formatting arguments.
[Note 1: For example, by storing indices of type alternatives separately from values and packing the former.  — end note]
```
The types that can be used are all trivial.
I haven't tested whether it works with sanitizers, but if it doesn't the CI will complain.
This should be safe against buffer overflow attacks. All sizes are determined compile-time based on the types.
If there are bugs in our implementation I expect them to be cause by the sanitizers in the CI.


================
Comment at: libcxx/include/__format/format_arg_store.h:300
+    basic_string_view<typename _Context::char_type> __s{_VSTD::data(__v), _VSTD::size(__v)};
+    _VSTD::memcpy(static_cast<void*>(__data), static_cast<const void*>(_VSTD::addressof(__s)), sizeof(_Sp));
+  } else if constexpr (same_as<_Sp, typename basic_format_arg<_Context>::handle>) {
----------------
EricWF wrote:
> How is any of this `constexpr`?
> https://godbolt.org/z/esKTzK38b
> 
> If I'm right that this isn't, it means we need a lot more tests before this lands. Because it should have been caught by tests.
> 
> 
It seems it indeed isn't. It seems this function shouldn't be a `constexpr` function. Which explains why the code works. There are no `constexpr` tests so that's the reason the tests didn't catch it either.


================
Comment at: libcxx/include/__format/format_arg_store.h:305
+  } else {
+    __storage<_Sp> __s{_VSTD::forward<_Tp>(__v)};
+    _VSTD::memcpy(static_cast<void*>(__data), static_cast<const void*>(_VSTD::addressof(__s.__value)), sizeof(_Sp));
----------------
EricWF wrote:
> This needs to `static_assert(is_trivially_copyable<_Sp>);` and do we really need to instantiate another type just to memcpy this thing?
> What are the requirements on `_Sp` generally?
Yes we need that to convert certain types. For example all signed integrals where `sizeof(_Tp) < sizeof(int)` are converted to an `int`. The stored types match the types in the `basic_format_arg` value "variant". (This variant is exposition only and not used on the code, but these are the types stored in our implementation.)


================
Comment at: libcxx/include/__format/format_arg_store.h:306
+    __storage<_Sp> __s{_VSTD::forward<_Tp>(__v)};
+    _VSTD::memcpy(static_cast<void*>(__data), static_cast<const void*>(_VSTD::addressof(__s.__value)), sizeof(_Sp));
+  }
----------------
EricWF wrote:
> `sizeof(__storage<_Sp>)` may not always be equal to `sizeof(_Sp)`.
I'm not convinced that's true. But it doesn't matter, the code copies `__s.__value`. `__s.__value` has the type `_Sp`.


================
Comment at: libcxx/include/__format/format_args.h:45
   template <class... _Args>
-  _LIBCPP_HIDE_FROM_ABI basic_format_args(
-      const __format_arg_store<_Context, _Args...>& __store) noexcept
-      : __size_(sizeof...(_Args)), __data_(__store.__args.data()) {}
+  _LIBCPP_HIDE_FROM_ABI basic_format_args(const __format_arg_store<_Context, _Args...>& __store) noexcept
+      : __size_(sizeof...(_Args)) {
----------------
EricWF wrote:
> Should this be explicit?
Unfortunately no. The Standard specifies a non-explicit single argument constructor (http://eel.is/c++draft/format#args-3).



================
Comment at: libcxx/include/__format/format_args.h:65
 private:
   size_t __size_{0};
+  const __format::__arg_meta_data* __meta_data_{nullptr};
----------------
EricWF wrote:
> Is there a reason to pack the members in this order? It seems likely to generate padding for certain arg types
I'm not entirely sure what you mean. This always stores a `size_t` and two pointers regardless of the arg types.

I'm not aware of platforms where the sizes of pointers and `size_t` differ, is there a specific platform you have in mind?


================
Comment at: libcxx/include/__format/format_args.h:66
   size_t __size_{0};
-  const basic_format_arg<_Context>* __data_{nullptr};
+  const __format::__arg_meta_data* __meta_data_{nullptr};
+  const char* __data_{nullptr};
----------------
EricWF wrote:
> Have we shipped this in a release? 
> Because I'm not sure we can change the A-BI here then.
We have shipped this, but vendors need to opt-in to it due to future API/ABI breaks.
We also advertise `<format>` as API/ABI-unstable.
There are more API/ABI breaking changes planned in the code. For example
* http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2418r2.html has been retroactively accepted in C++20 at the October 2021 plenary. (I've done some work on it, but it's not ready for review yet.)
* https://cplusplus.github.io/LWG/issue3576 is discussed in SG16.

During your absence we have identified this as an issue and we need to find a better solution to be able to implement large features while being able to ship stable code. (Ranges have the same issue. There the number of retroactively accepted papers is even larger.)


================
Comment at: libcxx/test/libcxx/utilities/format/format.arguments/format.arg/format_arg_store.pass.cpp:1
+//===----------------------------------------------------------------------===//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
----------------
EricWF wrote:
> We typically avoid testing internal details like this. Can you say more about why we really
I usually also tend to avoid that. In this case I think the test if valuable. During development of the code it make it possible to quickly test whether things worked as expected. Here I test whether the buffers have the expected sizes. When these tests fail there are possible buffer bugs. The indirect tests don't give me the same confidence level.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121514/new/

https://reviews.llvm.org/D121514



More information about the libcxx-commits mailing list