[libcxx-commits] [PATCH] D123580: [libc++] Use bit field for checking if string is in long or short mode

Wed Apr 13 09:35:24 PDT 2022

ldionne requested changes to this revision.
ldionne added a subscriber: saugustine.
ldionne added a comment.
This revision now requires changes to proceed.

I really like this cleanup, I think it makes it much easier to understand what's going on. I do have some comments, but this is looking pretty good.

================
Comment at: libcxx/include/string:1486
+        if (__s > 127)
+            __libcpp_unreachable();
+        __r_.first().__s.__size_ = __s;
----------------
philnik wrote:
> Mordante wrote:
> > philnik wrote:
> > > Mordante wrote:
> > > > Mainly curious, but why not a `_LIBCPP_ASSERT`?
> > > This ensures optimal code generation. GCC doesn't have `__builtin_assume`. This is a workaround. 
> > From a maintainability PoV I like `_LIBCPP_ASSERT` better. Would it be an option to adjust `_LIBCPP_ASSERT?` the "generate" this code when using GCC?
> I agree that `_LIBCPP_ASSERT` would be nicer to read. The problem is that using this trick for `_LIBCPP_ASSERT` could lead to extra code generation in more complex scenarios. We would have to introduce something else like `_LIBCPP_ASSUME`, which should only be used for trivial cases. But for two instances I don't think this is justified.
Can you try again with `_LIBCPP_ASSERT` by making `__libcpp_assertion_handler` `noreturn`? This is how GCC implements their own `assert(...)`, so presumably there might be some optimization opportunities for them there. I think this would be much better than using `unreachable()`.

Edit: Yuck, this indeed won't help when assertions are disabled on GCC. However, looking at the codegen, it appears that the only difference is that the compiler adds a single instruction to ensure that `__s <= 127`. Unless you can show us a case where there is a significant difference, I think the way to go is to use `_LIBCPP_ASSERT` and for GCC to be able to pick it up.

================
Comment at: libcxx/include/string:695-700
+// The __endian_factor is required, because the size_type has now one bit less space.
+// If the LSB is used in short string mode we have to multiply the size by two when the it is stored and
+// divide by two when it is loaded. In long string mode we ignore it,
+// because we can assume that we always allocate an even amount of value_types.
+// If the MSB is used the max_size() is numeric_limits<size_type>::max() / 2.
+// If the factor is 2 the LSB is used and if it is 1 the MSB is used.
----------------
IMO this adds a bit of clarity.

================
Comment at: libcxx/include/string:1492
+    void __set_short_size(size_type __s) _NOEXCEPT {
+        if (__s > 127)
+            __libcpp_unreachable();
----------------
I think I understand that you based the bound on the number representable by the `__size_` bitfield. We could also establish it based on the maximum capacity of the small string buffer by using `__s >= __min_cap` instead. (`>=` because of the null terminator).

================
Comment at: libcxx/include/string:1500-1501
+    size_type __get_short_size() const _NOEXCEPT {
+        if (__r_.first().__s.__is_long_)
+            __libcpp_unreachable();
+        return __r_.first().__s.__size_;
----------------
Same comment about using `_LIBCPP_ASSERT` -- unless we have a very convincing reason not to use it, let's use it.

================
Comment at: libcxx/include/string:3222
     size_type __m = __alloc_traits::max_size(__alloc());
-#ifdef _LIBCPP_BIG_ENDIAN
-    return (__m <= ~__long_mask ? __m : __m/2) - __alignment;
-#else
-    return __m - __alignment;
-#endif
+    return ((__m <= std::numeric_limits<size_type>::max() / 2) ? __m : __m / (2 / __endian_factor)) - __alignment;
 }
----------------
After discussing it, I think we agree this is a bit easier to digest.

================
Comment at: libcxx/include/string:3226
 template <class _CharT, class _Traits, class _Allocator>
 void
 basic_string<_CharT, _Traits, _Allocator>::reserve(size_type __requested_capacity)
----------------
I see the CI is failing on backdeployment macOS arm64. You just explained to me that this was caused by a bug where we used to return an incorrect `max_size()` in Big Endian -- the test would expect that we throw `length_error`, but we don't and instead we try allocating, which leads to `bad_alloc` instead. This patch fixes that, but since `max_size()` has been inlined into the dylib, the test will fail in backdeployment.

Since it seems that we are fixing a bug as part of this patch, let's add a regression test for it. We can then mark that test (and the failing `over_max_size.pass.cpp`) as `// XFAIL: use_system_cxx_lib && target=arm64{{.+}}-apple-macosx{{11.0|12.0}}`, with an appropriate comment.

================
Comment at: libcxx/utils/gdb/libcxx/printers.py:1
 #===----------------------------------------------------------------------===##
 #
----------------
@saugustine The pretty-printer tests have been disabled on recent Clang versions since D118067. It would be nice to fix them.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123580/new/

https://reviews.llvm.org/D123580