[libcxx-commits] [PATCH] D116334: [libc++] Remove std::basic_string's base class in ABIv2

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Jan 11 09:05:38 PST 2022


ldionne requested changes to this revision.
ldionne added a comment.
This revision now requires changes to proceed.

In D116334#3211435 <https://reviews.llvm.org/D116334#3211435>, @Mordante wrote:

> I like to hear @ldionne's opinion why it wasn't done in D111173 <https://reviews.llvm.org/D111173>

Probably oversight. In D111173 <https://reviews.llvm.org/D111173>, I was solving a specific problem I saw internally, and I had no reason to expand the scope of the patch to removing the base class in the unstable ABI.

I am favourable to this change, but we'll have to tackle my comment (or show me I'm wrong).

Also, perhaps we should have a similar ABI flag for `std::vector`, which does the same thing?



================
Comment at: libcxx/include/__config:111
+// Remove basic_string common base
+#  define _LIBCPP_ABI_STRING_REMOVE_BASE
 #elif _LIBCPP_ABI_VERSION == 1
----------------
Quuxplusone wrote:
> Mordante wrote:
> > This looks more consistent with the other ABI macros.
> I agree that the name shouldn't involve `REMOVE`; it should describe the new state of affairs, not just a diff against the obsolete state of affairs. FWLIW, I'd have said `_LIBCPP_ABI_NO_BASIC_STRING_BASE_CLASS` — `NO_STRING_BASE` just seems too short/vague. But I acknowledge I'm suffering from a //little// bit of "obscure things should have obscure scary names" (the same reason we've written `template<class ...>` in front of templates for 30 years, you know), and `_LIBCPP_ABI_NO_STRING_BASE` //is// consistent with `_LIBCPP_ABI_NO_BINDER_BASES` and `_LIBCPP_ABI_NO_ITERATOR_BASES`, both of which I myself introduced.
> 
> Also, I notice that `_LIBCPP_ABI_ALTERNATE_STRING_LAYOUT` already exists. I suspect we //cannot// combine these two ABI flags into one, because that would mess with people who are (unsupportedly) a-la-carte'ing that flag today. But I'd like to hear @ldionne's take on that idea anyway.
> 
> For reference, right now our two string-related ABI flags are `_LIBCPP_ABI_ALTERNATE_STRING_LAYOUT` and `_LIBCPP_ABI_STRING_OPTIMIZED_EXTERNAL_INSTANTIATION`. Yet another option for this one would be `_LIBCPP_ABI_STRING_NO_BASE_CLASS`.
> 
> TLDR: I'm reasonably happy with @Mordante's suggested edit. ;)
> Also, I notice that `_LIBCPP_ABI_ALTERNATE_STRING_LAYOUT` already exists. I suspect we cannot combine these two ABI flags into one, because that would mess with people who are (unsupportedly) a-la-carte'ing that flag today. But I'd like to hear @ldionne's take on that idea anyway.

Indeed, ABI flags are meant to be a-la-carte'ed. For example, when we shipped libc++ on ARM64, we enabled the new string layout (see `__config`):

```
#if defined(__APPLE__) && !defined(__i386__) && !defined(__x86_64__) &&       \
    (!defined(__arm__) || __ARM_ARCH_7K__ >= 2)
#  define _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT
#endif
```

So individual ABI flags need to stay stable once they are introduced.

Regarding the name, I much prefer `_LIBCPP_ABI_NO_BASIC_STRING_BASE_CLASS` too. It more clearly explains what it does.



================
Comment at: libcxx/include/string:1724
     void __throw_out_of_range() const {
-#ifndef _LIBCPP_NO_EXCEPTIONS
-        __basic_string_common<true>::__throw_out_of_range();
-#else
-        _VSTD::abort();
-#endif
+        _VSTD::__throw_out_of_range("basic_string");
     }
----------------
philnik wrote:
> Mordante wrote:
> > Quuxplusone wrote:
> > > (1) My understanding is that the `_LIBCPP_NO_EXCEPTIONS` branch needs to remain. Someone who compiles with `-D_LIBCPP_NO_EXCEPTIONS` must not see any calls to `__throw_out_of_range` popping up at link time.
> > > 
> > > (2) I have the (more nebulous) impression that we call a dedicated library function `__basic_string_common<true>::__throw_out_of_range()` instead of `__throw_out_of_range("basic_string")` on purpose, because `std::string` is used so heavily, and we don't want the user to pay for the 13-byte string literal `"basic_string"` in every single one of their TUs. (This rationale does not apply in lesser-used places such as `deque`.) However, off the top of my head I don't know that we have ever benchmarked this or that anyone really cares.
> > (1) No that branch can be removed. I had the same initial feeling but `include/stdexcept` has the branch.
> > 
> > (2) That may be an issue indeed. I wonder whether the linker is/can be smart enough to remove the duplicates.
> > 
> I ran a few tests now. My general conclusion would be to put the throws directly into the header. The TU size doesn't really matter I think, because even changing the build location can have a bigger impact on the build directory size. (At least with CMake)
> 
> I used `export LINK_FLAGS="-nostdlib++ -L$LLVM/default/build/lib -Wl,-rpath,$LLVM/default/build/lib -lc++"; CC=clang CXX=clang++ cmake -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang" -DCMAKE_BUILD_TYPE=Release ../../../llvm/ -DCMAKE_CXX_FLAGS="-nostdinc++ -nostdlib++ -isystem $LLVM/default/build/include/c++/v1" -DCMAKE_EXE_LINKER_FLAGS="$LINK_FLAGS" -DCMAKE_SHARED_LINKER_FLAGS="$LINK_FLAGS" -DLLVM_USE_LLD=On -DLLVM_ENABLE_EH=On -DLLVM_ENABLE_RTTI=On (-DLLVM_ENABLE_LTO=Thin)` to generate the build.
> I got these results:
> 
> 
> | throw in header | ThinLTO | clang-14 binary size (bytes)  |
> | yes | no | 163250384 |
> | no | no | 163250400 |
> | yes | yes | 182829280 |
> | no | yes | 182839312 |
> 
> I guess that especially with ThinLTO the functions can now be inlined or eliminated.
> I also timed the first tests but there was no significant difference between the two.
If you read the description in D111173, I explain that one of the ideas behind hiding the exception-throwing-business behind a function call is this:

> Encapsulate the code to throw an exception (which is non-trivial) in an externally-defined function so that the important code paths that call it (e.g. `vector::at`) are free from that code. Basically, the intent is for the "hot" code path to contain a single conditional jump (based on checking the error condition) to an externally-defined function, which handles all the exception-throwing business.

If you look at the code generation for `std::vector::at` before and after your change, you should see that it's more complicated after the change, because there's code to throw the exception. Before, it was a conditional jump to the function defined in the dylib, which then contained that business. I remember looking at the codegen and coming to that conclusion.

This patch shouldn't regress that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116334/new/

https://reviews.llvm.org/D116334



More information about the libcxx-commits mailing list