[libcxx-commits] [PATCH] D117108: [libc++] Remove vector base class

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Feb 3 13:52:41 PST 2022


ldionne requested changes to this revision.
ldionne added inline comments.
This revision now requires changes to proceed.


================
Comment at: libcxx/include/vector:316
 template <class _Tp, class _Allocator>
 class __vector_base
     : protected __vector_base_common<true> // This base class is historical, but it needs to remain for ABI compatibility
----------------
philnik wrote:
> Quuxplusone wrote:
> > ldionne wrote:
> > > philnik wrote:
> > > > ldionne wrote:
> > > > > I was trying to convince myself that it wouldn't be ABI breaking to do the following:
> > > > > 
> > > > > ```
> > > > > template <bool> struct __vector_base_common { ... };
> > > > > template <class, class> struct __vector_base : private __vector_base_common<true> { /* nothing here */ };
> > > > > 
> > > > > template <class _Tp, class _Allocator>
> > > > > class vector
> > > > > #if !defined(_LIBCPP_ABI_NO_VECTOR_BASE_CLASS)
> > > > >   : private __vector_base<_Tp, _Allocator>
> > > > > #endif
> > > > > {
> > > > >   pointer                                                      __begin_;
> > > > >   pointer                                                      __end_;
> > > > >   __compressed_pair<pointer, allocator_type>                   __end_cap_;
> > > > > 
> > > > >   // use members normally here, unconditionally
> > > > > };
> > > > > ```
> > > > > 
> > > > > However, the usual pitfall of multiply inheritting from `vector<T>` applies here too. Let's assume we have:
> > > > > 
> > > > > ```
> > > > > struct IntVector1 : std::vector<int> { };
> > > > > struct IntVector2 : std::vector<int> { };
> > > > > struct Foo : IntVector1, IntVector2 { };
> > > > > ```
> > > > > 
> > > > > Before the change, the layout should be:
> > > > > 
> > > > > ```
> > > > > +---------------------------------+
> > > > > | __vector_base_common<true>      | 0 bytes, EBO'd
> > > > > | __vector_base<int>::__begin_    | 8 bytes
> > > > > | __vector_base<int>::__end_      | 8 bytes
> > > > > | __vector_base<int>::__end_cap_  | 8 bytes (disregarding allocator)
> > > > > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
> > > > > | __vector_base_common<true>      | 1 byte, can't EBO
> > > > > | __vector_base<int>::__begin_    | 8 bytes
> > > > > | __vector_base<int>::__end_      | 8 bytes
> > > > > | __vector_base<int>::__end_cap_  | 8 bytes (disregarding allocator)
> > > > > +---------------------------------+
> > > > > ```
> > > > > 
> > > > > After the change, the layout should be:
> > > > > 
> > > > > ```
> > > > > +---------------------------------+
> > > > > | __vector_base_common<true>      | 0 bytes, EBO'd
> > > > > | __vector_base<int>              | 0 bytes, EBO'd
> > > > > | vector<int>::__begin_           | 8 bytes
> > > > > | vector<int>::__end_             | 8 bytes
> > > > > | vector<int>::__end_cap_         | 8 bytes (disregarding allocator)
> > > > > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
> > > > > | __vector_base_common<true>      | 1 byte, can't EBO
> > > > > | __vector_base<int>              | 1 byte, can't EBO <<-- this is the ABI break
> > > > > | vector<int>::__begin_           | 8 bytes
> > > > > | vector<int>::__end_             | 8 bytes
> > > > > | vector<int>::__end_cap_         | 8 bytes (disregarding allocator)
> > > > > +---------------------------------+
> > > > > ```
> > > > > 
> > > > > Since `__vector_base<int>` would now be empty, it would get EBO'd the first time it's a base class, but not the second time, so there would be one extra byte because of that "spurious" base class.
> > > > > 
> > > > > Long story short, my suggestion doesn't work because it would break multiple inheritance of `std::vector`.
> > > > https://godbolt.org/z/rdMfbchKq seems to suggest that something with your tables is wrong. My current diff has still the same size with your example.
> > > Of course!!!! Thanks for challenging this, my tables were wrong indeed. See for example https://godbolt.org/z/Wdz1Kasxn.
> > > 
> > > What I did wrong is that I blindly did as if the EBO would stop kicking in for a duplicate base class. In reality, EBO doesn't kick in if it would cause the two base classes (or base class and first member) to have the same address. Basically, EBO always kicks in except if it would cause two objects to have the same address.
> > > 
> > > This is an issue with e.g. `std::reverse_iterator` (as explained in https://quuxplusone.github.io/blog/2021/05/07/std-iterator-as-a-base-class/) because `std::reverse_iterator` inherits from `std::iterator` (which is empty and hence EBO'd), and then its first member is also something that might inherit from `std::iterator` (making the two `std::iterator` subobjects overlap if EBO is used).
> > > 
> > > However, since `std::vector` is never an empty class, the address of `__vector_base_common` is never going to overlap, because there's always a couple of pointers between one `__vector_base` and the next. So EBO can always be used.
> > > 
> > > Hence, I don't think this is an ABI break at all, and in fact I now believe we can even remove the base class altogether, without any check. In that case, I would move
> > > 
> > > ```
> > > template <bool>
> > > struct __vector_base_common;
> > > 
> > > template <>
> > > struct __vector_base_common<true> {
> > >     // Both are defined in vector.cpp
> > >     _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_length_error() const;
> > >     _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_out_of_range() const;
> > > };
> > > ```
> > > 
> > > To the `.cpp` file and wouldn't define it in the headers at all. Then, we can rename `_LIBCPP_ABI_NO_VECTOR_BASE_CLASS` to `_LIBCPP_ABI_NO_VECTOR_BASE_COMMON` and have it control only whether the library provides those symbols.
> > > 
> > > I'd like to have the logic above counter checked, @Quuxplusone maybe?
> > > 
> > > Also, the same logic should apply to `std::basic_string`.
> > That sounds right to me.
> > So the new <vector> header would unconditionally //not// call into the dylib to throw exceptions; but we'd keep the exception-throwing functions present in the dylib (under `!defined(_LIBCPP_ABI_NO_VECTOR_BASE_COMMON)`) for the benefit of old .o files that were compiled against the previous release's <vector> header.
> > 
> > > Also, the same logic should apply to `std::basic_string`.
> > 
> > So I'd say the name of the ABI macro should be more general, and control both base class types together? Right now we have `_LIBCPP_ABI_NO_BASIC_STRING_BASE_CLASS` and `_LIBCPP_ABI_NO_VECTOR_BASE_CLASS`. We could make it `_LIBCPP_ABI_NO_COMMON_CONTAINER_BASE_EXPORTS` or something like that.
> I would keep them seperate. There is no technical reason, but I think `_LIBCPP_ABI_NO_COMMON_CONTAINER_BASE_EXPORTS` sounds too much like 'there are no more common base classes in containers'. That might be true, but I didn't check and don't want to guarantee it. There might be an actual use case somewhere, and having two macros doesn't really hurt anything.
Thanks for counter checking the logic, @Quuxplusone. I am neutral on renaming the macros, but I suggest we keep them separate at this point because it doesn't really hurt, the patches are written, and I'd like to ship + cherry-pick them.


================
Comment at: libcxx/include/vector:649-652
+    pointer __begin_ = nullptr;
+    pointer __end_ = nullptr;
+    __compressed_pair<pointer, allocator_type> __end_cap_ =
+        __compressed_pair<pointer, allocator_type>(nullptr, allocator_type());
----------------
Quuxplusone wrote:
> I'd feel slightly better if this unrelated NSDMI-ification were done in a separate commit; but if CI is happy then I'm happy.
I don't mind doing it in this patch, however I do wonder why we're using `__compressed_pair<pointer, allocator_type>(nullptr, allocator_type())` instead of doing

```
__compressed_pair<pointer, allocator_type> __end_cap_(nullptr, __default_init_tag());
```

which is what we used to do in the default constructor of the base class. @philnik any reason?


================
Comment at: libcxx/src/vector.cpp:18-26
+template <>
+struct __vector_base_common<true> {
+  _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_length_error() const {
+    _VSTD::__throw_length_error("vector");
+  }
+  _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_out_of_range() const {
     _VSTD::__throw_out_of_range("vector");
----------------
Quuxplusone wrote:
> These member functions are now implicitly inline, because they appear in the body of a class. FWLIW I'd feel vastly safer if they appeared outside the body of the class, so that they would not be `inline`.
> I'm guessing that `_LIBCPP_EXPORTED_FROM_ABI` is some kind of magic "anti-inline" keyword that will cause them to be generated anyway, on relevant platforms? @ldionne you think this is OK (and better than defining them out-of-line)?
`_LIBCPP_EXPORTED_FROM_ABI` doesn't do anything special with respect to the `inline`-ness of the function. So indeed, they are not getting generated and that's why the CI is failing. I suggest we be consistent with D118733 and instead do:

```
template <bool>
struct __vector_base_common {
  _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_length_error() const;
  _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_out_of_range() const;
};

template <>
void __vector_base_common<true>::__throw_length_error() const {
  _VSTD::__throw_length_error("vector");
}

template <>
void __vector_base_common<true>::__throw_out_of_range() const {
  _VSTD::__throw_out_of_range("vector");
}
```



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117108/new/

https://reviews.llvm.org/D117108



More information about the libcxx-commits mailing list