[libcxx-commits] [PATCH] D117108: [libc++] Remove vector base class

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Feb 1 10:35:01 PST 2022


philnik added inline comments.


================
Comment at: libcxx/include/vector:316
 template <class _Tp, class _Allocator>
 class __vector_base
     : protected __vector_base_common<true> // This base class is historical, but it needs to remain for ABI compatibility
----------------
Quuxplusone wrote:
> ldionne wrote:
> > philnik wrote:
> > > ldionne wrote:
> > > > I was trying to convince myself that it wouldn't be ABI breaking to do the following:
> > > > 
> > > > ```
> > > > template <bool> struct __vector_base_common { ... };
> > > > template <class, class> struct __vector_base : private __vector_base_common<true> { /* nothing here */ };
> > > > 
> > > > template <class _Tp, class _Allocator>
> > > > class vector
> > > > #if !defined(_LIBCPP_ABI_NO_VECTOR_BASE_CLASS)
> > > >   : private __vector_base<_Tp, _Allocator>
> > > > #endif
> > > > {
> > > >   pointer                                                      __begin_;
> > > >   pointer                                                      __end_;
> > > >   __compressed_pair<pointer, allocator_type>                   __end_cap_;
> > > > 
> > > >   // use members normally here, unconditionally
> > > > };
> > > > ```
> > > > 
> > > > However, the usual pitfall of multiply inheritting from `vector<T>` applies here too. Let's assume we have:
> > > > 
> > > > ```
> > > > struct IntVector1 : std::vector<int> { };
> > > > struct IntVector2 : std::vector<int> { };
> > > > struct Foo : IntVector1, IntVector2 { };
> > > > ```
> > > > 
> > > > Before the change, the layout should be:
> > > > 
> > > > ```
> > > > +---------------------------------+
> > > > | __vector_base_common<true>      | 0 bytes, EBO'd
> > > > | __vector_base<int>::__begin_    | 8 bytes
> > > > | __vector_base<int>::__end_      | 8 bytes
> > > > | __vector_base<int>::__end_cap_  | 8 bytes (disregarding allocator)
> > > > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
> > > > | __vector_base_common<true>      | 1 byte, can't EBO
> > > > | __vector_base<int>::__begin_    | 8 bytes
> > > > | __vector_base<int>::__end_      | 8 bytes
> > > > | __vector_base<int>::__end_cap_  | 8 bytes (disregarding allocator)
> > > > +---------------------------------+
> > > > ```
> > > > 
> > > > After the change, the layout should be:
> > > > 
> > > > ```
> > > > +---------------------------------+
> > > > | __vector_base_common<true>      | 0 bytes, EBO'd
> > > > | __vector_base<int>              | 0 bytes, EBO'd
> > > > | vector<int>::__begin_           | 8 bytes
> > > > | vector<int>::__end_             | 8 bytes
> > > > | vector<int>::__end_cap_         | 8 bytes (disregarding allocator)
> > > > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
> > > > | __vector_base_common<true>      | 1 byte, can't EBO
> > > > | __vector_base<int>              | 1 byte, can't EBO <<-- this is the ABI break
> > > > | vector<int>::__begin_           | 8 bytes
> > > > | vector<int>::__end_             | 8 bytes
> > > > | vector<int>::__end_cap_         | 8 bytes (disregarding allocator)
> > > > +---------------------------------+
> > > > ```
> > > > 
> > > > Since `__vector_base<int>` would now be empty, it would get EBO'd the first time it's a base class, but not the second time, so there would be one extra byte because of that "spurious" base class.
> > > > 
> > > > Long story short, my suggestion doesn't work because it would break multiple inheritance of `std::vector`.
> > > https://godbolt.org/z/rdMfbchKq seems to suggest that something with your tables is wrong. My current diff has still the same size with your example.
> > Of course!!!! Thanks for challenging this, my tables were wrong indeed. See for example https://godbolt.org/z/Wdz1Kasxn.
> > 
> > What I did wrong is that I blindly did as if the EBO would stop kicking in for a duplicate base class. In reality, EBO doesn't kick in if it would cause the two base classes (or base class and first member) to have the same address. Basically, EBO always kicks in except if it would cause two objects to have the same address.
> > 
> > This is an issue with e.g. `std::reverse_iterator` (as explained in https://quuxplusone.github.io/blog/2021/05/07/std-iterator-as-a-base-class/) because `std::reverse_iterator` inherits from `std::iterator` (which is empty and hence EBO'd), and then its first member is also something that might inherit from `std::iterator` (making the two `std::iterator` subobjects overlap if EBO is used).
> > 
> > However, since `std::vector` is never an empty class, the address of `__vector_base_common` is never going to overlap, because there's always a couple of pointers between one `__vector_base` and the next. So EBO can always be used.
> > 
> > Hence, I don't think this is an ABI break at all, and in fact I now believe we can even remove the base class altogether, without any check. In that case, I would move
> > 
> > ```
> > template <bool>
> > struct __vector_base_common;
> > 
> > template <>
> > struct __vector_base_common<true> {
> >     // Both are defined in vector.cpp
> >     _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_length_error() const;
> >     _LIBCPP_NORETURN _LIBCPP_EXPORTED_FROM_ABI void __throw_out_of_range() const;
> > };
> > ```
> > 
> > To the `.cpp` file and wouldn't define it in the headers at all. Then, we can rename `_LIBCPP_ABI_NO_VECTOR_BASE_CLASS` to `_LIBCPP_ABI_NO_VECTOR_BASE_COMMON` and have it control only whether the library provides those symbols.
> > 
> > I'd like to have the logic above counter checked, @Quuxplusone maybe?
> > 
> > Also, the same logic should apply to `std::basic_string`.
> That sounds right to me.
> So the new <vector> header would unconditionally //not// call into the dylib to throw exceptions; but we'd keep the exception-throwing functions present in the dylib (under `!defined(_LIBCPP_ABI_NO_VECTOR_BASE_COMMON)`) for the benefit of old .o files that were compiled against the previous release's <vector> header.
> 
> > Also, the same logic should apply to `std::basic_string`.
> 
> So I'd say the name of the ABI macro should be more general, and control both base class types together? Right now we have `_LIBCPP_ABI_NO_BASIC_STRING_BASE_CLASS` and `_LIBCPP_ABI_NO_VECTOR_BASE_CLASS`. We could make it `_LIBCPP_ABI_NO_COMMON_CONTAINER_BASE_EXPORTS` or something like that.
I would keep them seperate. There is no technical reason, but I think `_LIBCPP_ABI_NO_COMMON_CONTAINER_BASE_EXPORTS` sounds too much like 'there are no more common base classes in containers'. That might be true, but I didn't check and don't want to guarantee it. There might be an actual use case somewhere, and having two macros doesn't really hurt anything.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117108/new/

https://reviews.llvm.org/D117108



More information about the libcxx-commits mailing list