[libcxx-commits] [libcxx] [libc++] Fix UB in <expected> related to "has value" flag (#68552) (PR #68733)

via libcxx-commits libcxx-commits at lists.llvm.org
Sun Oct 15 06:05:24 PDT 2023


Jan =?utf-8?q?Kokemüller?= <jan.kokemueller at gmail.com>,
Jan =?utf-8?q?Kokemüller?= <jan.kokemueller at gmail.com>,
Jan =?utf-8?q?Kokemüller?= <jan.kokemueller at gmail.com>,
Jan =?utf-8?q?Kokemüller?= <jan.kokemueller at gmail.com>
Message-ID:
In-Reply-To: <llvm/llvm-project/pull/68733/libcxx at github.com>


philnik777 wrote:

> > > I think we can still put at least `__has_val_` into the padding bits, since we control that. It should be good enough to add a `char __padding_[sizeof(expected) - __libcpp_datasizeof<expected>::value];` to avoid accidental overwrites of user data.
> > 
> > 
> > Ah, so then `__has_val_` would always be the last byte of the `expected`, so `expected` _itself_ would never have any tail padding? I think this is a great idea!
> > As an alternative, I wonder if the current ABI is salvageable by saving/restoring tail padding in all "dangerous" functions (i.e. swap, emplace, assign, etc.) with something like:
> > ```
> >   struct __tail_padding_saver {
> >     static constexpr auto __padding_offset = __libcpp_datasizeof<expected>::value;
> >     static constexpr auto __padding_size = sizeof(expected) - __padding_offset;
> > 
> >     expected *__e;
> >     unsigned char __pad[__padding_size];
> > 
> >     explicit __tail_padding_saver(expected *__e) : __e(__e) {
> >       __builtin_memcpy(__pad, reinterpret_cast<unsigned char *>(__e) + __padding_offset, __padding_size);
> >     }
> > 
> >     ~__tail_padding_saver() {
> >       __builtin_memcpy(reinterpret_cast<unsigned char *>(__e) + __padding_offset, __pad, __padding_size);
> >     }
> >   };
> > ```
> > 
> > 
> > Or even have some helper functions
> > ```
> > construct_at_until(std::addressof(__union_.__val_), &__has_val_, std::forward<_Args>(__args)...);
> > destroy_at_until(std::addressof(__union_.__val_), &__has_val_, std::forward<_Args>(__args)...);
> > ```
> > 
> > 
> > ...that save/restore all needed bytes "under the hood"?
> 
> I think we are simply talking about
> 
> ```
> template <class T, class E>
> class expected {
> [[no_unique_address]] union_t union_;
> [[no_unique_address]] bool has_value_;
> char padding_[padding_size];
> };
> ```
> 
> so that `has_value_` can still goes into the tailing padding of `T` and also never going to write over beyond `expected` if the user has
> 
> ```
> struct User {
>   [[no_unique_address]] expected<T, E> _x;
>   [[no_unique_address]] bool _y;
> };
> ```
> 
> I think this fixes the problem but break the ABI. Another point is that if we are going to break the ABI anyway, maybe another option is to get rid of `no_unique_address` and does what exactly stated in the spec. (The spec uses anonymous union and we used named union (in order to have a type to apply `no_unique_address` ).

We could get rid of `[[no_unique_address]]` completely, but IMO it's a pretty nice property that `expected<optional<int>, int>` or similar constructs still fit in a register. I would expect that `std::expected` is used mostly for return types, so being able to squeeze our flag byte into the stored object would be really good.

https://github.com/llvm/llvm-project/pull/68733


More information about the libcxx-commits mailing list