[libcxx-commits] [PATCH] D99624: [libc++] Remove UB in std::list

Arthur O'Dwyer via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Mar 31 12:03:20 PDT 2021

Quuxplusone added inline comments.

Comment at: libcxx/include/list:273
-    _Tp __value_;
+    union { _Tp __value_; };
ldionne wrote:
> Quuxplusone wrote:
> > I wonder whether you want to have a `char` in this union too, as the first element. (Have I been cargo-culting that idiom around? Godbolt seems to indicate that maybe the rules changed between '03 and '11?)
> > 
> > Also, of course, you never call the destructor of `_Tp` anywhere anymore, so I //really// hope some tests fail on buildkite. ;)
> I'm not aware of the `char`-in-the-union idiom, can you explain what it does?
> Also, we do call the destructor for `_Tp` in `__list_imp<_Tp, _Alloc>::clear()`. I haven't fully audited the code, but I don't think we called the destructor for `__list_node` anywhere previously, so adding the `union` doesn't change anything.
Ah, I get it. Okay.
The char-in-the-union idiom is just `union { char __dummy_; _Tp __value_; };` — put a trivial type as the first member of the union. libc++ does that in `__optional_destruct_base` and also in `__union` (in `<variant>`). But on slightly closer inspection, I actually can't figure out what the purpose of that dummy char is. I thought it was to make the union default-constructible and/or give it a non-deleted destructor, but that actually doesn't seem to be the case.
I may poke into this a bit more, but basically I don't know what the char-in-the-union idiom is for anymore! Hence, maybe I've been cargo-culting it into places where it's unnecessary.

Comment at: libcxx/include/list:1122
       return __hold_pointer(__p, __node_destructor(__na, 1));
ldionne wrote:
> Quuxplusone wrote:
> > As I understand it, there is no UB here as of C++20: this is the main point of @rsmith 's [P0593 "Implicit creation of objects for low-level object manipulation"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0593r4.html#objects-provided-as-byte-representation). When we get the memory for a node from the allocator, the allocator ultimately gets it from some operation that "implicitly creates objects." Essentially, the C++20 compiler is required to assume that `malloc` //might// be implemented as
> > ```
> > void *malloc(size_t n) {
> >     __node *p = real_malloc(n);
> >     ::new (p) __node();
> >     p->__value_.~_Tp();
> >     return p;
> > }
> > ```
> > (I mean, the compiler can't prove `malloc` //isn't// implemented that way!) So libc++'s code //is// allowed to assume that the implicitly created object was in fact created, and the compiler //isn't// allowed to optimize as-if the object wasn't created. Thus: no UB. (Compiler vendors are, technically, allowed to follow a different object model in C++03/11/14/17 mode; but I would be shocked.)
> > 
> > I describe this "compiler magic" also in ["Trivially Relocatable"](https://www.youtube.com/watch?v=SGdfPextuAU&t=2715s), using `std::list` specifically as my example.
> From P0593:
> > A class `S` is an implicit-lifetime class if it is an aggregate ([dcl.aggr]) or has at least one trivial, non-deleted constructor and a trivial, non-deleted destructor.
> `__list_node` is not an aggregate (it has a user-declared constructor). It doesn't have a trivial non-deleted constructor AFAICT either, so it's not an implicit-lifetime type. Hence, my understanding is that no object is implicitly created, and it's still UB. I could be wrong, but that's my understanding.
Hm. More likely I'm wrong.
We could easily give `__list_node` at least one trivial, non-deleted constructor, right?
I'm not sure if it's possible to give `__list_node` a trivial, non-deleted destructor. (I think this interacts with my confusion about the char-in-union thing. I don't know the union rules well.)

  rG LLVM Github Monorepo



More information about the libcxx-commits mailing list