[PATCH] D98477: [ADT] Add IntrusiveVariant, VariantTraits, and new STLForwardCompat

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 3 12:22:50 PDT 2021


scott.linder added a comment.

In D98477#2787409 <https://reviews.llvm.org/D98477#2787409>, @dblaikie wrote:

> In D98477#2783276 <https://reviews.llvm.org/D98477#2783276>, @scott.linder wrote:
>
>> In D98477#2780702 <https://reviews.llvm.org/D98477#2780702>, @dblaikie wrote:
>>
>>> Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.
>>>
>>> (still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 <https://reviews.llvm.org/rGc6f20d70a8c93ab3d50c9777c477fe040460a664> - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)
>>
>> Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.
>
> But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

AFAIK that isn't true: we can't assume the base subobject is at the start of the layout of the derived object unless both are standard-layout. The first example I could think of where the assumption breaks down with any reasonable compiler is multiple inheritence, e.g. in https://godbolt.org/z/K9rnxbPo9 it is impossible to know where to find the base subobject for the dynamic alternative object when all we have is a pointer to whatever is providing storage for it. Put another way, we need to already know the dynamic derived type to do this, which is the thing we are after in the first place.

The example in the above godbolt is:

  #include <cstdint>
  
  struct Base { int tag; };
  struct Unrelated { int foo; };
  struct AlternativeA : Base { int bar; };
  struct AlternativeB : Unrelated, Base { int baz; };
  
  AlternativeA A;
  AlternativeB B;
  
  uintptr_t AddrA() { return (uintptr_t)(void*)&A; }
  uintptr_t AddrAtoBase() { return (uintptr_t)(void*)(Base*)&A; }
  
  uintptr_t AddrB() { return (uintptr_t)(void*)&B; }
  uintptr_t AddrBtoBase() { return (uintptr_t)(void*)(Base*)&B; }
  uintptr_t AddrBtoUnrelated() { return (uintptr_t)(void*)(Unrelated*)&B; }

Where in trunk clang `AddrB() != AddrBtoBase()`.

Maybe there are not other examples where a reasonable compiler doesn't lay out a sole base class at the start of the derived class, but even still it violates strict aliasing if we aren't talking about standard-layout types.

I also tried to look deeper into how this might be implemented using aligned storage, which you suggested earlier. As you point out, there are potential provenance issues. The pointer to the object which provides storage cannot be cast to a derived type without `std::launder` in at least some cases, although `std::launder` doesn't exist until C++17 and it isn't clear what the correct interpretation is for a C++14 compiler. Additionally, your example of `reinterpret_cast` to access the first member is only valid for standard-layout types.

The suggestion to put the tag at the end, starting after the object representation of the biggest alternative type, seems the most feasible, assuming:

- We don't need `std::launder` pre-C++17
- No reasonable compiler we support for compiling LLVM will be modifying trailing padding, i.e. bits in the object representation that are not a part of the value representation. The only thing I found with a quick search is __builtin_clear_padding which seems to be intended for use with atomics https://reviews.llvm.org/D87974?id=292987
- We can figure out some way to know where the padding in each alternative object starts. You suggested a traits-like mechanism where the alternative itself tries to occupy all the space it expects to be padding with a `char` array, and makes our implementation aware of it. I don't think I'm happy with that approach for the reasons you mention, and because it seems very non-ergonomic for the user, much more than requiring them to invoke a macro or derive from a base class.

If we can come up with a reliable way to know what padding bits are available, and are comfortable with the fact that compilers are not interested in clobbering them, then I think the `std::launder` issue seems like a minor detail we can figure out one way or another.

Do you have any thoughts on how I might go about tackling this? It would be really nice to have a cleaner interface to `IntrusiveVariant`, but I'm still just stuck on how to achieve it in a portable way.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98477/new/

https://reviews.llvm.org/D98477



More information about the llvm-commits mailing list