[PATCH] D98477: [ADT] Add IntrusiveVariant

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 2 15:42:12 PDT 2021


scott.linder added inline comments.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:159-166
+/// A helper macro to add the declarations needed to use a type as an
+/// alternative for IntrusiveVariant. Must be the first declaration of the
+/// class.
+#define DECLARE_INTRUSIVE_ALTERNATIVE                                          \
+  ::llvm::IntrusiveVariantTag IntrusiveVariantTagMember;                       \
+  template <typename...> friend class ::llvm::IntrusiveVariant;                \
+  template <std::size_t, typename, typename...>                                \
----------------
dblaikie wrote:
> Can we instead use/generalize something like the type traits used for PointerUnion that declare the number of spare bits in the type? Since we already have those in LLVM (& to macro-y).
My understanding of the `PointerUnion` case is that the standard allows the bitwise manipulation of pointer representations so long as you jump through the appropriate casting hoops (i.e. to `(u)intptr_t` and back) and ensure you end up with the same representation before casting back. For objects of arbitrary standard-layout type I don't know that we have any neat way to ensure we don't invoke UB, as the "padding" bits in the object representation are not guaranteed to be preserved over most uses of the struct, AFAIK.

That said, if I'm just missing something I would be very happy to remove `DECLARE_INTRUSIVE_ALTERNATIVE`. I currently see it as an acceptable concession needed to meet the requirements of the "common initial sequence" rule, which seems like the only way to get the compact layout without invoking UB.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:276-280
+// case limbs still have to resolve to some overload, but we won't have an
+// overload available for the cases where the index is too large (i.e. we want
+// there to be no such overload, so we get compile-time errors for those cases
+// everywhere else). As a workaround, we make the body of the out-of-bounds
+// overloads be llvm_unreachable.
----------------
dblaikie wrote:
> Not sure I follow why this ends up as a runtime error - what if the overload with unreachable (down at 307) wasn't defined? Wouldn't that produce a compile-time error only when the index is out of bounds?
AFAICT the only way to get that behavior is to have a separate overload for every size of `IntrusiveVariant` we support, as there is no way to recursively generate the switch itself, and if we just declare a switch with some arbitrary number of `case` limbs (in my patch this is 32) then it is a compile-time error to use any `IntrusiveVariant` with fewer than 32 alternatives.

In the simplest case:

```
visit(Callable C, IntrusiveVariant<A> IV) { // i.e. toy example of a monomorphized visit for a variant with a single alternative type A
  switch (IV.index()) {
  case 0: C(IV.Union.getMember(InPlaceIndex{0})); break;
  case 1: C(IV.Union.getMember(InPlaceIndex{1})); break; // This does not resolve to any overload of getMember, and is a compile-time error.
  }
}
```

In the above, if we only wrote one template for `visit` and it has two `case` limbs in the `switch`, then it will only compile for variants with exactly two alternative types. If we could recursively generate the body of `visit` we could do this without the `unreachable` workaround, but I don't see how to.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:327-343
+// Macro for writing the 32-case switch statement from a definition of CASE(N)
+#define CASES8(B)                                                              \
+  CASE((B + 0))                                                                \
+  CASE((B + 1))                                                                \
+  CASE((B + 2))                                                                \
+  CASE((B + 3))                                                                \
+  CASE((B + 4))                                                                \
----------------
dblaikie wrote:
> There's a fair bit of macro expansion stuff here - any chance it can be restructured to use variadic templates to generalize better/reduce the macro usage?
I think this is getting at the same issue as above, and I don't see any way to generate a `switch` with a compile-time determined number of `case` limbs using variadic templates :/


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:429-432
+  union {
+    detail::CommonInitialSequenceT CommonInitialSequence;
+    UnionT Union;
+  };
----------------
dblaikie wrote:
> Technically C++ doesn't allow access to a common initial sequence except for types with standard layout, I think? So types with non-trivial copy ops, etc, wouldn't be usable here. (I wonder what PointerUnion does? I guess it just stuffs everything in void* or uintptr_t, which isn't exactly applicable here - but maybe some aligned storage would be suitable here)
Yes, that's my understanding too, and I do `static_assert` the `standard-layout`-ness of all alternative types and mention this restriction in the file header comment.

> maybe some aligned storage would be suitable here

I think this is the point I may be missing up above, but if I understand what you're getting at the idea would be to basically:

* Allocate an array of `char` with the maximum alignment of any of the alternative types
* If possible, put our index tag in any padding left over after the value representation of all the alternative types (as I mentioned above, this is the part I'm unsure about concerning UB)
* Manage all of the construction, copying, moving, destruction of alternative types in the array of `char`

If there is a way to do this without relying on UB, then it seems reasonable to me. I do worry it would end up being a lot of code, in particular I think we would need to do something like the `libcxx` implementation of `std::variant` to support all of the possible combinations of levels of trivial-ness of construction, copying, destruction, etc. which was something I wanted to avoid doing purely to make the code simpler.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98477/new/

https://reviews.llvm.org/D98477



More information about the llvm-commits mailing list