[PATCH] D98477: [ADT] Add IntrusiveVariant

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 24 16:42:16 PDT 2021


scott.linder added inline comments.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:12
+// union members to store the runtime tag at the beginning of the variant
+// alternative itself, allowing for it to be packed more efficiently into bits
+// that would otherwise be used for padding.
----------------
whisperity wrote:
> > Itself
> Who? The `IntrusiveVariant` or the union members?
I tried to reword this a little to clarify, let me know if it helps or if you still see room for improvement.

The "alternative types" wording is lifted from `std::variant`, and that's what I was attempting to refer to here.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:35-36
+//  * The set of alternative types must be unique, and cannot be referred
+//    to by index (although internally an index is materialized to support
+//    efficient dispatch in llvm::visit).
+//  * There is a static limit on the number of alternative types supported,
----------------
whisperity wrote:
> I think the implementation detail in the parens could go somewhere else. If this little bit of trivia shouldn't interest the average client library, it shouldn't be placed into an obvious position.
Good point, I think it can simply be deleted as the comments in `detail` should be sufficient to capture this. I also changed the phrasing "the set of alternative types must be unique" as I don't think it really says what I intended it to. Let me know what you think.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:37
+//    efficient dispatch in llvm::visit).
+//  * There is a static limit on the number of alternative types supported,
+//    as the dispatch in llvm::Visit is switch-based. The idea is that the
----------------
whisperity wrote:
> How much is the limit. Where is it stored? Might need a mention of the actual value (or symbol name) here.
I added `static constexpr` members to `IntrusiveVariant` to describe the limits, and just referred to them everywhere the limits are mentioned.

The only place where this breaks down is in the macro to generate the actual `switch` in the implementation of `llvm::visit` and similar, but there are already other similar caveats and I see no way to overcome them without ditching the approach entirely.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:54
+//  public:
+//    AltInt(int Int) : Int(Int) {}
+//    int getInt() const { return Int; }
----------------
whisperity wrote:
> Is it possible to suggest here writing an implicit conversion operator (`operator int() const { return Int; }`) and an assignment operator (`AltInt& operator=(int Int)`) to the wrapped type? Writing
> 
> ```
> if (V.holdsAlternative<AltDouble>())
> {
>   AltDouble AD = V.get<AltDouble>();
>   takes_a_double(AD.getDouble());
>   AD.setDouble(8.f);
> }
> ```
> 
> is just incredibly superfluous when `std::variant` does this already with the `get<>` and the assignment operator.
> 
> I'm more concerned about the assignment operator. Can we assign a (mutable) reference to the variant's alternative?
> Is it possible to suggest here writing an implicit conversion operator (`operator int() const { return Int; }`) and an assignment operator (`AltInt& operator=(int Int)`) to the wrapped type? Writing
> 

I think in the context of this (admittedly over-simplified) example that makes perfect sense, but the moment you have anything more complex than a wrapper around another type it is not as useful anymore. For example I don't see any opportunity to simplify the following with implicit conversions:

```
if (V.holdsAlternative<Frobinator>()) {
  reticulate(V.get<Frobinator>().getSplines());
  V.get<Frobinator>().setWidget(getCachedWidgets());
}
```

> ```
> if (V.holdsAlternative<AltDouble>())
> {
>   AltDouble AD = V.get<AltDouble>();
>   takes_a_double(AD.getDouble());
>   AD.setDouble(8.f);
> }
> ```
> 
> is just incredibly superfluous when `std::variant` does this already with the `get<>` and the assignment operator.
> 
> I'm more concerned about the assignment operator. Can we assign a (mutable) reference to the variant's alternative?

Yes there is a non-const overload for `IntrusiveVariant::get` that you can assign into if the alternative type provides the appropriate copy/move assignment operator.





================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:59-67
+//  class AltDouble {
+//    DECLARE_INTRUSIVE_ALTERNATIVE
+//    double Double;
+//
+//  public:
+//    AltDouble(double Double) : Double(Double) {}
+//    double getDouble() const { return Double; }
----------------
whisperity wrote:
> I honestly believe for usability perspectives we should offer the user the option not to write all this out by hand. Just a simple wrapper over a double is 9 lines (or 11 if we include the assignment and implicit conversion from the suggestion above).
> 
> This is just waaaaaay too much. If we have like 2 or 3 variations in the variant, we are not much better (if any) than writing out the enum+union ourselves.
> 
> At least please consider providing a simple macro that wraps any alternate type into such a variant-capable thing.
> 
> For example, I'd like to use a variant like as follows. This is how one would use `(std|boost)::variant`, right?
> 
> ```
> struct Something { /* 3 data members */ };
> struct SomeOtherThing { /* 3 other data members of different type! */ };
> 
> 
> Variant<Something, SomeOtherThing> ThisOrThat;
> ```
> 
> Now I understand this doesn't work in your case directly. The whole intrusiveness would mean that I have to deal with either making the initial data member private, and then what happens to my other data members that should (or, must!) be public instead?
> 
> As a user, I am happy to put up with having to write two additional lines, i.e.
> 
> ```
> struct Something { /* 3 data members */ };
> struct SomeOtherThing { /* 3 other data members of different type! */ };
> 
> MAKE_VARIANT_ALTERNATIVE_TYPE(Something);
> MAKE_VARIANT_ALTERNATIVE_TYPE(SomeOtherThing);
> 
> Variant<AltSomething, AltSomeOtherThing> ThisOrThat;
> //      ^~~           ^~~  Might be tricky to figure out!
> ```
> 
> But the idea is that //some// of the boilerplate required to make my types (or wrappers thereof) compatible with a working variant (which I love the idea of, by the way!) should not require me to write almost the same insane amount of code as I would have to with a conventional tagged union.
> 
I agree there is a good chunk of boilerplate, but AFAICT there is no clean way to avoid this, and the real utility of the new type is in providing a type-safe, modern interface to a tagged union which is squeezed into the smallest possible memory footprint. For example instead of:

```
switch (MyTaggedUnion.getKind()) {
case MyTaggedUnion::K_int:
  // use MyTaggedUnion.getInt(), don't accidentally use any other accessors...
  break;
case MyTaggedUnion::K_float:
  // use MyTaggedUnion.getFloat(), don't accidentally use any other accessors...
  break;
case MyTaggedUnion::K_IntPair:
  // use MyTaggedUnion.getFirstInt(), don't accidentally use any other accessors...
  // use MyTaggedUnion.getSecondInt(), don't accidentally use any other accessors...
  break;
}
```

You can write:

```
visit(makeVisitor(
         [](AltInt I) { /* use I.getInt() */},
         [](AltFloat) { /* use I.getFloat() */},
         [](AltIntPair IP) { /* use IP.getFirstInt() and IP.getSecondInt() */}
       ), MyVariant);
```

And at several points things which would be runtime UB in the hand-written tagged-union case are now compile-time errors.


In trying to work through providing a simpler interface I run into problems with each approach. If you are just naively wrapping another type, then in general you are losing out on the optimization the `IntrusiveVariant` was designed around, which is to get more optimal packing when the "wrapped" type has spare padding that the tag could live in. Consider for example:

```
struct Inner { char X; int Y; };
struct Outer { char C; Inner I; };
struct Flat { char C; char X; int Y; };
static_assert(sizeof(Flat) < sizeof(Outer), "");
```

So the only way I see to implement something along the lines of what you suggest is to have a handful of macros to "begin", "add_member", and "end" an alternative:

```
BEGIN_VARIANT_ALTERNATIVE_TYPE(AltSomething)
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeA, A);
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeB, B);
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeC, C);
END_VARIANT_ALTERNATIVE_TYPE()
```

Where the above would expand to a class definition.

This seems less attractive than just having some boilerplate, and it doesn't cover the reasonable cases where a class doesn't want to expose all of its members uniformly.


================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:207-208
+  }
+  template <std::size_t I, typename... ArgTs>
+  UnionImpl(InPlaceIndex<I>, ArgTs &&...Args) {
+    new (&getMember(InPlaceIndex<I>{}))
----------------
whisperity wrote:
> There is a bit too much template "magic" to wrap my head around so quickly, but I have a question: does this implementation allow non-default-constructible alternative types? Or any chance that it forbids default-constructible ones? What's the stance on that. The examples in the leading documentation seem to indicate that all alternatives are non-default-ctorable... However, the "formal" requirements only say they must be trivially-dtorable and standard layout. What's the stance on default constructibleness?
> There is a bit too much template "magic" to wrap my head around so quickly

You and me both :^) I tried to skip as many features as seemed reasonable in an effort to keep the amount of template/preprocessor magic to the bare minimum, but the result is still pretty dense :(

> does this implementation allow non-default-constructible alternative types? Or any chance that it forbids default-constructible ones? What's the stance on that.

This supports both default-constructible and non-default-constructible alternative types. The possibilities are:

The type pack `ArgTs` is empty, in which case it is essentially equivalent to:

```
template <std::size_t I>
UnionImpl(InPlaceIndex<I>) {
```

Which will result in default-constructing the appropriate union member if it is default-constructible, otherwise it will result in a compile-time error indicating overload resolution could not find an appropriate constructor.

Or, the type pack is not empty, and we forward whatever arguments along to the appropriate union member's non-default constructor.


> The examples in the leading documentation seem to indicate that all alternatives are non-default-ctorable

I can add an example of a type that is default-constructible, but there is explicit documentation as part of `IntrusiveVariant` itself that an `IntrusiveVariant` is default-constructible iff the first alternative type is:

```
/// A default constructed IntrusiveVariant holds a default constructed value
/// of its first alternative. Only enabled if the first alternative has a
/// default constructor.
template <int B = std::is_default_constructible<detail::NthType<0, Ts...>>{},
           typename std::enable_if_t<B, int> = 0>
constexpr IntrusiveVariant() : Union(detail::InPlaceIndex<0>{}) {}
```

And otherwise the default constructors are just constructors with no arguments, and you can `emplace` using any alternative type constructor.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98477/new/

https://reviews.llvm.org/D98477



More information about the llvm-commits mailing list