[PATCH] D98477: [ADT] Add IntrusiveVariant, in_place_index, and in_place_type

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 24 09:48:53 PDT 2021


scott.linder added a comment.

In D98477#2775242 <https://reviews.llvm.org/D98477#2775242>, @dblaikie wrote:

> Could you reupload this (or maybe you can handle this just by editing the revision here on the web UI?) using arc/phab's ability to flag dependent patches/part of a series, hopefully that'll make it easier to apply the patch to test things? (tried just applying this patch and of course `no template named 'TypeAtIndex'`)

Yes, I had meant to from the start, I think it should be correct now if you want to try again



================
Comment at: llvm/include/llvm/ADT/IntrusiveVariant.h:276-280
+// case limbs still have to resolve to some overload, but we won't have an
+// overload available for the cases where the index is too large (i.e. we want
+// there to be no such overload, so we get compile-time errors for those cases
+// everywhere else). As a workaround, we make the body of the out-of-bounds
+// overloads be llvm_unreachable.
----------------
dblaikie wrote:
> scott.linder wrote:
> > dblaikie wrote:
> > > scott.linder wrote:
> > > > dblaikie wrote:
> > > > > Not sure I follow why this ends up as a runtime error - what if the overload with unreachable (down at 307) wasn't defined? Wouldn't that produce a compile-time error only when the index is out of bounds?
> > > > AFAICT the only way to get that behavior is to have a separate overload for every size of `IntrusiveVariant` we support, as there is no way to recursively generate the switch itself, and if we just declare a switch with some arbitrary number of `case` limbs (in my patch this is 32) then it is a compile-time error to use any `IntrusiveVariant` with fewer than 32 alternatives.
> > > > 
> > > > In the simplest case:
> > > > 
> > > > ```
> > > > visit(Callable C, IntrusiveVariant<A> IV) { // i.e. toy example of a monomorphized visit for a variant with a single alternative type A
> > > >   switch (IV.index()) {
> > > >   case 0: C(IV.Union.getMember(InPlaceIndex{0})); break;
> > > >   case 1: C(IV.Union.getMember(InPlaceIndex{1})); break; // This does not resolve to any overload of getMember, and is a compile-time error.
> > > >   }
> > > > }
> > > > ```
> > > > 
> > > > In the above, if we only wrote one template for `visit` and it has two `case` limbs in the `switch`, then it will only compile for variants with exactly two alternative types. If we could recursively generate the body of `visit` we could do this without the `unreachable` workaround, but I don't see how to.
> > > I've missed a step in the logic here.
> > > 
> > > I'm still trying to wrap my head around: What happens if the function defined below with llvm_unreachable was instead not defined at all? Where is the code that's creating a dead call to this code such that it needs to be defined, but is guaranteed never to be called?
> > > 
> > > Separately, as for the switches: Presumably some recursion could be used instead. Yeah, it doesn't scale well especially at -O0 codegen.
> > > 
> > > Oh, maybe an array of function pointers could be used - a function template templated on the index? Would avoid the recursion.
> > > What happens if the function defined below with llvm_unreachable was instead not defined at all?
> > 
> > In short, it is a compile-time error to use any `IntrusiveVariant` without //exactly// `MaxNumberOfAlternatives` alternative types. In effect, the compiler sees exactly one `switch` with exactly `MaxNumberOfAlternatives` non-default cases, each of which calls `FallibleIntrusiveVariantVisitor::operator()`. If the `IntrusiveVariant` has fewer than `MaxNumberOfAlternatives`, some of those calls do not resolve to any overload and the compiler complains.
> > 
> > More generally, with the approach I'm taking, when calling `visit` defined with `N` non-default `switch` cases on an `IntrusiveVariant` with `M` alternative types:
> > 
> > * If `N == M`, each alternative type corresponds to exactly one case, and exactly one meaningful overload of `FallibleIntrusiveVariantVisitor::operator()`.
> > * If `N < M`, the first `M` alternative types correspond to exactly one case, and exactly one meaningful overload of `FallibleIntrusiveVariantVisitor::operator()`. The remaining `M - N` cases correspond to the `default:` case, and we hit a runtime error. This should be a compile time error, which I manually enforce with `static_assert(sizeof...(Ts) <= MaxNumberOfAlternatives)`; this isn't particularly pretty, but with the simplest switch-based approach I don't know how to achieve anything better.
> > * If `N > M`, the first `M` alternative types correspond to exactly one case, and exactly one meaningful overload of `FallibleIntrusiveVariantVisitor::operator()`. However, there are `N - M` //additional// cases in the switch, which call "invalid" or "non-meaningful" overloads of `FallibleIntrusiveVariantVisitor::operator()`. We must provide a definition of these functions to satisfy the compiler/linker, but by construction we ensure they can never actually be called at runtime. To try to convey this to the compiler, we define them to call `llvm_unreachable`.
> > 
> > > Separately, as for the switches: Presumably some recursion could be used instead. Yeah, it doesn't scale well especially at -O0 codegen.
> > > 
> > > Oh, maybe an array of function pointers could be used - a function template templated on the index? Would avoid the recursion.
> > 
> > Both of these are definitely viable, and have different tradeoffs. I landed on switch-based because:
> > 
> > * It demands the least amount of work by the compiler to be efficient at runtime. As you mention, this matters most at -O0
> > * It demands the least amount of work by the compiler to produce a compact binary. This is (or at least was) apparently an issue in some real world cases with e.g. libc++'s array-of-function-pointer based approach, resulting in pretty significant code-bloat. It has led to multiple attempts to incorporate a `switch`-based approach into libc++'s implementation of `visit`.
> > * It seemed (definitely subjectively) like the simplest to write and read. The need for the "fallible" adapter might negate some of this, though, so I'm not so certain this is true anymore.
> > 
> > The current state of this in libc++ seems to be that the `switch`-based approach has been landed and reverted multiple times, each time the revert noting the general approach is still sound, but some issue with the exact implementation was identified and could not be fixed in a short enough window to avoid reverting. The most recent revert is https://reviews.llvm.org/D91662
> > 
> > The fix to this in my patch is to simply not implement the full generality, and have some relatively small limit on the number of alternative type supported. That way the code generated scales linearly with the number of unique `IntrusiveVariant` types instantiated, and has a pretty small constant factor (a 32-case switch where each branch is a call). At higher optimization levels I would expect all of this to be pretty routinely optimized away, but I haven't actually verified that.
> > 
> > Maybe @mpark, @ldionne, @EricWF have some guidance on what the best implementation strategy for an LLVM-specific `visit` might be?
> Yeah, would be happy to hear from the libc++ folks - though, for what it's worth I think maybe the array of function pointers might be simpler (once the patch series is updated so it's easier to apply for testing, I'll poke around with a few options) for our purposes without having to necessarily have the same priorities a highly general library like libc++ has.
That's fair, thinking about it again the fact that we are not intending to support `visit`ing multiple variants means the array will be 1-dimensional, which should make it easier to read and reduce the code-bloat in the unoptimized case. I can also draft an implementation of that


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98477/new/

https://reviews.llvm.org/D98477



More information about the llvm-commits mailing list