[PATCH] D48807: Add llvm::Any

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 9 18:46:10 PDT 2018


dblaikie added a comment.

In https://reviews.llvm.org/D48807#1150245, @zturner wrote:

> Repeating my previous reply since it was through email and didn't make it onto Phab.
>
> In https://reviews.llvm.org/D48807#1150029, @dblaikie wrote:
>
> > In https://reviews.llvm.org/D48807#1149775, @zturner wrote:
> >
> > > I've wanted this for quite a long time.  My original motivation for this was back when i was working primarily in LLDB.  There's a lot of places where `void*` is passed around as a generic parameter to a user callback.  So, for example, you say `DoSomethingAsynchronously(Callback, Param)` and when the thing is finished, it invokes `Callback(Param)`.  There are obviously better ways to do this if you're designing this from the beginning, but it's not always easy to retrofit that into an existing design.  So I think several places in LLDB could be cleaned up by replacing some `void*` with `Any`.  Note those places still exist today, so `Any` could be useful there right now.
> >
> >
> > Would it be expensive to reimplement those APIs using stateful functors while providing the backwards compatibility with something like:
> >
> >   void register(void (*f)(void*), void *v) { register([f, v] { f(v); }); }
> >   
> >
> > Or is there something in the implementation (rather than the existing usage) that makes migrating difficult?
>
>
> It’s been a while since I was close to any of this code so it’s hard to say
>
> > 
> > 
> >> More recently while designing a basic API for a tracing library, I've gravitated towards a design where the library generates events (represented by some structure) depending on what happens.  But since tracing is heavily platform dependent, there's only so much of this that you can abstract away into a common structure and there will be some additional pieces that only make sense on the particular platform.  So you call:
> >> 
> >>   unique_ptr<EventBase> Event = llvm::trace::WaitForEvent();
> >>   // Do something depending on what kind of event it is.
> >>   llvm::trace::ContinueFromEvent(std::move(Event));
> >> 
> >> 
> >> In order for every possible platform to implement this correctly, there may need to be some platform-specific state maintained between the call to Wait and the call to Continue.  Linux might need to pass the exact signal value and parameters.  Windows might need to pass the exact `EXCEPTION_RECORD` structure <https://docs.microsoft.com/en-us/windows/desktop/api/winnt/ns-winnt-_exception_record> that the OS filled out for us right before we return from `WaitForEvent()`.  It's useful to be able to put that into an `Any`, so that we have something like:
> >> 
> >>   class EventBase {
> >>     TraceEventType Type;
> >>     uint64_t ThreadId;
> >>     uint64_t ProcessId;
> >>     Any PlatformPiece;
> >>   };
> >>   
> >>   class ThreadCreatedEvent : public EventBase {
> >>     void *EntryPoint;
> >>     void *TlsArea;
> >>     // Etc.
> >>   };
> >>   
> >>   class BreakpointEvent : public EventBase {
> >>     uint64_t Address;
> >>   };
> >> 
> >> 
> >> and so forth and so on.  When creating this struct before returning from `WaitForEvent()`, the library can set `PlatformPiece` to some structure that would contain any platform-specific state needed to properly continue.  I prefer not to use inheritance and `llvm::cast` here (e.g. `WindowsBreakpointEvent`, `PosixBreakpointEvent`, etc) because it just leads to inheritance hierarchy explosion and ultimately it's no more type-safe than `any` (someone still has to make an assumption about exactly what type it is).
> > 
> > If this is host-platform specific, why would this need to be runtime variable at all? Wouldn't the PlatformPiece be hardcoded to some specific layout based on the host platform the compiler was built for/running on?
>
> Not necessarily.  For example you may have an x64 build of llvm tracing a 32 bit process.


Ah, good point.

>   This can arise on Windows (wow64) as well as MachO (universal binaries).  Maybe other platforms as well.  You could probably narrow it to a variant, but we don’t have variant in llvm either and it’s much harder to implement and im not sure it buys much.

I'd say variant buys a fair bit in terms of type safety/self-documenting code. I worry about both variant and any in terms of what they do to type safety/contracts between different pieces of code, hence my pushback on adding these kinds of constructs to LLVM.

> Also, in a system where events are passed around and potentially held onto by user code for a long time, it’s useful to have a generic “Tag” field that the user can attach arbitrary data to. This is another potential use case which I didn’t mention before.

I'm still having a little trouble understanding what code is going to consume these events (& how's it going to know/support the different kinds of things in them) given how arbitrary (no common interface, etc) they may be?

>   these 2-3 use cases aside, given that it’s in c++17, seems like filling as many of the missing gaps as possible between our support libraries and c++17 can only be a good thing

If we were working with C++17 I'd push back in the same way against adding usage of these types to LLVM code for the same concerns (type safety, self documenting code, etc). Much like LLVM not using exceptions, iostreams, etc - not everything in the language is suitable for every codebase.



================
Comment at: llvm/include/llvm/ADT/Any.h:25
+
+template <typename T> struct TypeId { static const char Id = 0; };
+
----------------
I think an out of line definition of Id is necessary here, maybe?

(also you can potentially do `static const void *Id;` + `const void *TypeId::Id = &TypeId::Id;` basically)

& I'd probably make TypeId a private implementation detail of 'Any'?


================
Comment at: llvm/include/llvm/ADT/Any.h:29
+  struct StorageBase {
+    virtual ~StorageBase() {}
+    virtual std::unique_ptr<StorageBase> clone() const = 0;
----------------
  = default


================
Comment at: llvm/include/llvm/ADT/Any.h:30
+    virtual ~StorageBase() {}
+    virtual std::unique_ptr<StorageBase> clone() const = 0;
+    virtual const void *id() const = 0;
----------------
Could probably add a cloneable CRTP helper some day - no doubt there are other hierarchies that could use that.


================
Comment at: llvm/include/llvm/ADT/Any.h:34-35
+
+  template <typename T> class StorageImpl : public StorageBase {
+  public:
+    explicit StorageImpl(const T &Value) : Value(Value) {}
----------------
Feels like you maybe might as well use the 'struct' keyword here too - since everything's public, except the deleted operators (which don't particularly benefit from being private anyway - so probably make them public too)


================
Comment at: llvm/include/llvm/ADT/Any.h:54
+public:
+  Any() {}
+
----------------
  = default


================
Comment at: llvm/include/llvm/ADT/Any.h:76-85
+  Any &operator=(const Any &Other) {
+    Any(Other).swap(*this);
+    return *this;
+  }
+
+  Any &operator=(Any &&Other) {
+    Storage = std::move(Other.Storage);
----------------
You could collapse these into one operator that takes Any by value if you like.

  Any &operator=(Any Other) {
    Storage = std::move(Other.Storage);
    return *this;
  }


================
Comment at: llvm/include/llvm/ADT/Any.h:91-93
+  bool empty() const { return !Storage; }
+
+  void clear() { Storage.reset(); }
----------------
Might as well match std::any and have these be called has_value() and reset()?


================
Comment at: llvm/include/llvm/ADT/Any.h:108
+  using DecayedT =
+      typename std::remove_cv<typename std::decay<const T>::type>::type;
+  if (!Value.Storage)
----------------
Why add the 'const' here - if cv qualifiers are being removed anyway? Maybe I'm missing something about how std::decay/other parts of this are working?

(also std::decay says it removes cv qualifiers - so is it necessary to then explicitly remove them after that?)


================
Comment at: llvm/include/llvm/ADT/Any.h:114-135
+template <class T> T any_cast(const Any &Value) {
+  using DecayedT =
+      typename std::remove_cv<typename std::decay<const T>::type>::type;
+  assert(any_isa<T>(Value) && "Bad any cast!");
+  return static_cast<Any::StorageImpl<DecayedT> *>(Value.Storage.get())->Value;
+}
+
----------------
Implement these in terms of the pointer variants, perhaps? (to reduce duplication)


================
Comment at: llvm/include/llvm/ADT/Any.h:133-134
+  return std::move(static_cast<Any::StorageImpl<DecayedT> *>(
+                       Any(std::move(Value)).Storage.get())
+                       ->Value);
+}
----------------
Is this the right behavior according to the standard version of std::any? (should the passed in Any be left in the empty state, or in the non-empty state with a moved-from value? It looks like maybe the latter)


================
Comment at: llvm/unittests/ADT/AnyTest.cpp:120-122
+  EXPECT_EQ(7, *llvm::any_cast<int>(&A));
+  *llvm::any_cast<int>(&A) = 42;
+  EXPECT_EQ(42, *llvm::any_cast<int>(&A));
----------------
Maybe create a named local variable for the pointer here - to demonstrate that this code is valid:

  int *i = llvm::any_cast<int>(&A);
  *i = 42;
  ...

The test as-is would pass even if any_cast returned a proxy object, which I would guess we don't want to allow (& instead want to allow/guarantee that users can write code that keeps pointers, etc).


https://reviews.llvm.org/D48807





More information about the llvm-commits mailing list