[PATCH] D83088: Introduce CfgTraits abstraction

Mon Aug 17 15:12:26 PDT 2020

dblaikie added a comment.

In D83088#2218559 <https://reviews.llvm.org/D83088#2218559>, @nhaehnle wrote:

> In D83088#2213886 <https://reviews.llvm.org/D83088#2213886>, @dblaikie wrote:
>
>> In D83088#2213864 <https://reviews.llvm.org/D83088#2213864>, @nhaehnle wrote:
>>
>>> In D83088#2213802 <https://reviews.llvm.org/D83088#2213802>, @dblaikie wrote:
>>>
>>>> In D83088#2213797 <https://reviews.llvm.org/D83088#2213797>, @nhaehnle wrote:
>>>>
>>>>> In D83088#2208611 <https://reviews.llvm.org/D83088#2208611>, @dblaikie wrote:
>>>>>
>>>>>> This seems like a strange hybrid between a static-polymorphism (with traits) and dynamic polymorphism (with base classes/virtual functions). Could this more readily be just one or the other? (sounds like you're leaning towards dynamic polymorphism)
>>>>>
>>>>> No, it's very much this way on purpose. The idea is to support the same set of functionality as much as possible in both static **and** dynamic polymorphism.
>>>>
>>>> Could it be implemented statically as a primary interface, with a dynamic wrapper? (eg: a base class, then a derived class template that takes the static CFG type to wrap into the dynamic type) keeping the two concepts more clearly separated?
>>>
>>> That is how it is implemented. CfgTraits is the primary static interface, and then CfgInterface / CfgInterfaceImpl is the dynamic wrapper.
>>
>> Ah, fair enough. The inheritance details in the traits class confused me a bit/I had a hard time following, with all the features being in the one patch. Might be easier separately, but not sure.
>>
>> Would it be possible for this not to use traits - I know @asbirlea and I had trouble with some things using GraphTraits owing to the traits API. An alternative would be to describe a CFGGraph concept (same as a standard container concept, for instance) - where there is a concrete graph object and that object is queried for things like nodes, edges, etc. (actually one of the significant things we tripped over was the API choice to navigate edges from a node itself without any extra state - which meant nodes/edge iteration had to carry state (potentially pointers back to the graph, etc) to be able to manifest their edges - trait or concept could both address this by, for traits, passing the graph as well as the node when querying the trait for edges, or for a concept passing the node back to the graph to query for edges).
>
> So there is a bit of a part here where I may admittedly be a bit confused with the C++ lingo, since I don't actually like template programming that much :)

Not sure that's the best place to be designing this fairly integral and complicated piece of infrastructure from, but hoping we can find some good places/solutions/etc.

> (Which is part of the motivation for this to begin with... so that I can do the later changes in the stack here without *everything* being in templates.)

That concerns me a bit as a motivation - Perhaps the existing GraphTraits template approach could be improved, rather than adding another/separate set of complexity with both dynamic and static dispatch. (eg: containers in the C++ standard library don't support runtime polymorphism (you can't dynamically dispatch over a std::vector versus a std::list, for instance)).

What does/will this Cfg abstraction provide that's separate from the current Graph (provided by GraphTraits) abstraction? Does it provide things other than the ability to write these algorithms as non-templates? (in which case, is the non-dynamic portion of this functionally equivalent to GraphTraits (but more as a concept than a trait, by the sounds of it)?)

> The way the `CfgTraits` is used is that you never use the `CfgTraits` class directly except to inherit from it using CRTP (curiously recurring template pattern).

Side note: Using the same name in multiple namespaces makes this a bit harder to read than it might otherwise be (clang::CfgTraits deriving from llvm::CfgTraits, etc).

So currently you write a MyCfgTraitsBase, deriving from llvm::CfgTraitsBase:

  class MyCfgTraitsBase : public llvm::CfgTraitsBase { ...

then you write MyCfgTraits, which derives from llvm::CfgTraits using both CRTP and MyCfgTraitsBase:

  class MyCfgTraits : public llvm::CfgTraits<MyCfgTraitsBase, MyCfgTraits>

Could this be simplified by moving the MyCfgTraitsBase stuff into MyCfgTraits, and having llvm::CfgTraits with just one template parameter, the derived class?
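
To make the question concrete, here is a rough sketch of the single-parameter shape. `CfgTraitsSketch`, `ToyBlock` and `successors` are names invented for illustration and are not taken from the patch; the real llvm::CfgTraits may need the separate base for reasons this toy doesn't capture (for example, nested types of the derived class are not visible in the base class's member declarations under CRTP, which is why the helper below is itself a template).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::CfgTraits with a single CRTP parameter.
template <typename DerivedT> struct CfgTraitsSketch {
  // Generic helper written once in terms of the derived class. BlockRefT is
  // deduced because DerivedT is still incomplete when this is declared.
  template <typename BlockRefT>
  static std::size_t numSuccessors(BlockRefT block) {
    return DerivedT::successors(block).size();
  }
};

// Toy CFG node, invented for this example.
struct ToyBlock {
  std::vector<ToyBlock *> succs;
};

// Everything that would have lived in MyCfgTraitsBase sits directly here.
struct MyCfgTraits : CfgTraitsSketch<MyCfgTraits> {
  using BlockRef = ToyBlock *;
  static const std::vector<ToyBlock *> &successors(BlockRef block) {
    assert(block != nullptr);
    return block->succs;
  }
};

// Small usage example: an entry block with two successors.
inline std::size_t demoSuccessorCount() {
  ToyBlock a, b, entry;
  entry.succs = {&a, &b};
  return MyCfgTraits::numSuccessors(&entry);
}
```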

> When writing algorithms that want to be generic over the type of CFG, those algorithms then have a derived class of CfgTraits as a template parameter. For example, D83094 <https://reviews.llvm.org/D83094> adds a `GenericCycleInfo<CfgTraitsT>` template class, where the template parameter should be set to e.g. `IrCfgTraits`, if you want cycle info on LLVM IR, or to `MachineCfgTraits`, if you want cycle info on MachineIR. Both of these classes are derived from `CfgTraits`.

Why is it necessary to pass the traits, rather than looking it up via a specialization (or allowing it to be passed explicitly if the user wants to)?
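
Roughly what I have in mind, as a sketch only: the algorithm is keyed on some CFG-related type and finds its traits through a defaulted template parameter, which the user can still override. `CycleInfoSketch`, `ToyBlock` and `ToyCfgTraits` are hypothetical names; only `CfgTraitsFor` comes from the patch, and its use here is an assumption about how it could be wired up.

```cpp
#include <type_traits>

// Declared only: a type without a specialization fails to compile on use.
template <typename CfgRelatedTypeT> struct CfgTraitsFor;

struct ToyBlock {};      // hypothetical CFG node type
struct ToyCfgTraits {    // hypothetical traits class for it
  using BlockRef = ToyBlock *;
};
struct OtherCfgTraits {  // hypothetical alternative the user may pass
  using BlockRef = ToyBlock *;
};

template <> struct CfgTraitsFor<ToyBlock> {
  using CfgTraits = ToyCfgTraits;
};

// Traits are looked up by default but can be supplied explicitly.
template <typename BlockT,
          typename CfgTraitsT = typename CfgTraitsFor<BlockT>::CfgTraits>
struct CycleInfoSketch {
  using Traits = CfgTraitsT;
};

static_assert(
    std::is_same<CycleInfoSketch<ToyBlock>::Traits, ToyCfgTraits>::value,
    "default lookup goes through the CfgTraitsFor specialization");
static_assert(
    std::is_same<CycleInfoSketch<ToyBlock, OtherCfgTraits>::Traits,
                 OtherCfgTraits>::value,
    "an explicit argument overrides the lookup");
```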

> It is definitely different from how `GraphTraits` works, which you use it as `GraphTraits<NodeType>`, and then `GraphTraits<BasicBlock *>` etc. are specialized implementations. If `GraphTraits` worked the way that `CfgTraits` works, then we'd instead have classes like `BasicBlockGraphTraits`.
>
> So to sum it up, all this sounds a bit to me like maybe calling `CfgTraits` "traits" is wrong? Is that what you're saying here?

Hmm, don't think so - as I look at it more. It still seems like a traits class - it has all static members, a bunch of typedefs. And those members/types are used to interact with/probe some other object. The fact you can't look up the traits of a given type T certainly makes this a bit quirky/outside the more usual model.

> You can't just call it `Cfg` though, because it's *not* a CFG -- it's a kind of interface to a CFG which is designed for static polymorphism, unlike `CfgInterface` which is designed for dynamic polymorphism. Getting the names right is important, unfortunately I admit that I'm a bit lost there. "Traits" seemed like the closest thing to what I want, but I'm definitely open to suggestions.

Let's take a specific example then - Clang's CFG and LLVM's IR CFG. What if both those classes had a common API using exactly the same identifiers, typedefs, etc? That's what I mean by a non-traits-based solution. Much like std::vector and std::list have the same API (you can iterate over them using the same functions, etc - yes, only in other templates).
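
A toy illustration of that idea (all names here are made up, and neither type resembles the real Clang or LLVM classes): two unrelated graph types expose the same identifiers, edges are queried from the graph object rather than from the node, and one template works over both.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two unrelated, hypothetical CFG types with the same API surface.
struct IrLikeCfg {
  using BlockRef = std::size_t;
  std::vector<std::vector<BlockRef>> succs;
  std::size_t size() const { return succs.size(); }
  BlockRef entry() const { return 0; }
  // Edges are queried from the graph, so nodes carry no back-pointers.
  const std::vector<BlockRef> &successors(BlockRef b) const { return succs[b]; }
};

struct AstLikeCfg {
  using BlockRef = std::size_t;
  std::vector<std::vector<BlockRef>> edges;
  std::size_t size() const { return edges.size(); }
  BlockRef entry() const { return 0; }
  const std::vector<BlockRef> &successors(BlockRef b) const { return edges[b]; }
};

// One algorithm written against the shared concept, no traits involved.
template <typename CfgT> std::size_t countReachable(const CfgT &cfg) {
  assert(cfg.size() > 0);
  std::vector<bool> seen(cfg.size(), false);
  std::vector<typename CfgT::BlockRef> worklist{cfg.entry()};
  seen[cfg.entry()] = true;
  std::size_t count = 0;
  while (!worklist.empty()) {
    typename CfgT::BlockRef block = worklist.back();
    worklist.pop_back();
    ++count;
    for (typename CfgT::BlockRef succ : cfg.successors(block)) {
      if (!seen[succ]) {
        seen[succ] = true;
        worklist.push_back(succ);
      }
    }
  }
  return count;
}
```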

But I guess coming back to the original/broader design: What problems is this intended to solve? The inability to write non-template algorithms over graphs? What cost does that come with? Are there algorithms that are a bit too complicated/unwieldy when done as templates? 
If it's specifically the static/dynamic dispatch issue - I'm not sure the type erasure and runtime overhead are worth the tradeoff here, though if they are - it'd be good to keep the non-dynamic version common, rather than now having GraphTraits and CfgTraits done a bit differently, etc.
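
For reference, the layering under discussion can be sketched like this: a static interface used directly by templates, plus a thin type-erased wrapper so that some algorithms can be non-templates at the cost of a virtual call per query. `ToyGraph`, `GraphInterface` and `GraphInterfaceImpl` are hypothetical names chosen to mirror CfgInterface / CfgInterfaceImpl; this is not the patch's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Static interface: a hypothetical graph type queried directly.
struct ToyGraph {
  std::vector<std::vector<std::size_t>> succs;
  std::size_t numSuccessors(std::size_t block) const {
    assert(block < succs.size());
    return succs[block].size();
  }
};

// Dynamic interface: erases the graph type behind virtual functions.
class GraphInterface {
public:
  virtual ~GraphInterface() = default;
  virtual std::size_t numSuccessors(std::size_t block) const = 0;
};

// Wrapper template that adapts any static graph type to the dynamic one.
template <typename GraphT>
class GraphInterfaceImpl final : public GraphInterface {
  const GraphT &graph;

public:
  explicit GraphInterfaceImpl(const GraphT &g) : graph(g) {}
  std::size_t numSuccessors(std::size_t block) const override {
    return graph.numSuccessors(block);
  }
};

// A non-template algorithm; every query is a virtual call.
inline std::size_t totalEdges(const GraphInterface &g, std::size_t numBlocks) {
  std::size_t total = 0;
  for (std::size_t b = 0; b != numBlocks; ++b)
    total += g.numSuccessors(b);
  return total;
}
```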

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:51
+
+  operator bool() const { return ptr != nullptr; }
+
----------------
`operator bool` should be `explicit`
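
A reduced sketch of why (`OpaqueHandle` is a made-up stand-in, not the class in the patch): with `explicit`, the handle still works in `if`, `!` and `&&`, but no longer converts silently to `int` or other arithmetic types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical reduction of the opaque handle under review.
class OpaqueHandle {
  void *ptr = nullptr;

public:
  OpaqueHandle() = default;
  explicit OpaqueHandle(void *p) : ptr(p) {}
  void *get() const { return ptr; }
  explicit operator bool() const { return ptr != nullptr; }
};

inline bool isSet(OpaqueHandle h) {
  // Contextual conversion to bool is still allowed with `explicit`.
  if (h)
    return true;
  return false;
}

// Accidental arithmetic conversions such as `int n = handle;` are rejected.
static_assert(!std::is_convertible<OpaqueHandle, int>::value,
              "no implicit conversion to int");
static_assert(std::is_constructible<bool, OpaqueHandle>::value,
              "direct conversion to bool still works");
```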

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:53-54
+
+  bool operator==(CfgOpaqueType other) const { return ptr == other.ptr; }
+  bool operator!=(CfgOpaqueType other) const { return ptr != other.ptr; }
+};
----------------
Preferably make any operator overload that can be a non-member, a non-member - this ensures equal conversion handling on both the left and right hand side of symmetric operators like these. (they can be friends if needed, but doesn't look like it in this case - non-friend, non-members that call get() should be fine here)
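
Sketch of the non-member form. `Handle` is hypothetical, and its implicit constructor from `void *` is an assumption added purely to show the symmetry; I haven't checked whether CfgOpaqueType has any implicit conversions.

```cpp
#include <cassert>

// Hypothetical handle type with an implicit converting constructor.
class Handle {
  void *ptr = nullptr;

public:
  Handle() = default;
  Handle(void *p) : ptr(p) {} // implicit on purpose, for illustration
  void *get() const { return ptr; }
};

// Non-friend non-members: both operands get the same conversion treatment.
inline bool operator==(Handle lhs, Handle rhs) { return lhs.get() == rhs.get(); }
inline bool operator!=(Handle lhs, Handle rhs) { return !(lhs == rhs); }

inline void checkSymmetry(void *p) {
  Handle h(p);
  assert(h == p);    // would also compile with a member operator==
  assert(p == h);    // only compiles because operator== is a non-member
  assert(!(p != h));
}
```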

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:90
+/// operations such as traversal of the CFG.
+class CfgTraitsBase {
+protected:
----------------
Not sure if this benefits from being inherited from, versus being freely accessible?

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:271-273
+template <typename CfgRelatedTypeT> struct CfgTraitsFor {
+  // using CfgTraits = ...;
+};
----------------
This probably shouldn't be defined if it's only needed for specialization; instead it can be declared:
```
template<typename CfgRelatedTypeT> struct CfgTraitsFor;
```
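
One side effect of declaring without defining, sketched under assumptions: a missing specialization becomes an incomplete type, so misuse is a compile error and can even be detected. `HasCfgTraits`, `ToyBlock` and `ToyCfgTraits` are hypothetical helpers for this example only.

```cpp
#include <type_traits>

// Declaration only, as suggested above.
template <typename CfgRelatedTypeT> struct CfgTraitsFor;

struct ToyBlock {};     // hypothetical CFG-related type
struct ToyCfgTraits {}; // hypothetical traits class

template <> struct CfgTraitsFor<ToyBlock> {
  using CfgTraits = ToyCfgTraits;
};

// sizeof on an incomplete type is ill-formed, so the partial specialization
// drops out (SFINAE) when no CfgTraitsFor specialization exists.
template <typename T, typename = void>
struct HasCfgTraits : std::false_type {};
template <typename T>
struct HasCfgTraits<T, decltype(void(sizeof(CfgTraitsFor<T>)))>
    : std::true_type {};

static_assert(HasCfgTraits<ToyBlock>::value, "specialized, so complete");
static_assert(!HasCfgTraits<int>::value, "unspecialized, so incomplete");
```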

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:287
+public:
+  virtual ~CfgInterface() {}
+
----------------
prefer `= default` where possible
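
For example (a sketch with hypothetical `Interface` / `Impl` names, not the patch's classes):

```cpp
#include <cassert>
#include <memory>

// Hypothetical abstract interface with a defaulted virtual destructor.
class Interface {
public:
  virtual ~Interface() = default; // instead of `{}`
  virtual int numBlocks() const = 0;
};

class Impl final : public Interface {
  int count;

public:
  explicit Impl(int n) : count(n) {}
  int numBlocks() const override { return count; }
};

// Destroying through the base pointer still runs Impl's destructor.
inline std::unique_ptr<Interface> makeInterface(int n) {
  assert(n >= 0);
  return std::unique_ptr<Interface>(new Impl(n));
}
```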

================
Comment at: llvm/include/llvm/Support/CfgTraits.h:337
+    return Printable(
+        [this, block](raw_ostream &out) { printBlockName(out, block); });
+  }
----------------
generally capture everything by ref `[&]` if the lambda is only used locally/within the same expression or block
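
The "only used locally" condition matters, so here is a small sketch of both sides using made-up helpers (`countAbove`, `makeAbove`): `[&]` when the lambda cannot outlive the enclosing expression, by-value capture when the closure is returned or stored and would otherwise hold dangling references.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Local use: the lambda is gone before `limit` is, so [&] is safe.
inline std::ptrdiff_t countAbove(const std::vector<int> &values, int limit) {
  return std::count_if(values.begin(), values.end(),
                       [&](int v) { return v > limit; });
}

// Escaping use: the closure outlives this call, so `limit` is copied.
inline std::function<bool(int)> makeAbove(int limit) {
  return [limit](int v) { return v > limit; };
}

inline void demoCaptures() {
  assert(countAbove({1, 5, 9}, 4) == 2);
  assert(makeAbove(4)(5));
}
```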

================
Comment at: mlir/include/mlir/IR/Dominance.h:33-34
+
+class CfgTraits : public llvm::CfgTraits<CfgTraitsBase, CfgTraits> {
+public:
+  static Region *getBlockParent(Block *block) { return block->getParent(); }
----------------
if something inherits publicly and declares all members public, I'd usually use "struct" and omit the "public"s.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83088/new/

https://reviews.llvm.org/D83088