[PATCH] D83088: Introduce CfgTraits abstraction

Fri Aug 21 12:58:22 PDT 2020

dblaikie added a comment.

(side note: this code review is a bit hard to follow with all the linting messages about naming - might be a bit more readable if it conformed to the naming conventions?)

In D83088#2227151 <https://reviews.llvm.org/D83088#2227151>, @nhaehnle wrote:

> In D83088#2225415 <https://reviews.llvm.org/D83088#2225415>, @dblaikie wrote:
>
>>>> But I guess coming back to the original/broader design: What problems is this intended to solve? The inability to write non-template algorithms over graphs? What cost does that come with? Are there algorithms that are a bit too complicated/unwieldy when done as templates? 
>>>> If it's specifically the static/dynamic dispatch issue - I'm not sure the type erasure and runtime overhead may be worth the tradeoff here, though if it is - it'd be good to keep the non-dynamic version common, rather than now having GraphTraits and CfgTraits done a bit differently, etc.
>>>
>>> It's not just over graphs, but taking SSA values into account as well -- that is the key distinction between GraphTraits and CfgTraits.
>>
>> Not sure I follow - could you give an example of a graph where the GraphTraits concept of the Graph and the CfgTraits concept of the graph (or, perhaps more importantly - features of the graph/API surface area/properties you can expose through the CFG API/concept/thing but not through GraphTraits?
>
> See below.
>
>>> The most immediate problem is divergence analysis, which is extremely complex and difficult to get right. If I had tried to fight the accidental complexity that comes with attempting to write such an algorithm as C++ templates in addition to the inherent complexity of the algorithm at the same time, I'm not sure I would have been able to produce anything workable at all.
>>>
>>> Frankly, I suspect that our dominator tree implementation also suffer because of this, though at least dominator trees are much more well studied in the academic literature, so that helps keep the inherent complexity under control.
>>
>> I'm totally open to discussing making APIs more usable, for sure - though I'm thinking it's likely a concept (like containers in the C++ standard library) might be the better direction.
>>
>> Perhaps some code samples showing how one would interact (probably not whole algorithms - maybe something simple like generating a dot diagram for a graph) with these things given different APIs (traits, concepts, and runtime polymorphism) - and implementations of each kind too.
>
> Take a look here for example: https://github.com/nhaehnle/llvm-project/blob/715450fa7f968ceefaf9c3b04b47066866c97206/llvm/lib/Analysis/GenericConvergenceUtils.cpp#L499 -- this is obviously still fairly simple, but it's an example of printing out the results of an analysis in a way that's generic over the underlying CFG and SSA form.

I'm having trouble following this example - I'm not sure what the CfgPrinter abstraction is/why it's first-class, and why this "print" function is calling what look like mutation operations like "appendBlocks". I guess perhaps the question is - what's it printing from and what's it printing to?

Ah, I see, the "append" functions are accessors, of a sort. Returning a container might be more clear than using an out parameter - alternatively, a functor parameter (ala std::for_each) that is called for each element, that can then be used to populate an existing container if desired, or to do immediate processing without the need for an intermediate container.

Though the printer abstraction still strikes me as a bit strange - especially since it doesn't seem to be printing itself. This function was passed a printer and a stream - the printer prints to the stream (perhaps it'd make more sense for the printer to take the stream on construction) and the function isn't passed the thing to print at all - that thing is accessed from the printer. That seems fairly awkward to me - I'd expect a printing operation to take a thing to be printed and a thing to print to.

Perhaps setting aside the complexities of printing things - could you provide an example of code, given a CfgGraph, that walks the graph - perhaps just numbering the nodes/edges/etc to produce a dot graph? Showing what the code would look like if it were passed a GraphTraits-implementing graph, a static polymorphic CfgGraph, and a dynamically polymorphic GfgGraph - and also showing what would be fandemantally possible with the CfgGraph that wouldn't be possible with GraphTraits, if any such things exist (it's still unclear to me whether CfgGraph has abstractions that don't exist in GraphTraits (eg: could you write a CfgGraph over GraphTraits? or would that be impossible because GraphTraits is missing concepts/CfgGraph doesn't apply to all GraphTraits-grahs? what subset of GraphTraits graphs does CfgGraph cover?).

> A statically polymorphic wrapper is here: https://github.com/nhaehnle/llvm-project/blob/715450fa7f968ceefaf9c3b04b47066866c97206/llvm/include/llvm/Analysis/GenericConvergenceUtils.h#L569
>
> The simple example might be bearable writing as a template, precisely because it's simple -- so only looking at simple examples is unlikely to really capture the motivation. Really what the motivation boils down to is stuff like this: https://github.com/nhaehnle/llvm-project/blob/controlflow-wip-v7/llvm/lib/Analysis/GenericUniformAnalysis.cpp -- I don't fancy writing all this as a template.
>
> Thid motivation would essentially go away if C++ could type-check against traits in the way that Rust and other languages like it can -- but it can't, so here we are.

I hesitate to write code that's more idiomatic in a language that isn't C++. Agreed that long/complicated algorithms as templates aren't the best thing - but sometimes can be quite suitable/idiomatic C++ (see the C++ standard library).

That said, I'd like to help make things more usable, for sure - but I'm not sure/currently feeling like this might not be the best direction for achieving that goal & I think some clear comparisons - even for overly simplistic code, where the overhead of a more complex solution may not be felt as acutely (actually, small examples might show syntactic overhead more acutely - if it takes more code to do the same thing, when that code isn't washed out by a lot of code that would be the same regardless of implementation, it will hopefully be more obvious, rather than less), hopefully it'll be more a more clear/concrete basis on which to discuss relative design tradeoffs.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83088/new/

https://reviews.llvm.org/D83088