[PATCH] D29005: [XRay] A graph Class for the llvm-xray graph

Mon Feb 6 23:40:03 PST 2017

dberris requested changes to this revision.
dberris added a comment.
This revision now requires changes to proceed.

While thinking about this more, it's going to be really hard to start understanding what the semantics we're optimising for are without unit tests. This class is sufficiently complex in implementation that I think we really ought to spend as much time as we can writing up unit tests that will exercise the parts of the code that are really important (in this case that the graph we're building is representing the graph we intend to build) and that we're able to implement the basic graph algorithms like breadth-first-search or depth-first-search on top of the interface of this graph type. The risk right now of not implementing unit tests is that any changes we make to this type's implementation will introduce subtle bugs that will break the basic use-cases while serving only the ones that the `llvm-xray graph/graph-diff` tools need.

I understand that there's some urgency here, I'd be really happy with covering basic tests covering the most basic of features of this type.

================
Comment at: include/llvm/XRay/Graph.h:41-44
+///  - VI, this is a type over which DenseMapInfo is defined and is the type
+///    used look up strings, available as VertexIdentifier.
+///  - If the built in DenseMapInfo is not defined, provide a specialization
+///    class type here.
----------------
This seems like it's leaking implementation details. Why should the user of the type need to know about DenseMapInfo?

================
Comment at: include/llvm/XRay/Graph.h:49-61
+/// Usage Example Graph with weighted edges and vertices:
+///   Graph<int, int, int> G;
+///
+///   G[1] = 0;
+///   G[2] = 2;
+///   G[{1,2}] = 1;
+///   G[{2,1}] = -1;
----------------
Some well-placed empty lines here would help readability.

```
/// Usage:
///
///    Graph<int, int, int> G;
///    G[1] = 0;
///    G[2] = 1;
///    G[{1, 2}] = 2;
///
///    for (const auto& V : G.vertices()) {
///       // V will be an int, do something with it here.
///    }
///
///    for (const auto &E : G.edges()) {
///        // E will be a std::pair<int, int>, do something with it here.
///    }
///
///    // show an example on what to do with the third int?
```

================
Comment at: include/llvm/XRay/Graph.h:62
+///   }
+///
+template <typename VertexAttribute, typename EdgeAttribute,
----------------
While you do provide documentation for the views and for some of the member operations, please provide documentation about things like:

- Space complexity/requirements.
- Complexity of operations like insertion, removal if that's supported, point-wise lookup, etc.
- What the semantics are for certain operations -- how do failures get conveyed, what are the invariants that are preserved that's useful to know from a user's perspective, etc.

================
Comment at: include/llvm/XRay/Graph.h:71-73
+  /// A child type of std::pair<VertexIdentifier, VertexAttribute> and should be
+  /// treated as such. This type is the value_type of all iterators which range
+  /// over sets of vertices.
----------------
I don't know whether saying that this is derived from std::pair<...> is important to document. This seems like an implementation detail that doesn't help the user of this class at all. Probably say what this type is useful for -- probably important to say that this represents the vertex and the attributes associated with it.

================
Comment at: include/llvm/XRay/Graph.h:77-79
+  /// EdgeValueType, a child type of std::pair<EdgeIdentifier, EdgeAttribute>,
+  /// and should be treated as such. This type the value_type of all iterators
+  /// which range over sets of edges.
----------------
Likewise here, I don't see why it's important to tell the reader that you should treat this type as a pair.

================
Comment at: include/llvm/XRay/Graph.h:80-88
+  ///
+  /// The following properties are desireable in the graph class:
+  ///  - EdgeIterators de-reference to a child type of
+  ///    std::pair<EdgeIdentifier, EdgeAttribute> to enable a standard library
+  ///    like metaphore.
+  ///  - Edges can be looked up by their EdgeIdentifier in amortised constant
+  ///    time.
----------------
Why is this documentation on this internal class, as opposed to documentation on the operations on the graph type? i.e. why is it important to mention that this is a child of std::pair<...> -- as opposed to saying why this is derived from std::pair<...> instead?

================
Comment at: include/llvm/XRay/Graph.h:90-95
+  ///  satisfied by this. However DenseMap allows us to define our own BucketT.
+  ///  We can then have a DenseMap of VertexIdentifier to DenseMaps with a
+  ///  custom BucketT. This gives us the third property and allows us to
+  ///  implement the prefered complexity. It also allows our iterators to have
+  ///  a simple implementation. This type implements the BucketT interface while
+  ///  having the inner Dense Map store EdgeValueTypes.
----------------
All this documentation seems like implementation detail documentation. This will not be helpful when the doxygen docs are generated and read by users of this type. Remember this is getting into the set of LLVM libraries, and can be used on its own as opposed to just serving the purposes of `llvm-xray graph` or similar requirements.

================
Comment at: include/llvm/XRay/Graph.h:98-112
+    VertexIdentifier &getFirst() {
+      return std::pair<EdgeIdentifier, EdgeAttribute>::first.second;
+    }
+
+    const VertexIdentifier &getFirst() const {
+      return std::pair<EdgeIdentifier, EdgeAttribute>::first.second;
+    }
----------------
We still don't have documentation as to "why" these member functions exist, and why we're not just using a `DenseMap<pair<VertexIdentifier, VertexIdentifier>, EdgeAttribute>`. This still isn't clear from the documentation that's been so far mentioned here. Also it's not clear why the adjacency list being stored isn't just an `unordered_map<VertexIdentifier, pair<VertexIdentifier, EdgeAttribute>>`, and lookup for all the adjacent vertices (or even out-edges) will be given by `equal_range(VertexIdentifier)`.

Since we're already adapting iterators anyway, I'm sure you can turn an iterator that yields a `pair<VertexIdentifier, pair<VertexIdentifier, EdgeAddtribute>>` into something that yields a `tuple<VertexIdentifier&, VertexIdentifier&, EdgeAttribute&>` with appropriate `const`-ness of members.

For in-edge and out-edge and attribute storage, you could have done it this way:

```
using EdgeAttributeContainer = std::list<EdgeAttribute>;
std::list<EdgeAttribute> EdgeAttributes;
unordered_multimap<VertexIdentifier, pair<VertexIdentifier, EdgeAttributeContainer::iterator>> OutEdges;
unordered_multimap<VertexIdentifier, pair<VertexIdentifier, EdgeAttributeContainer::iterator>> InEdges;
unordered_multimap<pair<VertexIdentifier, VertexIdentifier>, EdgeAttributeContainer::iterator>> EdgeLookup;
```

Since you're already defining your own iterators, they could have been simpler lookup, and still have the consistent interface.

I'm not saying this is how you should do it, I'm asking why this isn't what's being done, and the documentation (either internal for the implementer/reader or external generated from this type) isn't answering this question.

================
Comment at: include/llvm/XRay/Graph.h:139-140
+
+public:
+  using size_type = std::size_t;
+
----------------
Consider moving this and all public declarations together for easier reading.

================
Comment at: include/llvm/XRay/Graph.h:351-353
+    bool operator==(const const_iterator &b) const {
+      return (TopIt == b.TopIt && (TopIt == TopEnd || BotIt == b.BotIt));
+    }
----------------
Why can't this be defined as a non-member like so:

```
friend template <bool IsConstL, bool IsConstR>
bool operator==(const EdgeListIterator<IsConstL>&L, const EdgeListIterator<IsConstR>&R) {
  return ...;
}
```

This way it's defined with the class and it's also accessible only through ADL?

================
Comment at: include/llvm/XRay/Graph.h:355
+
+    bool operator!=(const const_iterator &b) const { return !((*this) == b); }
+
----------------
If you defined this as a non-member, it would look like:

```
friend bool operator!=(...) {
  return !(L == R);
}
```

================
Comment at: include/llvm/XRay/Graph.h:557
+  std::pair<EdgeIterator, bool> insert(const EdgeValueType &Val) {
+    auto p1 = Edges.FindAndConstruct(Val.first.first);
+    auto p2 = p1.second.insert({Val.first.second, Val.second});
----------------
LLVM Style wants local identifiers starting with capital letters.

================
Comment at: tools/llvm-xray/xray-graph.cc:300
+  for (auto &V : G.vertices()) {
+    assert(V.first == 0 || G[V.first].S.Sum != 0);
+    if (V.first != 0) {
----------------
Pro-tip: when asserting, provide some human-readable explanation for what the expectation is. As in:

```
assert((/* complicated conditional */) && "We want a pony, but got a unicorn!");
```

https://reviews.llvm.org/D29005