[PATCH] Introduction of ASTMatcher sub-expression references

Manuel Klimek klimek at google.com
Mon May 6 01:16:21 PDT 2013


  First, thanks a lot for working on this - I think this is going to be awesome!
  Some minor comments inside, but first let's talk about the design. I'm not convinced yet we can't solve this in a much simpler way :)

  The main problem I currently see is that we build up BoundNodesTrees through BoundNodesBuilders without their parent context, and only add them after they match.

  Now there are 2 things about the design of the proposed solution I'm unhappy with:
  1. the ref-counted pointer design: the BoundNodesTreeBuilders are always on the stack, and always in a hierarchy that mirrors the call hierarchy - thus, I think ownership is very clear and straight forward, and I'd think we can solve this problem without introducing less clear ownership semantics
  2. the child BoundNodesTrees are always built in the cause of one function (they're on the stack), and knowing their parents (as they always will be added to that parent in the end); so I think adding a simple parent pointer should be enough to solve the immediate problem

  Then, there's some fundamental design decisions that I think we should discuss first. The basic stuff is pretty straight forward, as there are actually no branches in the BoundNodesTree.
  For branches in the BoundNodesTree there are 2 different cases:
  a) branches due to has() and hasDescendant(). The interesting point here is that there is only one match in the end, and the branching is mainly there so we can roll back sub-matches in case we need to follow a different branch higher up
  b) branches due to forEach() and forEachDescendant(). Those are the really hard cases :) For example:
  stmt(
    forEachDescendant(stmt().bind("x")),
    forEachDescendant(stmt(equalsBoundNode("x"))))
  Which combinations do we want to test here? You're basically defining it to "first" in the first match (probably as you want to avoid the combinatorial explosion), but I'm not sure yet that's the right approach. If we want that, I feel like there should be a much simpler way to implement it. Anyway, I'd like to have that strategy pinned down in one place in the code if at all possible, so we can change it if it turns out that users find it non-intuitive.

  My current intuition is: checking against the *first* in a previous forEach branch is bad - this goes against the "imperative" intuition that the last iteration in a loop overwrites a value.
  I think we should either do the full combinatorial explosion when going across forEach* boundaries, or we should not support it at all - I'd like to keep as much as possible "functional style".


================
Comment at: include/clang/ASTMatchers/ASTMatchers.h:3343
@@ +3342,3 @@
+                                              llvm::StringRef>
+equalsBoundNode(StringRef id) {
+  return internal::PolymorphicMatcherWithParam1<
----------------
s/id/ID/

================
Comment at: include/clang/ASTMatchers/ASTMatchersInternal.h:71
@@ -67,1 +70,3 @@
+
+public:
   /// \brief Adds \c Node to the map with key \c ID.
----------------
Why the extra public:?

================
Comment at: include/clang/ASTMatchers/ASTMatchersInternal.h:174
@@ -159,2 +173,3 @@
 public:
-  BoundNodesTreeBuilder();
+  class Grafter : public llvm::RefCountedBase<Grafter> {
+  public:
----------------
I really dislike the name Grafter. First, for a non-native speaker it's a rather rare word to be used, and second, even after looking up its meaning, the name doesn't tell me anything about what this class does.

================
Comment at: unittests/ASTMatchers/ASTMatchersTest.cpp:3946
@@ +3945,3 @@
+
+TEST(SameAsMatcher, UsingForEachDescendant) {
+  const std::string Fragment = "\
----------------
General notes on the tests:
1. Please use one string literal per line, as that makes it consistent with the rest of the file
2. I am doubtful the tests always have to be that complex (especially they contain too many different things). In general, try to write tests that are as small as possible to test one thing, and use as few C++ features as possible to achieve that (minimize number of types / constructs used).


http://llvm-reviews.chandlerc.com/D740



More information about the cfe-commits mailing list