[PATCH] D90023: [Syntax] Add iterators over children of syntax trees.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 26 08:54:55 PDT 2020


sammccall marked 2 inline comments as done.
sammccall added inline comments.


================
Comment at: clang/include/clang/Tooling/Syntax/Tree.h:182
+    /// The element, or nullptr if past-the-end.
+    NodeT *asPointer() const { return N; }
+  };
----------------
gribozavr2 wrote:
> "nodePointer()" ?
I find this a little confusing; it suggests to me there's also something *other* than the node here.
What aspect does it help with?

I can live with it, though currently would prefer `pointer()`, `getPointer()` or `asPointer()`.


================
Comment at: clang/include/clang/Tooling/Syntax/Tree.h:202-219
+  /// child_iterator is not invalidated by mutations.
+  struct child_iterator : child_iterator_base<child_iterator, Node> {
+    using Base::Base;
+  };
+  struct const_child_iterator
+      : child_iterator_base<const_child_iterator, const Node> {
+    using Base::Base;
----------------
gribozavr2 wrote:
> sammccall wrote:
> > eduucaldas wrote:
> > > TL;DR:
> > > `child_iterator` -> `ChildIterator`
> > > `const_child_iterator` -> `ConstChildIterator`
> > > `children` -> `getChildren`
> > > 
> > > I see you followed the naming convention of the ClangAST, which makes loads of sense. 
> > > 
> > > In syntax trees we follow the naming conventions of the [conding standard](https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly) -- just because we want consistency and we had to pick a guideline to follow -- according to that, functions should be verb like and names should be camel case.
> > > 
> > > If like us you chose `const_child_iterator` and `children` kind of "just because", could you change it to follow `ConstChildIterator` and `getChildren` respectively. 
> > > In syntax trees we follow the naming conventions of the conding standard
> > 
> > I see that's been used in some of the newer classes, and that you changed this file to follow that style in 4c14ee61b73746b314d83e7c52e03d6527b78105. However it's not the original style we used here and elsewhere in libSyntax.
> > It's a bit frustrating to hear the argument about consistency, because we were quite careful and deliberate about this style, which was IMO fairly consistent prior to that change.
> > 
> > I'm willing to change these to match the local style in this file. However `*_iterator`, `*_begin/end/range()` is more common than FooIterator etc in LLVM overall (I think for consistency with `iterator` etc, which is endorsed by the style guide). So I don't think this change is particularly an improvement, even in terms of consistency.
> It can go either way:
> 
> > As an exception, classes that mimic STL classes can have member names in STL’s style of lower-case words separated by underscores
> 
Yeah, though read narrowly that doesn't apply (Tree doesn't really mimic STL, and the rule applies to Tree::ChildIterator's members rather than the class itself).

I think it's rare enough to spell out iterator class names that those don't matter at all, and getChildren() is necessary given getParent() etc.


================
Comment at: clang/lib/Tooling/Syntax/Tree.cpp:22-23
   if (auto *T = dyn_cast<syntax::Tree>(N)) {
-    for (const auto *C = T->getFirstChild(); C; C = C->getNextSibling())
-      traverse(C, Visit);
+    for (const syntax::Node &C : T->children())
+      traverse(&C, Visit);
   }
----------------
gribozavr2 wrote:
> sammccall wrote:
> > eduucaldas wrote:
> > > Hmm... 
> > > That looks funny, if the user uses a range-based for loop and forgets the `&`, then we would be copying the Node???
> > > 
> > > Also in many places we had to get the address to the node. That is not really consistent with how the syntax trees API's were designed, they generally take pointers instead of references. (that said we could obviously reconsider those design choices)
> > > That looks funny, if the user uses a range-based for loop and forgets the &, then we would be copying the Node???
> > 
> > It doesn't look funny to me - foreach usually iterates over values rather than pointers. This raises a good point - it looks like Node is copyable and shouldn't be. (The default copy operation violates the tree invariants, e.g. the copied node will not be a child of its parent). I'll send a patch...
> > 
> > (That said this is always the case with copyable objects in range-based for loops, and indeed using references - that doesn't usually mean we avoid using them)
> > 
> > > Also in many places we had to get the address to the node. That is not really consistent with how the syntax trees API's were designed, they generally take pointers instead of references
> > 
> > I'm not convinced that taking the address is itself a burden, any more than using `->` instead of `.` is a burden.
> > Sometimes the use of pointer vs reference intends to convey something about address stability, but here this goes with the type as all nodes are allocated on the arena. (IMO this is a part of the AST design that works well).
> > The other thing it conveys is nullability, and this seems valuable enough that IMO APIs that take *non-nullable* nodes by pointer should be reconsidered.
> > 
> > For the iterators specifically, the main alternative here I guess is having Tree::children() act as a container of *pointers to* children, instead of the children themselves. This *is* highly unconventional and surprising. Consider the ubiquitous `for (const auto&N : T->children())`... now the author/reader expects N to be a const reference to a child, instead it's a const reference to a **non-const** pointer to a child....
> > That looks funny, if the user uses a range-based for loop and forgets the &, then we would be copying the Node???
> 
> Aren't they noncopyable?
Noncopyable after D90163.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90023/new/

https://reviews.llvm.org/D90023



More information about the cfe-commits mailing list