<div dir="auto"><div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 23 Jun 2020, 06:35 David Rector via cfe-dev, <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div>True, ParenExpr is the one oddball case: all syntax, no semantics, the exact opposite of most of the nodes that get in people’s way. </div><div><br></div><div>But I think Type::getAs<T>() is again instructive: one can call getAs<T>() on a ParenType to get its underlying type, and on an ElaboratedType etc. — nodes that are purely syntactic. So just allow the single getAs to also get a ParenExpr as its underlying expression. I believe it is the only such oddball case.</div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">It's not, there's a few types of node that we consider to be "parens", some of them only conditionally. These include things like _Generic and __builtin_choose_expr.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div>What we would probably also want is an analog or two of Type’s getCanonicalType, except for getting the underlying syntax:</div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">We already have quite a few different flavours of that -- look at Stmt::IgnoreImplicit and friends. There doesn't seem to be a good one-size-fits-all solution here, given the number of different variants of this we've invented. But maybe there's something good enough for most matchers at least.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div>Stmt {</div><div><span style="white-space:pre-wrap"> </span>…</div><div><span style="white-space:pre-wrap"> </span>T *getAs() {</div><div><span style="white-space:pre-wrap"> </span>if (isa<T>(this))</div><div><span style="white-space:pre-wrap"> </span>return cast<T>(this);</div><div><span style="white-space:pre-wrap"> </span>if (this->isImplicit() || isa<ParenExpr>(this)) {</div><div><span style="white-space:pre-wrap"> </span>assert(children.size() == 1 && "Expected implicit nodes to have exactly one child");</div><div><span style="white-space:pre-wrap"> </span>return (*children.begin())->getAs<T>();</div><div><span style="white-space:pre-wrap"> </span>}</div><div><span style="white-space:pre-wrap"> </span>return nullptr;</div><div><span style="white-space:pre-wrap"> </span>}</div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Note that the analogy to Type::getAs has already broken down here; getAs always returns a type with the same semantics but a possibly different meaning, whereas this can return an expression with different semantics by skipping implicit nodes.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div><div><span style="white-space:pre-wrap"> </span>Expr *getAsWritten() {</div><div><span style="white-space:pre-wrap"> </span>if (this->isImplicit())</div><div><span style="white-space:pre-wrap"> </span>return (*children.begin())->getAsWritten());</div><div><span style="white-space:pre-wrap"> </span>return this;</div><div><span style="white-space:pre-wrap"> </span>}</div><div><span style="white-space:pre-wrap"> </span>Expr *getAsWrittenIgnoreParens() {</div><div><span style="white-space:pre-wrap"> </span>if (isa<ParenExpr>(this))</div><div><span style="white-space:pre-wrap"> </span>return this;</div><div><span style="white-space:pre-wrap"> </span>return getAsWritten();</div><div><span style="white-space:pre-wrap"> </span>}</div></div><div>};</div><div><br></div><div>Decls too could at least in some cases benefit from the same interface:</div><div><br></div><div>struct A {</div><div><span style="white-space:pre-wrap"> </span>union { int i; float f; }</div><div>};</div><div><br></div><div>I believe the union becomes an implicit FieldDecl, whose type is the anonymous union’s CXXRecordDecl; a user may well want to call </div><div><br></div><div> if (auto AnonUnion = someFieldDecl->getAs<CXXRecordDecl>())</div><div><span style="white-space:pre-wrap"> </span>//...</div><div><br></div><div>to handle this case.</div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Taking it this far seems questionable to me. The field and the union are very different entities here, so I don't think the analogy to type sugar and parentheses really holds.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div>A getAs<T> in every AST node would provide a uniform way of handling implicit nodes across the AST, a definite benefit to beginners without any apparent cost to long time users.</div><div><br></div><div>Dave</div><div><br></div><div><br><blockquote type="cite"><div>On Jun 23, 2020, at 2:30 AM, Richard Smith <<a href="mailto:richard@metafoo.co.uk" target="_blank" rel="noreferrer">richard@metafoo.co.uk</a>> wrote:</div><br><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div dir="ltr">On Mon, 22 Jun 2020 at 11:44, David Rector via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" rel="noreferrer">cfe-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">> I would like to hear the opinions of others on these questions. I think<br>> we've both described our perspectives and made our cases.<br>><span> </span><br>><span> </span><br>> Just expanding on my position from earlier in this thread slightly: to<br>> directly address the "where to from here?" question, I would like us to get<br>> back to the state where, *in our default mode, the matcher for AST node X<br>> is always able to match all instances of AST node X*. I think there are<br>> various options for how we get there that seem acceptable, and that don't<br>> sacrifice your noble goals of making AST matching easier and more<br>> intuitive. In particular, I think it's fine (and probably good) if by<br>> default we look through non-matching implicit nodes while looking for a<br>> matching node (so long as we don't look past a matching node just because<br>> it's implicit). And I think it's fine (and probably good) to expose an easy<br>> way to explicitly control whether we automatically look through implicit<br>> nodes or not. But I think if no-one is prepared to do the work to make our<br>> default mode satisfy the above property while still looking through<br>> implicit nodes if (and only if) they don't match, then I think we should<br>> change the default back to the state we had before. (I don't have much of<br>> an opinion on whether to keep or remove 'traverse' and its various modes,<br>> other than that we've already caused a substantial amount of churn with the<br>> changes thus far, and removing them again would cause further churn.)<br>><span> </span><br><br>1. It would be helpful to rehash the examples one final time — give the code & the dump, a proposed matcher, then what option A would match, B, etc. Others should try to consider at each as would a beginner.<br><br>2. Perhaps this an opportunity to create a uniform interface to deal with implicit nodes in the AST. I believe any implicit node only ever has at most one child node. If so, consider this:<br> <span> </span>(a) Always have "Implicit*" versions of any implicit-able nodes (e.g. ImplicitCXXConstructExpr, in addition to CXXConstructExpr);<br> <span> </span>(c) Implement getAs<T>() for Decls and Stmts, the same we do for Types: e.g. E->getAs<CXXConstructExpr>() would skip over the ExprWCleanups, ImplicitCXXConstructExpr, etc. But you could still call E->getAs<ImplicitCXXConstructExpr>() too if desired.<br><br>Then, in the case of ASTMatchers, the matchers would always match the implicit nodes as before, but Results.Nodes.getNodeAs<…> would call getAs<…> on each actual result. I believe this aligns with the behavior you suggested Richard. But I have not fully thought this through, so I defer to others.<br></blockquote><div><br></div><div>I think that's a pretty interesting approach, that definitely has some appeal. I think we should also be thinking about how to serve the needs of clients that want a syntactic view or a semantic one -- some clients want to skip over implicit AST nodes (eg, nodes that describe semantics but no syntax), and others want to skip over what we call "parentheses" (eg, nodes that describe syntax but whose sole subexpression has the same semantic effect). A single getAs<...> might have a hard time serving both classes of user.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">My perspective on ASTMatchers, FWIW: I never got into them, because I had begun learning the AST first and it seemed redundant since you could do the same stuff with RecursiveASTVisitor, and there seemed to be a lot of extra infrastructure you had to deal with. I now see that the matchers can be considerably tighter and clearer -- but that is the only advantage over doing it the full way with RecursiveASTVisitor, so whatever the solution is, it must keep or increase that clarity advantage while retaining as much of the flexibility of RecursiveASTVisitor as possible. E.g. it sounded like Billy, the guy who had issues with Stmt traversal order (owing to the queueing of child statements, instead of stack processing them as most would expect), decided to just use RecursiveASTVisitor instead of deal with the hassle — that is the outcome to avoid.<br><br>_______________________________________________<br>cfe-dev mailing list<br><a href="mailto:cfe-dev@lists.llvm.org" target="_blank" rel="noreferrer">cfe-dev@lists.llvm.org</a><br><a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a></blockquote></div></div></div></blockquote></div><br></div>_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" rel="noreferrer">cfe-dev@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote></div></div></div>