<div dir="ltr"><div dir="ltr">On Sat, 20 Jun 2020 at 09:17, Stephen Kelly <<a href="mailto:steveire@gmail.com">steveire@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>On 18/06/2020 18:38, Richard Smith
wrote:<br></p>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">On Wed, 17 Jun 2020 at 15:50, Stephen Kelly via
cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>> wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>On 17/06/2020 22:47, Richard Smith wrote:<br>
</p>
<blockquote type="cite"> Consider the example from Yitzhak
Mandelbaum -- the issue there is actually not whether we
have a template or not; rather, it's a difference
between one- and two-argument constructors:<br>
</blockquote>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div>I hope you can agree that this change in
behavior, for an example such as this, is a
regression for the ability for beginners to
understand how matchers work and effectively write
them!</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>Yes, I see your point. The current state doesn't
account well for this case. <br>
</p>
<p>Would it resolve the problem if the CXXConstructExpr
was not excluded in this case (in the initializer of a
VarDecl), assuming that can be implemented? <br>
</p>
<p>I think it would remain better for newcomers not to see
(and match on) the CXXConstructExprs in <a href="https://godbolt.org/z/TvUTgy" target="_blank">https://godbolt.org/z/TvUTgy</a>
.</p>
</div>
</blockquote>
<div>I think if someone writes a cxxConstructExpr matcher,
they would expect it to match all cases that invoke a
constructor. This case does, so it should match. I think it
would be reasonable if, by default, both
returnStmt(hasReturnValue(integerLiteral())) and
returnStmt(hasReturnValue(cxxConstructExpr())) match this
case. The former matches when viewed syntactically, and the
latter matches when viewed semantically.</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>This is the intention of the change. The RFC makes several
references to the differences between syntax and semantics:
<a href="https://docs.google.com/document/d/17Z6gAwwc3HoRXvsy0OdwU0X5MFQEuiGeSu3i6ICOB90/edit#heading=h.l8zml1r0ls7u" target="_blank">https://docs.google.com/document/d/17Z6gAwwc3HoRXvsy0OdwU0X5MFQEuiGeSu3i6ICOB90/edit#heading=h.l8zml1r0ls7u</a></p>
<p>Perhaps it would be better if the name of the two modes were
different. ie rename:<br>
</p>
<p>traverse(AsIs, ...)</p>
<p>to <br>
</p>
<p>traverse(Semantic, ...)</p>
<p>and rename <br>
</p>
<p>traverse(IgnoreUnlessSpelledInSource, ...)</p>
<p>to <br>
</p>
<p>traverse(Syntactic, ...)</p>
<p>Thoughts on that?</p></div></blockquote><div>I think if we want to expose a syntactic mode, we should do that with a set of syntactic matchers (eg, a matcher that matches parenthesized initialization). Suppose someone wants to match all direct initialization. Right now, they need to do lots of checks: for a non-list, non-implicit cxxConstructExpr, for a varDecl whose initialization kind is direct init, for a cxxTemporaryObjectExpr, for a functionalCastExpr, and there'll likely be other kinds that I forgot and more added in the future.</div><div><br></div><div>I don't think it's intuitive to call IgnoreUnlessSpelledInSource mode Syntactic, because it isn't really that -- it doesn't let you match syntax, it lets you match semantics-with-associated-tokens (and even that is only an approximate description). I think it would be really interesting to add a way to actually match syntax in a semantics-free way, but it seems like a big project.</div><div><br></div><div>Also the mode you're calling Semantic is also not a semantic match, because it also matches syntax-only nodes. (It doesn't implicitly IgnoreParens or anything like that.)</div><div><br></div><div>That said, I think neither Semantic nor Syntactic is the right default. By default we should be assuming that matchers want to match a combination of syntax and semantics -- usually a matcher will want to key off some semantic effect obtained in a particular syntactic context.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>I think the best refinement for now would be to restore
the CXXConstructExpr in the case of a varDecl
initializer, if that is possible (may not be).</p>
</div>
</blockquote>
<div>I think that is addressing a symptom rather than the
cause. I think the root cause is that a matcher that is
explicitly asking to match a certain implicit (or
sometimes-implicit) AST node does not match. </div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>That is the purpose of AsIs/Semantic mode if I understood you
correctly.</p>
<p>The intention of the IgnoreUnlessSpelledInSource/Syntactic node
is to *not* match certain implicit (or sometimes-implicit) AST
nodes.<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div>For example, I would expect that implicitCastExpr()
*never* matches under the new default behavior. </div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>That is correct.<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div>And I think that users, and especially beginner users,
will see that as being simply broken -- if someone tries to
write an implicitCastExpr() matcher, it's obvious that they
want to match implicit casts, and we are not doing the user
any favors by making that matcher not match by default.<br>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>Hmm, if someone is the kind of newcomer that they've never
encountered a CallExpr, FunctionDecl or any other AST node in
their life before, I don't see why implicitCastExpr() would be the
first, or one of the first, things they try.</p></div></blockquote><div>Perhaps because they want to match an implicit cast, and they find it in the documentation. Perhaps because they read one of the guides that says "look at the AST dump and write matchers to match what you see there".<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<p>There may be different/multiple levels of newcomer that we each
have in mind here. <br>
</p>
<p>If someone is a kind of newcomer who wants to match on an
implicitCastExpr(), then they're probably also the kind of
newcomer who knows that's an implicit node and they should use
Semantic node instead of Syntactic mode. Do you agree?</p></div></blockquote><div>No. I think it's bad API design to have any mode in which implicitCastExpr compiles but doesn't ever match.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>If I understand your suggestion, given</p>
<p><br>
```<br>
struct B<br>
{<br>
B(int);<br>
};<br>
<br>
B func1() {<br>
return 42;<br>
}<br>
```<br>
</p>
<p>a matcher like `returnStmt(hasReturnValue(fooExpr()))`<br>
</p>
<p>would match for `fooExpr` as any of<br>
</p>
<br>
* exprWithCleanups()<br>
* cxxConstructExpr() (twice)<br>
* materializeTemporaryExpr()<br>
* implicitCastExpr()<br>
* integerLiteral()
<p>right?<br>
</p>
</div>
</blockquote>
<div>Yes.</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p> </p>
<p>and a matcher like
`returnStmt(hasReturnValue(expr().bind("theReturnVal")))`</p>
<p>would bind "theReturnVal" to the `exprWithCleanups()`
right?<br>
</p>
<p>Your suggestion may be that it should bind to the
integerLiteral() in that case. That may have more scope
for confusion though?</p>
</div>
</blockquote>
<div>I think that's an interesting question. My inclination
would be to bind to the ExprWithCleanups. Simple
syntax-oriented transformations should be fine with this;
the source range of the implicit nodes should be the same as
that of their inner nodes. And semantics-oriented
transformations need this, in order to be able to inspect
the semantics of the node they captured. (Fancy
syntax-oriented transformations that are going to inspect
the clang AST node they get back, rather than only applying
matchers and rewrites, will need to be aware that they might
get an implicit Expr* node. But once you're looking at the
Expr* yourself, you've stepped out of the "beginner" area
and are going to need to know more about how the Clang AST
works in general.)<br>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>This might be an area for further research, and possibly a third
mode in addition to the Syntactic and Semantic modes.<br>
</p>
<p>Thanks,</p>
<p>Stephen.<br>
</p>
<p><br>
</p>
</div>
</blockquote></div></div>