<div dir="ltr">On Tue, Jul 15, 2014 at 2:59 AM, Kevin Funk <span dir="ltr"><<a href="mailto:kfunk@kde.org" target="_blank">kfunk@kde.org</a>></span> wrote:<div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5">On Monday 14 July 2014 17:52:42 Richard Smith wrote:<br>

> On Mon, Jul 14, 2014 at 4:53 PM, Kevin Funk <<a href="mailto:kfunk@kde.org">kfunk@kde.org</a>> wrote:<br>

> > Hey,<br>

> ><br>

> > I'm a bit puzzled by the following behavior of clang when inspecting the<br>

> > AST for the following code snippet:<br>

> ><br>

> > test.cpp:<br>

> > char foo;<br>

> > char bar = foo1;<br>

> ><br>

> > $ clang++ -cc1 -ast-dump test.cpp<br>

> > main.cpp:2:12: error: use of undeclared identifier 'foo1'; did you mean 'foo'?<br>

> > (...)<br>

> > `-VarDecl 0xa19cb0 <line:2:1, col:12> col:6 bar 'char' cinit<br>

> >   `-ImplicitCastExpr 0xa19d60 <col:12> 'char' <LValueToRValue><br>

> >     `-DeclRefExpr 0xa19d38 <col:12> 'char' lvalue Var 0xa19c40 'foo' 'char'<br>

> ><br>

> > So, Clang seems to interpret 'foo1' as a typo of 'foo'. This is still fine.<br>

> > However, Clang also "fixes up the code" and pretends that 'foo1' *is* 'foo',<br>

> > which -ast-dump shows (see last line of the dump). This is odd.<br>

> ><br>

> > Obviously, this is only the case if -fno-spell-checking is *not* passed to<br>

> > clang++. With -fno-spell-checking I get this instead:<br>

> ><br>

> > $ clang++ -cc1 -ast-dump test.cpp -fno-spell-checking<br>

> > (...)<br>

> > `-VarDecl 0x1a79ca0 <line:2:1, col:6> col:6 bar 'char'<br>

> ><br>

> > This looks fine to me.<br>

> ><br>

> > So, my question: Is this intended behavior? Is this for error recovery in<br>

> > the parser?<br>

><br>

> Yes, this is intended. It is our policy that whenever we emit an error<br>

> message with a fix-it hint, we recover as if that fix-it hint had been<br>

> applied to the AST. This is not limited to typo-correction; it happens<br>

> whenever we emit an error with a fix-it.<br>

><br>

> > If true, is there a way to enable spell-checking but *disable* it<br>

> > touching the AST?<br>

><br>

> Not currently, no. I think what you're really asking for is for us to turn<br>

> off all error recovery, so the AST contains only things that were in the<br>

> source code, and omits erroneous things rather than trying to recover from<br>

> errors?<br>

<br>

</div></div>Yes. That sounds interesting.<br>

<div class=""><br>

> We could try to support such a mode, but I suspect that it would<br>

> degrade the diagnostic experience enough that you may want to run Clang<br>

> twice: once in this mode, and once to produce user-facing diagnostics.<br>

<br>

</div>I fear the performance impacts here. Running it twice over a possibly large TU<br>

won't be possible for us.<br>

<div class=""><br>

> Also, I don't expect that you'll find volunteers in the Clang community to<br>

> do the work, but if the design and rationale are reasonable and you can<br>

> provide a convincing argument that the mode will have sufficient ongoing<br>

> support work, we'd accept patches to implement this.<br>

<br>

</div>Well appreciated.<br>

<div class=""><br>

> > Reminder: I'm working on integrating Clang (read: libclang) in KDevelop,<br>

> > hence I'm looking at this from a development tool perspective. Magic behavior<br>

> > inside Clang is somewhat undesirable there :)<br>

><br>

> Can you explain a bit more about your use cases? I assume you want to<br>

> provide some kind of AST introspection on not-necessarily-valid code, and<br>

> you don't want to expose our typo-correction results in that? Depending on<br>

> what you're trying to achieve, there might be lighter-weight approaches<br>

> (for instance, checking if the token at the given location actually refers<br>

> to the right name).<br>

<br>

</div>Okay. You're right in the sense that we want to support introspecting<br>

half-broken code in our IDE.<br>

<br>

Consider the following code snippet:<br>

<br>

  class Foo {<br>

    void bar();<br>

  };<br>

<br>

  void Foo::bar1() {}<br>

<br>

Now when parsing this we'd like to have the following state in our code model:<br>

- Be aware that 'bar1' might be a typo<br>

- Be aware that this decl is named "Foo::bar1" right now, not "Foo::bar"<br></blockquote><div><br></div><div>OK. If we had a mechanism you could ask "was this cursor's name typo-corrected?", would that suffice?</div>
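<div><br></div><div>As a rough illustration of the lighter-weight token check mentioned above, here is a minimal C++ sketch that compares the identifier actually written at a location with the name the AST reports. The helpers identifierAt and nameMatchesSource are hypothetical and are not part of Clang or libclang. The sketch assumes the caller already has the source buffer and a 1-based line and column for the start of the name, and it only handles plain ASCII identifiers.</div>

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Clang/libclang API): returns the identifier that
// starts at the given 1-based line/column of `source`, or "" if there is none.
std::string identifierAt(const std::string& source, unsigned line, unsigned column) {
    if (line == 0 || column == 0) return "";

    // Advance to the first character of the requested line.
    std::size_t pos = 0;
    for (unsigned l = 1; l < line; ++l) {
        pos = source.find('\n', pos);
        if (pos == std::string::npos) return "";
        ++pos;  // step past the newline
    }

    // Reject columns that fall outside this line.
    std::size_t lineEnd = source.find('\n', pos);
    if (lineEnd == std::string::npos) lineEnd = source.size();
    pos += column - 1;
    if (pos >= lineEnd) return "";

    // Collect [A-Za-z0-9_] characters starting at the location.
    auto isIdentChar = [](unsigned char c) { return std::isalnum(c) != 0 || c == '_'; };
    std::size_t end = pos;
    while (end < lineEnd && isIdentChar(static_cast<unsigned char>(source[end]))) ++end;
    return source.substr(pos, end - pos);
}

// Hypothetical helper: true if the name the AST reports is also what the
// source spells at that location, i.e. the name was not silently corrected.
bool nameMatchesSource(const std::string& source, unsigned line, unsigned column,
                       const std::string& astName) {
    return identifierAt(source, line, column) == astName;
}
```

<div>With the first example (char foo; char bar = foo1;), the diagnostic points at 2:12. The identifier spelled there is foo1, which differs from the recovered name foo, so the check reports a mismatch.</div>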

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Regarding playing around with tokens to extract the identifier: This will<br>

likely slow down the parser dramatically, because this would have to be done<br>

for every declaration.</blockquote><div><br></div><div>I don't know about "dramatically"; Clang can answer this sort of query quite efficiently.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

After all, I cannot decide whether the state of an<br>

individual declaration in the AST is actually "sane" when spell-checking is<br>

enabled.<br>

<br>

Does that make sense to you?</blockquote><div><br></div><div>I think so. One more question: in your example above, do you care whether we believe that the two functions are redeclarations of the same entity? (Do you want to provide 'add a declaration of this to the class'-style functionality, for instance?)</div>

<div><br></div><div>Suppose we could typo-correct this:</div><div><br></div><div>struct X {</div><div>  void f() const;</div><div>};</div><div>void X::f() {}</div><div><br></div><div>... to insert the missing 'const'. Would that be a problem for you too?</div>

</div></div></div>