[PATCH] Basic correction of "-" or ">" to "->" (PR9054)
rikka at google.com
Sat Nov 2 09:27:53 PDT 2013
Barring any replies with concerns, ideas for improving the patch, or
suggestions for a different approach, I'll probably commit it on Monday.
On Fri, Nov 1, 2013 at 9:42 AM, Kaelyn Uhrain <rikka at google.com> wrote:
> The trouble I was having wasn't so much about giving Sema enough context
> to make the correction, but about being able to hit the right code paths at
> the right times to be able to 1) have some confidence that the correction
> is even semi-reasonable, and 2) to be able to suppress duplicate/extraneous
> errors and ideally recover from the typo.
> For "foo->bar", the parser handles "foo" as the LHS by calling
> Parser::ParseCastExpression that eventually calls down into Sema and
> comes back, then ParseCastExpression sees the arrow and calls
> ParsePostfixExpressionSuffix which handles the member lookup of "bar" with
> Sema's help as the final part of building the LHS. Then the parser goes
> back out to Parser::ParseAssignmentExpression (which called
> ParseCastExpression to create the LHS expression) and calls
> Parser::ParseRHSOfBinaryExpression for the pieces of the expression that
> come after "foo->bar".
> If "foo->bar" is mistyped as "foo-bar" or "foo>bar", the parser handles
> "foo" as above, but returns to ParseAssignmentExpression and calls
> ParseRHSOfBinaryExpression to handle "-bar"/">bar". Then it isn't until
> after the "-" or ">" has been parsed and the parser is calling into Sema to
> figure out what "bar" is that an error is encountered. To recover from the
> error at the point Sema encounters it, Sema would have to be able to tell
> the parser to undo the parsing of the RHS of a binary expression and the
> operator that triggered it, redo the parsing of the LHS enough to call
> ParsePostfixExpressionSuffix for the post-recovery "->bar", and go on to
> re-invoke ParseRHSOfBinaryExpression on whatever comes after "bar". Or, as
> in my patch, the parser can preemptively check for the conditions under
> which the error may occur (with the assumption that minus and greater-than
> are operations rarely performed on pointers to record objects in valid code
> and so the overhead in such a situation is acceptable) to see whether "bar"
> by itself refers to anything, and if the lookup fails and treating bar as a
> member works, assume the "-" or ">" was intended to be "->".
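
To make the check described above concrete, here is a minimal sketch of the
decision it boils down to. This is not the patch and uses no Clang API:
ToyOperand, Tok, and shouldCorrectToArrow are hypothetical names invented for
illustration, and the two std::set lookups stand in for the real name lookups
Sema would perform.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for what is known about the already-parsed LHS ("foo").
// Not a Clang type; for illustration only.
struct ToyOperand {
  bool isPointerToRecord;         // LHS has type "pointer to struct/class"
  std::set<std::string> members;  // member names of the pointee record
};

// Hypothetical token kinds for the operator that follows the LHS.
enum class Tok { Minus, Greater, Other };

// Returns true when "lhs <op> name" should be treated as "lhs->name".
// Mirrors the order of checks described in the text:
//   1. only "-" and ">" are candidates,
//   2. only when the LHS is a pointer to a record,
//   3. only when "name" does not resolve on its own,
//   4. and only when "name" does resolve as a member.
bool shouldCorrectToArrow(const ToyOperand &lhs, Tok op,
                          const std::string &name,
                          const std::set<std::string> &namesInScope) {
  if (op != Tok::Minus && op != Tok::Greater)
    return false;
  if (!lhs.isPointerToRecord)
    return false;
  if (namesInScope.count(name) != 0)
    return false;  // "foo - bar" may be real arithmetic; leave it alone
  return lhs.members.count(name) != 0;
}
```

With an LHS of pointer-to-record type whose record has a member "bar", the
sketch answers true for "foo-bar" and "foo>bar" when no "bar" is in scope, and
false as soon as a standalone "bar" exists, so valid pointer arithmetic and
comparisons are left untouched.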
> On Fri, Nov 1, 2013 at 12:55 AM, Serge Pavlov <sepavloff at gmail.com> wrote:
>> Another approach is to inform Sema about context where the unknown name
>> occurs. That would allow typo correction code to be gathered in one place
>> in Sema. Typos of this kind could be processed by the same machinery in
>> ActOnIdExpression, which currently tries to correct misspelled names.
>> There are other typos that would be nice to handle ("." vs ".*", a dot
>> instead of a comma, etc.). Handling them in the Parser could make it bulky.
>> 2013/11/1 Kaelyn Uhrain <rikka at google.com>
>>> Attached is an initial patch for trying to correct a missing "-" or
>>> ">" to "->" when accessing a member through an object pointer. This patch
>>> also doesn't work for C code as C seems to hit a different code path. I'm
>>> sending the patch out for pre-commit review even though it is a small and
>>> fairly unobtrusive (code-wise) patch because I'm a bit iffy on whether it's
>>> a good way to perform the diagnostic.
>>> For a bit of context, what makes this diagnostic tricky is that the
>>> original error about the unknown identifier after the "-" or ">" occurs
>>> well within Sema as the parser is handling the RHS of a binary operator,
>>> but the recovery would require following a code path in the parser that was
>>> part of the construction of the LHS. And since Sema cannot tell the parser
>>> to back up a few steps....