r194002 - Try to correct a mistyped "-" or ">" to "->" for some C++ cases.

Richard Smith richard at metafoo.co.uk
Tue Nov 5 11:37:45 PST 2013


On Tue, Nov 5, 2013 at 10:40 AM, Kaelyn Uhrain <rikka at google.com> wrote:

> On Tue, Nov 5, 2013 at 10:21 AM, Richard Smith <richard at metafoo.co.uk>wrote:
>
>> On Tue, Nov 5, 2013 at 9:56 AM, Kaelyn Uhrain <rikka at google.com> wrote:
>>
>>> On Mon, Nov 4, 2013 at 6:37 PM, Richard Smith <richard at metafoo.co.uk>wrote:
>>>
>>>>
>>>> On 4 Nov 2013 11:05, "Kaelyn Uhrain" <rikka at google.com> wrote:
>>>> >
>>>> > Author: rikka
>>>> > Date: Mon Nov  4 12:59:34 2013
>>>> > New Revision: 194002
>>>> >
>>>> > URL: http://llvm.org/viewvc/llvm-project?rev=194002&view=rev
>>>> > Log:
>>>> > Try to correct a mistyped "-" or ">" to "->" for some C++ cases.
>>>> >
>>>> > Similar C code isn't caught as it seems to hit a different code path.
>>>> > Also, as the check is only done for record pointers, cases involving
>>>> > an overloaded operator-> are not handled either. Note that the reason
>>>> > this check is done in the parser instead of Sema is not related to
>>>> > having enough knowledge about the current state as it is about being
>>>> > able to fix up the parser's state to be able to recover and traverse
>>>> the
>>>> > correct code paths.
>>>> >
>>>> > Modified:
>>>> >     cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
>>>> >     cfe/trunk/lib/Parse/ParseExpr.cpp
>>>> >     cfe/trunk/test/SemaCXX/member-expr.cpp
>>>> >
>>>> > Modified: cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
>>>> > URL:
>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td?rev=194002&r1=194001&r2=194002&view=diff
>>>> >
>>>> ==============================================================================
>>>> > --- cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td (original)
>>>> > +++ cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td Mon Nov  4
>>>> 12:59:34 2013
>>>> > @@ -499,6 +499,9 @@ def ext_abstract_pack_declarator_parens
>>>> >  def err_function_is_not_record : Error<
>>>> >    "unexpected '%select{.|->}0' in function call; perhaps remove the "
>>>> >    "'%select{.|->}0'?">;
>>>> > +def err_mistyped_arrow_in_member_access : Error<
>>>> > +  "use of undeclared identifier %0; did you mean '->' instead of "
>>>> > +  "'%select{-|>}1'?">;
>>>> >
>>>> >  // C++ derived classes
>>>> >  def err_dup_virtual : Error<"duplicate 'virtual' in base specifier">;
>>>> >
>>>> > Modified: cfe/trunk/lib/Parse/ParseExpr.cpp
>>>> > URL:
>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseExpr.cpp?rev=194002&r1=194001&r2=194002&view=diff
>>>> >
>>>> ==============================================================================
>>>> > --- cfe/trunk/lib/Parse/ParseExpr.cpp (original)
>>>> > +++ cfe/trunk/lib/Parse/ParseExpr.cpp Mon Nov  4 12:59:34 2013
>>>> > @@ -166,6 +166,46 @@ ExprResult Parser::ParseAssignmentExpres
>>>> >    ExprResult LHS = ParseCastExpression(/*isUnaryExpression=*/false,
>>>> >                                         /*isAddressOfOperand=*/false,
>>>> >                                         isTypeCast);
>>>> > +
>>>> > +  // Check for a possible typo of "-" or ">" instead of "->" after a
>>>> > +  // pointer to a struct or class, while recovery is still possible.
>>>> > +  if (LHS.isUsable() && (Tok.is(tok::minus) ||
>>>> Tok.is(tok::greater))) {
>>>> > +    QualType LHSType = LHS.get()->getType();
>>>> > +    const RecordType *Pointee =
>>>> > +        LHSType->isPointerType()
>>>> > +            ? LHSType->getPointeeType()->getAsStructureType()
>>>> > +            : 0;
>>>> > +    const RecordDecl *RD = Pointee ? Pointee->getDecl() : 0;
>>>> > +    const Token &NextTok = NextToken();
>>>> > +    if (RD && NextTok.is(tok::identifier)) {
>>>> > +      UnqualifiedId Name;
>>>> > +      CXXScopeSpec ScopeSpec;
>>>> > +      SourceLocation TemplateKWLoc;
>>>> > +      NoTypoCorrectionCCC NoTCValidator;
>>>> > +      Name.setIdentifier(NextTok.getIdentifierInfo(),
>>>> NextTok.getLocation());
>>>> > +      Sema::SFINAETrap Trap(Actions);
>>>> > +      ExprResult Res =
>>>> > +          Actions.ActOnIdExpression(getCurScope(), ScopeSpec,
>>>> TemplateKWLoc,
>>>> > +                                    Name, false, false,
>>>> &NoTCValidator);
>>>> > +      if (Res.isInvalid()) {
>>>>
>>>> What happens if the next token is ( or :: or similar, and this
>>>> identifier is not an id-expression by itself? For instance, p-f(x) might
>>>> find f by adl.
>>>>
>>> Assuming (1) "f" is both the name of a member of "p" and something not
>>> found through a simple lookup but through adl or similar, and (2) the
>>> pointer arithmetic/comparison on an object pointer is intended, then yes
>>> this will probably do the wrong thing. And while it's not great, working
>>> around that should be pretty trivial (e.g. adding parentheses around the
>>> rhs of the "-" or ">").
>>>
>>
>> OK, so this means we reject valid code; that's not OK -- please fix or
>> revert. Here's one example of valid code this patch rejects:
>>
>> struct S { int f(...); };
>> namespace X { struct T {}; int f(T); }
>> S *g(S *p) { return p - f(X::T()); }
>>
>> This also produces a diagnostic with garbage in it:
>>
>>  error: use of undeclared identifier '������'; did you mean '->' instead
>> of '-'?
>>
>
> Reverted the commit until a way to handle cases like these can be figured
> out. You wouldn't happen to have any suggestions for how to detect cases
> like the one you gave, do you? The help would be much appreciated.
>

The easy thing to do would be to form a blacklist of problematic next
tokens. This would need to include at least '(' and '::' -- though
blacklisting '(' seems to remove a lot of the value from this diagnostic,
sadly.

I wonder if you can move this into ParseRHSOfBinaryExpression and wrap some
kind of diagnostic trap/buffer around the ParseAssignmentExpression call
there. Then, if that fails, try to interpret the - or > as a -> instead.
You'd need to make sure that the diagnostic trap is cheap enough to not
cause problems in such a hot code path, though.

Maybe you could introduce a mechanism to register a callback to be invoked
if an error is produced; from that mechanism, you could check whether
interpreting the expression as a member expression would work. If it would,
produce your new diagnostic from there and suppress all further diagnostics
until you get back to ParseRHSOfBinaryExpr.


>
>
>>
>>  Also, in the case where this succeeds, we should annotate the produced
>>>> expression onto the token stream to avoid reparsing it later.
>>>>
>>> I'm not sure what you're saying here... care to explain?
>>>
>>
>> If you're going to the effort to produce an Expr from the identifier,
>> please replace the tokens you used with a tok::annot_primary_expr token so
>> that we won't need to redo the lookup and recreate the Expr when we
>> actually parse the RHS.
>>
>
>  From my understanding, when the correction is performed, the RHS parsed
> by ParseRHSOfBinaryExpression is for what comes after the member expr (so
> after "p->f" which "p-f" was corrected to)--and the parsing has no more
> overlap than if the arrow hadn't been mistyped. Or are you referring to the
> RHS of the arrow and the call to ParsePostfixExpressionSuffix, in which
> case the lookup of the member expr has to be done anyway since the
> identifier lookup had failed?
>

The case I'm referring to is when the correction is *not* performed, and
we've already correctly parsed the id-expression that follows. In that
case, the current approach would throw away its Expr*, and then immediately
re-parse the exact same subexpression. But if the approach I suggested
above works, this is moot because there's no reparsing involved.


>
>>
>>>  > +        Token OpTok = Tok;
>>>> > +        Tok.setKind(tok::arrow);
>>>> > +        PP.EnableBacktrackAtThisPos();
>>>> > +        Res = ParsePostfixExpressionSuffix(LHS);
>>>> > +        if (Res.isUsable()) {
>>>> > +          LHS = Res;
>>>> > +          PP.CommitBacktrackedTokens();
>>>> > +          Diag(OpTok, diag::err_mistyped_arrow_in_member_access)
>>>> > +              << NextTok.getIdentifierInfo() <<
>>>> OpTok.is(tok::greater)
>>>> > +              << FixItHint::CreateReplacement(OpTok.getLocation(),
>>>> "->");
>>>> > +        } else {
>>>> > +          Tok = OpTok;
>>>> > +          PP.Backtrack();
>>>> > +        }
>>>> > +      }
>>>> > +    }
>>>> > +  }
>>>> > +
>>>> >    return ParseRHSOfBinaryExpression(LHS, prec::Assignment);
>>>> >  }
>>>> >
>>>> >
>>>> > Modified: cfe/trunk/test/SemaCXX/member-expr.cpp
>>>> > URL:
>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCXX/member-expr.cpp?rev=194002&r1=194001&r2=194002&view=diff
>>>> >
>>>> ==============================================================================
>>>> > --- cfe/trunk/test/SemaCXX/member-expr.cpp (original)
>>>> > +++ cfe/trunk/test/SemaCXX/member-expr.cpp Mon Nov  4 12:59:34 2013
>>>> > @@ -224,3 +224,16 @@ namespace pr16676 {
>>>> >          .i;  // expected-error {{member reference type 'pr16676::S
>>>> *' is a pointer; maybe you meant to use '->'}}
>>>> >    }
>>>> >  }
>>>> > +
>>>> > +namespace PR9054 {
>>>> > +struct Foo {
>>>> > +  void bar(int);
>>>> > +  int fiz;
>>>> > +};
>>>> > +
>>>> > +int test(struct Foo *foo) {
>>>> > +  foo-bar(5);  // expected-error {{use of undeclared identifier
>>>> 'bar'; did you mean '->' instead of '-'?}}
>>>> > +  foo>baz(4);  // expected-error-re {{use of undeclared identifier
>>>> 'baz'$}}
>>>> > +  return foo>fiz;  // expected-error {{use of undeclared identifier
>>>> 'fiz'; did you mean '->' instead of '>'?}}
>>>> > +}
>>>> > +}
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > cfe-commits mailing list
>>>> > cfe-commits at cs.uiuc.edu
>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131105/df461737/attachment.html>


More information about the cfe-commits mailing list