[cfe-commits] Patch: Add fix-it for digraph errors involving &quot; <::&quot;

Richard Trieu rtrieu at google.com
Mon Apr 4 17:19:37 PDT 2011


Sorry for the mess up, but I accidentally did a reply instead of reply-all.
 This includes my response to Richard Smith, his reply to me, and my reply
to his reply.

On Mon, Apr 4, 2011 at 4:39 PM, Richard Smith <richard at metafoo.co.uk> wrote:

> On Mon, April 4, 2011 22:57, Richard Trieu wrote:
> > On Mon, Apr 4, 2011 at 1:19 PM, Richard Smith <richard at metafoo.co.uk>
> > wrote:
> >> On Wed, 30 Mar 2011, Richard Trieu <rtrieu at google.com> wrote:
> >>> Detect when the string "<::" is found in code after a cast or
> >>> template name and is interpreted as "[:" because of the digraph "<:".
> >>> When found,
> >>> give an error with a fix-it to add whitespace between the "<" and
> >>> "::".
> >>> Also, repair the token stream and recover parsing afterwards.
> >>
> >> I don't think that this is the right fix. In C++0x, '<::' (not followed
> >> by ':' or '>') is lexed as '<' followed by '::' (see [lex.pptoken]p3
> >> bullet 2, or
> >> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1104).
> >> Implementing that rule (for both C++0x and earlier) seems to me like a
> >> better fit, even though it breaks legal-but-pathological C++98 programs
> >> like this:
> >>
> >> int a[42]; #define A(n) a<::##:n:>
> >> int b = A(b);
> >
> > I don't think it's a good idea to break valid code.  Also, your link
> > shows that fix in the lexer is not yet in the standard.
>
> Have a look at [lex.pptoken]p3 in N3242, the latest public draft of the
> C++0x standard. It's there already :)
>
Ah, okay.  I see it now.  I thought that you had linked to a discussion of
possible features.

>
> > Even if it were, it
> > should only be applied to C++0x with the parser handling this error in
> > other versions.
>
> I don't have a particularly strong opinion on this. Personally, I'm much
> more interested in recovering nicely for code which accidentally contains
> '<::' and means '< ::' than I am in preserving the semantics of
> theoretical code where '<::' is supposed to be a digraph, but I'm not sure
> what's the best fit for clang's goals (my preference would be that this is
> an ExtWarn with a fixit outside of C++0x).
>
> If you still want to implement it this way, I have some comments on the
> patch itself:
>
> It would be nice to also cover the comparison case: '::a <::b'.

Maybe later.  This patch was limited to prevent unforeseen impacts to other
code.

>
> Index: lib/Parse/ParseExprCXX.cpp
> ===================================================================
> --- lib/Parse/ParseExprCXX.cpp  (revision 128572)
> +++ lib/Parse/ParseExprCXX.cpp  (working copy)
> @@ -20,6 +20,45 @@
>
>  using namespace clang;
>
> +static int SelectDigraphErrorMessage(tok::TokenKind Kind) {
> +  switch (Kind) {
> +    case tok::kw_template:         return 0;
> +    case tok::kw_const_cast:       return 1;
> +    case tok::kw_dynamic_cast:     return 2;
> +    case tok::kw_reinterpret_cast: return 3;
> +    case tok::kw_static_cast:      return 4;
> +    default:
> +      assert(0 && "Unknown type for digraph error message.");
> +      return -1;
> +  }
> +}
> +
> +// Suggest fixit for "<::" after a cast.
> +static void FixDigraph(Parser &P, Preprocessor &PP, Token &DigraphToken,
> +                       Token &ColonToken, tok::TokenKind Kind, bool
> AtDigraph) {
> +  // Pull '<:' and ':' off token stream.
> +  if (!AtDigraph)
> +    PP.Lex(DigraphToken);
> +  PP.Lex(ColonToken);
> +
> +  SourceRange Range;
> +  Range.setBegin(DigraphToken.getLocation());
> +  Range.setEnd(ColonToken.getLocation());
> +  P.Diag(DigraphToken.getLocation(), diag::err_missing_whitespace_digraph)
> +      << SelectDigraphErrorMessage(Kind)
> +      << FixItHint::CreateReplacement(Range, "< ::");
> +
> +  // Reuse tokens to preserve source location and macro instantiation
> +  // information.
> +  ColonToken.setKind(tok::coloncolon);
>
> The source location for this token will be off by one character. Later
> diagnostics could be confusing.

The source location would be off by 1, pointing to the second colon instead
of the first.  Would that cause confusion?

>
> +  DigraphToken.setKind(tok::less);
> +
> +  // Push new tokens back to token stream.
> +  PP.EnterToken(ColonToken);
> +  if (!AtDigraph)
> +    PP.EnterToken(DigraphToken);
> +}
> +
>  /// \brief Parse global scope or nested-name-specifier if present.
>  ///
>  /// Parses a C++ global scope specifier ('::') or nested-name-specifier
> (which
> @@ -287,6 +326,28 @@
>       continue;
>     }
>
> +    // Check for '<::' which should be '< ::' instead of '[:' when
> following
> +    // a template name.
> +    if (Next.is(tok::l_square) && Next.getLength() == 2) {
> +      Token SecondToken = GetLookAheadToken(2);
> +      if (SecondToken.is(tok::colon)) {
>
> This recovery codepath should only be entered if the '[' and ':' are
> adjacent in the source file.
>
I'll have a look into it. I'm guessing whitespace would throw off the
parser?

>
> +        TemplateTy Template;
> +        UnqualifiedId TemplateName;
> +        TemplateName.setIdentifier(&II, Tok.getLocation());
> +        bool MemberOfUnknownSpecialization;
> +        if (Actions.isTemplateName(getCurScope(), SS,
> +                                   /*hasTemplateKeyword=*/false,
> +                                   TemplateName,
> +                                   ObjectType,
> +                                   EnteringContext,
> +                                   Template,
> +                                   MemberOfUnknownSpecialization)) {
> +          FixDigraph(*this, PP, Next, SecondToken, tok::kw_template,
> +                     /*AtDigraph*/false);
> +        }
> +      }
> +    }
> +
>     // nested-name-specifier:
>     //   type-name '<'
>     if (Next.is(tok::less)) {
> @@ -453,6 +514,13 @@
>   SourceLocation OpLoc = ConsumeToken();
>   SourceLocation LAngleBracketLoc = Tok.getLocation();
>
> +  // Check for "<::" which is parsed as "[:".  If found, fix token stream,
> +  // diagnose error, suggest fix, and recover parsing.
> +  Token Next = NextToken();
> +  if (Tok.is(tok::l_square) && Tok.getLength() == 2 &&
> Next.is(tok::colon)) {
>
> Likewise here.
>
> +    FixDigraph(*this, PP, Tok, Next, Kind, /*AtDigraph*/true);
> +  }
> +
>   if (ExpectAndConsume(tok::less, diag::err_expected_less_after, CastName))
>     return ExprError();
>
>
> Incidentally, did you intentionally remove cfe-commits from CC?
>

No, I just clicked Reply instead of Reply-All.  They're like right next to
each other.

>
> Richard
>
> Other Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110404/5469b4f9/attachment.html>


More information about the cfe-commits mailing list