[cfe-commits] Patch: Add fix-it for digraph errors involving " <::"
Richard Trieu
rtrieu at google.com
Tue Apr 5 14:56:11 PDT 2011
Taking into account Richard Smith's comments, I have made changes to the
patch:
1) Check for whitespace and newlines between '<:' and ':'
2) Add test case for #1
3) Update SourceLocation for '::'
On Mon, Apr 4, 2011 at 5:19 PM, Richard Trieu <rtrieu at google.com> wrote:
> Sorry for the mess up, but I accidentally did a reply instead of reply-all.
> This includes my response to Richard Smith, his reply to me, and my reply
> to his reply.
>
> On Mon, Apr 4, 2011 at 4:39 PM, Richard Smith <richard at metafoo.co.uk>wrote:
>
>> On Mon, April 4, 2011 22:57, Richard Trieu wrote:
>> > On Mon, Apr 4, 2011 at 1:19 PM, Richard Smith <richard at metafoo.co.uk>
>> > wrote:
>> >> On Wed, 30 Mar 2011, Richard Trieu <rtrieu at google.com> wrote:
>> >>> Detect when the string "<::" is found in code after a cast or
>> >>> template name and is interpreted as "[:" because of the digraph "<:".
>> >>> When found,
>> >>> give an error with a fix-it to add whitespace between the "<" and
>> >>> "::".
>> >>> Also, repair the token stream and recover parsing afterwards.
>> >>
>> >> I don't think that this is the right fix. In C++0x, '<::' (not followed
>> >> by ':' or '>') is lexed as '<' followed by '::' (see [lex.pptoken]p3
>> >> bullet 2, or
>> >> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1104).
>> >> Implementing that rule (for both C++0x and earlier) seems to me like a
>> >> better fit, even though it breaks legal-but-pathological C++98 programs
>> >> like this:
>> >>
>> >> int a[42]; #define A(n) a<::##:n:>
>> >> int b = A(b);
>> >
>> > I don't think it's a good idea to break valid code. Also, your link
>> > shows that fix in the lexer is not yet in the standard.
>>
>> Have a look at [lex.pptoken]p3 in N3242, the latest public draft of the
>> C++0x standard. It's there already :)
>>
> Ah, okay. I see it now. I thought that you had linked to a discussion of
> possible features.
>
>>
>> > Even if it were, it
>> > should only be applied to C++0x with the parser handling this error in
>> > other versions.
>>
>> I don't have a particularly strong opinion on this. Personally, I'm much
>> more interested in recovering nicely for code which accidentally contains
>> '<::' and means '< ::' than I am in preserving the semantics of
>> theoretical code where '<::' is supposed to be a digraph, but I'm not sure
>> what's the best fit for clang's goals (my preference would be that this is
>> an ExtWarn with a fixit outside of C++0x).
>>
>> If you still want to implement it this way, I have some comments on the
>> patch itself:
>>
>> It would be nice to also cover the comparison case: '::a <::b'.
>
> Maybe later. This patch was limited to prevent unforeseen impacts to other
> code.
>
>>
>> Index: lib/Parse/ParseExprCXX.cpp
>> ===================================================================
>> --- lib/Parse/ParseExprCXX.cpp (revision 128572)
>> +++ lib/Parse/ParseExprCXX.cpp (working copy)
>> @@ -20,6 +20,45 @@
>>
>> using namespace clang;
>>
>> +static int SelectDigraphErrorMessage(tok::TokenKind Kind) {
>> + switch (Kind) {
>> + case tok::kw_template: return 0;
>> + case tok::kw_const_cast: return 1;
>> + case tok::kw_dynamic_cast: return 2;
>> + case tok::kw_reinterpret_cast: return 3;
>> + case tok::kw_static_cast: return 4;
>> + default:
>> + assert(0 && "Unknown type for digraph error message.");
>> + return -1;
>> + }
>> +}
>> +
>> +// Suggest fixit for "<::" after a cast.
>> +static void FixDigraph(Parser &P, Preprocessor &PP, Token &DigraphToken,
>> + Token &ColonToken, tok::TokenKind Kind, bool
>> AtDigraph) {
>> + // Pull '<:' and ':' off token stream.
>> + if (!AtDigraph)
>> + PP.Lex(DigraphToken);
>> + PP.Lex(ColonToken);
>> +
>> + SourceRange Range;
>> + Range.setBegin(DigraphToken.getLocation());
>> + Range.setEnd(ColonToken.getLocation());
>> + P.Diag(DigraphToken.getLocation(),
>> diag::err_missing_whitespace_digraph)
>> + << SelectDigraphErrorMessage(Kind)
>> + << FixItHint::CreateReplacement(Range, "< ::");
>> +
>> + // Reuse tokens to preserve source location and macro instantiation
>> + // information.
>> + ColonToken.setKind(tok::coloncolon);
>>
>> The source location for this token will be off by one character. Later
>> diagnostics could be confusing.
>
> The source location would be off by 1, pointing to the second colon instead
> of the first. Would that cause confusion?
>
>>
>> + DigraphToken.setKind(tok::less);
>> +
>> + // Push new tokens back to token stream.
>> + PP.EnterToken(ColonToken);
>> + if (!AtDigraph)
>> + PP.EnterToken(DigraphToken);
>> +}
>> +
>> /// \brief Parse global scope or nested-name-specifier if present.
>> ///
>> /// Parses a C++ global scope specifier ('::') or nested-name-specifier
>> (which
>> @@ -287,6 +326,28 @@
>> continue;
>> }
>>
>> + // Check for '<::' which should be '< ::' instead of '[:' when
>> following
>> + // a template name.
>> + if (Next.is(tok::l_square) && Next.getLength() == 2) {
>> + Token SecondToken = GetLookAheadToken(2);
>> + if (SecondToken.is(tok::colon)) {
>>
>> This recovery codepath should only be entered if the '[' and ':' are
>> adjacent in the source file.
>>
> I'll have a look into it. I'm guessing whitespace would throw off the
> parser?
>
>>
>> + TemplateTy Template;
>> + UnqualifiedId TemplateName;
>> + TemplateName.setIdentifier(&II, Tok.getLocation());
>> + bool MemberOfUnknownSpecialization;
>> + if (Actions.isTemplateName(getCurScope(), SS,
>> + /*hasTemplateKeyword=*/false,
>> + TemplateName,
>> + ObjectType,
>> + EnteringContext,
>> + Template,
>> + MemberOfUnknownSpecialization)) {
>> + FixDigraph(*this, PP, Next, SecondToken, tok::kw_template,
>> + /*AtDigraph*/false);
>> + }
>> + }
>> + }
>> +
>> // nested-name-specifier:
>> // type-name '<'
>> if (Next.is(tok::less)) {
>> @@ -453,6 +514,13 @@
>> SourceLocation OpLoc = ConsumeToken();
>> SourceLocation LAngleBracketLoc = Tok.getLocation();
>>
>> + // Check for "<::" which is parsed as "[:". If found, fix token
>> stream,
>> + // diagnose error, suggest fix, and recover parsing.
>> + Token Next = NextToken();
>> + if (Tok.is(tok::l_square) && Tok.getLength() == 2 &&
>> Next.is(tok::colon)) {
>>
>> Likewise here.
>>
>> + FixDigraph(*this, PP, Tok, Next, Kind, /*AtDigraph*/true);
>> + }
>> +
>> if (ExpectAndConsume(tok::less, diag::err_expected_less_after,
>> CastName))
>> return ExprError();
>>
>>
>> Incidentally, did you intentionally remove cfe-commits from CC?
>>
>
> No, I just clicked Reply instead of Reply-All. They're like right next to
> each other.
>
>>
>> Richard
>>
>> Other Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110405/3949fd37/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: digraph-error2.patch
Type: text/x-patch
Size: 6944 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110405/3949fd37/attachment.bin>
More information about the cfe-commits
mailing list