Taking into account Richard Smith's comments, I have made changes to the patch:<div>1) Check for whitespace and newlines between '<:' and ':'</div><div>2) Add test case for #1</div><div>3) Update SourceLocation for '::'<br>

<br><div class="gmail_quote">On Mon, Apr 4, 2011 at 5:19 PM, Richard Trieu <span dir="ltr"><<a href="mailto:rtrieu@google.com">rtrieu@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Sorry for the mess up, but I accidentally did a reply instead of reply-all.  This includes my response to Richard Smith, his reply to me, and my reply to his reply.<br><br><div class="gmail_quote"><div class="im">On Mon, Apr 4, 2011 at 4:39 PM, Richard Smith <span dir="ltr"><<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Mon, April 4, 2011 22:57, Richard Trieu wrote:<br>

> On Mon, Apr 4, 2011 at 1:19 PM, Richard Smith <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>><br>

> wrote:<br>

>> On Wed, 30 Mar 2011, Richard Trieu <<a href="mailto:rtrieu@google.com" target="_blank">rtrieu@google.com</a>> wrote:<br>

>>> Detect when the string "<::" is found in code after a cast or<br>

>>> template name and is interpreted as "[:" because of the digraph "<:".<br>

>>> When found,<br>

>>> give an error with a fix-it to add whitespace between the "<" and<br>

>>> "::".<br>

>>> Also, repair the token stream and recover parsing afterwards.<br>

>><br>

>> I don't think that this is the right fix. In C++0x, '<::' (not followed<br>

>> by ':' or '>') is lexed as '<' followed by '::' (see [lex.pptoken]p3<br>

>> bullet 2, or<br>

>> <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1104" target="_blank">http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1104</a>).<br>

>> Implementing that rule (for both C++0x and earlier) seems to me like a<br>

>> better fit, even though it breaks legal-but-pathological C++98 programs<br>

>> like this:<br>

>><br>

>> int a[42]; #define A(n) a<::##:n:><br>

>> int b = A(b);<br>

><br>

</div><div>> I don't think it's a good idea to break valid code.  Also, your link<br>

> shows that fix in the lexer is not yet in the standard.<br>

<br>

</div>Have a look at [lex.pptoken]p3 in N3242, the latest public draft of the<br>

C++0x standard. It's there already :)<br></blockquote></div><div>Ah, okay.  I see it now.  I thought that you had linked to a discussion of possible features.</div><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div><br>

> Even if it were, it<br>

> should only be applied to C++0x with the parser handling this error in<br>

> other versions.<br>

<br>

</div>I don't have a particularly strong opinion on this. Personally, I'm much<br>

more interested in recovering nicely for code which accidentally contains<br>

'<::' and means '< ::' than I am in preserving the semantics of<br>

theoretical code where '<::' is supposed to be a digraph, but I'm not sure<br>

what's the best fit for clang's goals (my preference would be that this is<br>

an ExtWarn with a fixit outside of C++0x).<br>

<br>

If you still want to implement it this way, I have some comments on the<br>

patch itself:<br>

<br>

It would be nice to also cover the comparison case: '::a <::b'.</blockquote></div><div>Maybe later.  This patch was limited to prevent unforeseen impacts to other code. </div><div><div></div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Index: lib/Parse/ParseExprCXX.cpp<br>

===================================================================<br>

--- lib/Parse/ParseExprCXX.cpp  (revision 128572)<br>

+++ lib/Parse/ParseExprCXX.cpp  (working copy)<br>

@@ -20,6 +20,45 @@<br>

<br>

 using namespace clang;<br>

<br>

+static int SelectDigraphErrorMessage(tok::TokenKind Kind) {<br>

+  switch (Kind) {<br>

+    case tok::kw_template:         return 0;<br>

+    case tok::kw_const_cast:       return 1;<br>

+    case tok::kw_dynamic_cast:     return 2;<br>

+    case tok::kw_reinterpret_cast: return 3;<br>

+    case tok::kw_static_cast:      return 4;<br>

+    default:<br>

+      assert(0 && "Unknown type for digraph error message.");<br>

+      return -1;<br>

+  }<br>

+}<br>

+<br>

+// Suggest fixit for "<::" after a cast.<br>

+static void FixDigraph(Parser &P, Preprocessor &PP, Token &DigraphToken,<br>

+                       Token &ColonToken, tok::TokenKind Kind, bool<br>

AtDigraph) {<br>

+  // Pull '<:' and ':' off token stream.<br>

+  if (!AtDigraph)<br>

+    PP.Lex(DigraphToken);<br>

+  PP.Lex(ColonToken);<br>

+<br>

+  SourceRange Range;<br>

+  Range.setBegin(DigraphToken.getLocation());<br>

+  Range.setEnd(ColonToken.getLocation());<br>

+  P.Diag(DigraphToken.getLocation(), diag::err_missing_whitespace_digraph)<br>

+      << SelectDigraphErrorMessage(Kind)<br>

+      << FixItHint::CreateReplacement(Range, "< ::");<br>

+<br>

+  // Reuse tokens to preserve source location and macro instantiation<br>

+  // information.<br>

+  ColonToken.setKind(tok::coloncolon);<br>

<br>

The source location for this token will be off by one character. Later<br>

diagnostics could be confusing. </blockquote></div></div><div>The source location would be off by 1, pointing to the second colon instead of the first.  Would that cause confusion? </div><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

+  DigraphToken.setKind(tok::less);<br>

+<br>

+  // Push new tokens back to token stream.<br>

+  PP.EnterToken(ColonToken);<br>

+  if (!AtDigraph)<br>

+    PP.EnterToken(DigraphToken);<br>

+}<br>

+<br>

 /// \brief Parse global scope or nested-name-specifier if present.<br>

 ///<br>

 /// Parses a C++ global scope specifier ('::') or nested-name-specifier<br>

(which<br>

@@ -287,6 +326,28 @@<br>

       continue;<br>

     }<br>

<br>

+    // Check for '<::' which should be '< ::' instead of '[:' when following<br>

+    // a template name.<br>

+    if (Next.is(tok::l_square) && Next.getLength() == 2) {<br>

+      Token SecondToken = GetLookAheadToken(2);<br>

+      if (SecondToken.is(tok::colon)) {<br>

<br>

This recovery codepath should only be entered if the '[' and ':' are<br>

adjacent in the source file.<br></blockquote></div><div>I'll have a look into it. I'm guessing whitespace would throw off the parser?</div><div><div></div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

+        TemplateTy Template;<br>

+        UnqualifiedId TemplateName;<br>

+        TemplateName.setIdentifier(&II, Tok.getLocation());<br>

+        bool MemberOfUnknownSpecialization;<br>

+        if (Actions.isTemplateName(getCurScope(), SS,<br>

+                                   /*hasTemplateKeyword=*/false,<br>

+                                   TemplateName,<br>

+                                   ObjectType,<br>

+                                   EnteringContext,<br>

+                                   Template,<br>

+                                   MemberOfUnknownSpecialization)) {<br>

+          FixDigraph(*this, PP, Next, SecondToken, tok::kw_template,<br>

+                     /*AtDigraph*/false);<br>

+        }<br>

+      }<br>

+    }<br>

+<br>

     // nested-name-specifier:<br>

     //   type-name '<'<br>

     if (Next.is(tok::less)) {<br>

@@ -453,6 +514,13 @@<br>

   SourceLocation OpLoc = ConsumeToken();<br>

   SourceLocation LAngleBracketLoc = Tok.getLocation();<br>

<br>

+  // Check for "<::" which is parsed as "[:".  If found, fix token stream,<br>

+  // diagnose error, suggest fix, and recover parsing.<br>

+  Token Next = NextToken();<br>

+  if (Tok.is(tok::l_square) && Tok.getLength() == 2 &&<br>

Next.is(tok::colon)) {<br>

<br>

Likewise here.<br>

<br>

+    FixDigraph(*this, PP, Tok, Next, Kind, /*AtDigraph*/true);<br>

+  }<br>

+<br>

   if (ExpectAndConsume(tok::less, diag::err_expected_less_after, CastName))<br>

     return ExprError();<br>

<br>

<br>

Incidentally, did you intentionally remove cfe-commits from CC?<br></blockquote><div><br></div></div></div><div>No, I just clicked Reply instead of Reply-All.  They're like right next to each other.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<font color="#888888"><br>

Richard<br>

<br>

</font></blockquote></div>Other Richard

</blockquote></div><br></div>