r224505 - Parse: Don't parse after the eof has been consumed

Richard Smith richard at metafoo.co.uk
Thu Dec 18 18:26:39 PST 2014


On Thu, Dec 18, 2014 at 6:15 PM, David Majnemer <david.majnemer at gmail.com>
wrote:
>
> On Thu, Dec 18, 2014 at 5:53 PM, Richard Smith <richard at metafoo.co.uk>
> wrote:
>>
>> On Thu, Dec 18, 2014 at 1:57 AM, David Majnemer <david.majnemer at gmail.com
>> > wrote:
>>>
>>> Author: majnemer
>>> Date: Thu Dec 18 03:57:31 2014
>>> New Revision: 224505
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=224505&view=rev
>>> Log:
>>> Parse: Don't parse after the eof has been consumed
>>>
>>> ParseCXXNonStaticMemberInitializer stashes away all the tokens for the
>>> initializer and an additional EOF token to denote where the initializer
>>> ends.  However, it is possible for ParseLexedMemberInitializer to get
>>> its hands on the "real" EOF token; since the two tokens are
>>> indistinguishable, we end up consuming the EOF and descend into madness.
>>>
>>> Instead, make it possible to tell which EOF token we are looking at.
>>>
>>> This fixes PR21872.
>>>
>>> Added:
>>>     cfe/trunk/test/Parser/PR21872.cpp
>>> Modified:
>>>     cfe/trunk/include/clang/Lex/Token.h
>>>     cfe/trunk/lib/Parse/ParseCXXInlineMethods.cpp
>>>
>>> Modified: cfe/trunk/include/clang/Lex/Token.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Lex/Token.h?rev=224505&r1=224504&r2=224505&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/include/clang/Lex/Token.h (original)
>>> +++ cfe/trunk/include/clang/Lex/Token.h Thu Dec 18 03:57:31 2014
>>> @@ -23,6 +23,7 @@
>>>
>>>  namespace clang {
>>>
>>> +class Decl;
>>>
>>
>> This is a layering violation; Lex should have no knowledge of AST-level
>> entities like Decls. Instead, just use an opaque void* pointer here and do
>> the casting in the Parser.
>>
>
> Fixed in r224562.
>
>
>>
>>
>>>  class IdentifierInfo;
>>>
>>>  /// Token - This structure provides full information about a lexed
>>> token.
>>> @@ -58,6 +59,8 @@ class Token {
>>>    ///    may be dirty (have trigraphs / escaped newlines).
>>>    ///  Annotations (resolved type names, C++ scopes, etc):
>>> isAnnotation().
>>>    ///    This is a pointer to sema-specific data for the annotation
>>> token.
>>> +  ///  Eof:
>>> +  //     This is a pointer to a Decl.
>>>    ///  Other:
>>>    ///    This is null.
>>>    void *PtrData;
>>> @@ -164,12 +167,23 @@ public:
>>>      assert(!isAnnotation() &&
>>>             "getIdentifierInfo() on an annotation token!");
>>>      if (isLiteral()) return nullptr;
>>> +    if (is(tok::eof)) return nullptr;
>>>
>>
>> Maybe assert on this case rather than adding another branch to this
>> function? (I'd expect it to be relatively hot.)
>>
>
> Asserting that it's not eof gives me over a thousand failures in the test
> suite.  I'll try to dig into that later.
>

Hmm, this is probably being called from places in the lexer where we know
we can't still have a raw_identifier and can't yet have an annotation
token. OK then, hopefully someone will shout if they can measure a
performance regression :) The isLiteral check is already pretty heavy.


>      return (IdentifierInfo*) PtrData;
>>>    }
>>>    void setIdentifierInfo(IdentifierInfo *II) {
>>>      PtrData = (void*) II;
>>>    }
>>>
>>> +  const Decl *getDecl() const {
>>> +    assert(is(tok::eof));
>>> +    return reinterpret_cast<const Decl *>(PtrData);
>>> +  }
>>> +  void setDecl(const Decl *D) {
>>> +    assert(is(tok::eof));
>>> +    assert(!PtrData);
>>> +    PtrData = const_cast<Decl *>(D);
>>> +  }
>>>
>>
>> Maybe setEofData?
>>
>
> Done in r224562.
>
>
>>
>>
>>> +
>>>    /// getRawIdentifier - For a raw identifier token (i.e., an identifier
>>>    /// lexed in raw mode), returns a reference to the text substring in
>>> the
>>>    /// buffer if known.
>>>
>>> Modified: cfe/trunk/lib/Parse/ParseCXXInlineMethods.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseCXXInlineMethods.cpp?rev=224505&r1=224504&r2=224505&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/lib/Parse/ParseCXXInlineMethods.cpp (original)
>>> +++ cfe/trunk/lib/Parse/ParseCXXInlineMethods.cpp Thu Dec 18 03:57:31
>>> 2014
>>> @@ -218,6 +218,7 @@ void Parser::ParseCXXNonStaticMemberInit
>>>    Eof.startToken();
>>>    Eof.setKind(tok::eof);
>>>    Eof.setLocation(Tok.getLocation());
>>> +  Eof.setDecl(VarD);
>>>    Toks.push_back(Eof);
>>>  }
>>>
>>> @@ -622,7 +623,9 @@ void Parser::ParseLexedMemberInitializer
>>>      while (Tok.isNot(tok::eof))
>>>        ConsumeAnyToken();
>>>    }
>>> -  ConsumeAnyToken();
>>> +  // Make sure this is *our* artificial EOF token.
>>> +  if (Tok.getDecl() == MI.Field)
>>> +    ConsumeAnyToken();
>>>  }
>>>
>>>  /// ConsumeAndStoreUntil - Consume and store the token at the passed
>>> token
>>>
>>> Added: cfe/trunk/test/Parser/PR21872.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Parser/PR21872.cpp?rev=224505&view=auto
>>>
>>> ==============================================================================
>>> --- cfe/trunk/test/Parser/PR21872.cpp (added)
>>> +++ cfe/trunk/test/Parser/PR21872.cpp Thu Dec 18 03:57:31 2014
>>> @@ -0,0 +1,4 @@
>>> +// RUN: not %clang_cc1 -fsyntax-only %s
>>> +template <typename T> struct S {
>>> +    int k =
>>> ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
>>> +int f;
>>>
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20141218/3720cc39/attachment.html>


More information about the cfe-commits mailing list