[cfe-dev] [PATCH] C++ nested-name-specifier (Parser)
Chris Lattner
clattner at apple.com
Sat Aug 9 17:18:39 PDT 2008
On Aug 9, 2008, at 4:49 PM, Argiris Kirtzidis wrote:
> And about the preprocessor changes: Is it ok to commit them, and is
> it more intuitive to change 'Rewind' to 'Backtrack' ?
Yes, this does look like a nice cleanup for the previous mechanism.
Please improve this comment to explain what the members are and what
the invariants are:
+ // Cached tokens state.
+ typedef std::vector<Token> CachedTokensTy;
+ CachedTokensTy CachedTokens;
+ CachedTokensTy::size_type CachedLexPos;
+ CachedTokensTy::size_type CachedRewindPos;
and rename Rewind -> Backtrack and commit it, thanks!
In the future, this will have to be generalized a bit. IIUC, C++ can
require nested backtracking in some cases. This would require a stack
of speculation. Also, we may eventually need warnings that are
emitted while speculating to be cached and discarded if the lookahead
is discarded.
> Chris Lattner wrote:
>>
>>> The attached 'pp-caching' patch, modifies Preprocessor::LookAhead
>>> (making it efficient enough to replace LookNext) and adds
>>> 'rewinding' functionality.
>>> That is, one can call Preprocessor::EnableRewindAtThisPos, lex and
>>> consume any number of tokens, and later call Preprocessor::Rewind
>>> to make the preprocessor re-lex the tokens that were previsouly
>>> lexed since EnableRewindAtThisPos was called.
>>
>> I think we will need this functionality (aka 'tentative parsing' or
>> 'backtracking') at some point, but I'm not sure why we need it for
>> this. As a prolog, I haven't dug into the part of the spec, so if
>> I'm missing something obvious, please be gentle on me :)
>
> This is not a requirement of the spec, it just simplifies the parser
> in a similar way to how introducing NextToken() simplified it.
Ok.
>> My basic understanding (again, based on intuition, not reading of
>> the spec) of scope resolution was that it could actually be
>> implemented without lookahead. My thought was that if you
>> see ::A::B or A::B that you start by looking up the ::A /A part and
>> decide what it references. when looking up A, consider if it
>> resolves to a namespace. My though would be that the parser
>> invokes an action to look up the decl corresponding to A.
>>
>> Options are that it could either be:
>> 1) typedef name
>> 2) variable/ivar/etc
>> 3) namespace
>> 4) struct
>> ...
>>
>> One ::A is looked up, based on what the action returns, my
>> assumption is that the parser would see the next :: and do a
>> subscope lookup. Basically it would parse "::B" and then invoke an
>> action to look up ::B in the decl returned by the ::A lookup. At
>> some point presumably, you end up with a type or a variable etc,
>> which we'd need to treat as the start of a declspec etc. One nice
>> thing about this is that it would not require the parser to build
>> up an explicit 'CXXScopeSpec' object, Sema could build it if it
>> wanted, but it wouldn't be required.
>
> As far as the spec is concerned, you basically want to do something
> like this:
>
> -You have "A::B"
> -Parse "A::", not just 'A'. This is because "A::" indicates nested-
> name-specifier, and you want to do a specialized name lookup for
> nested-names. This specialized lookup will only consider classes/
> namespaces and ignore the rest. For example:
>
> namespace A { int x; }
> void f() {
> int A;
> A::x = 0; // We don't want 'A' lookup to return 'int A' here, we
> want the namespace A.
> }
>
> -Now parse "B" and lookup 'B' in the scope of 'A'.
Ok. Conceptually, I don't see a problem with this. After consuming
the identifier, if it is followed by a colon, invoke a different
action to do the lookup.
However, I think I am starting to understand what you mean.
isDeclarationSpecifier() (for example) contains this code right now:
switch (Tok.getKind()) {
...
// typedef-name
case tok::identifier:
return Actions.isTypeName(*Tok.getIdentifierInfo(), CurScope) != 0;
This means that following my approach would take us back down the
route of having a "parse expression with leading identifier expression
already eaten" method, and things like that. This is because you'd
have to do something like:
case tok::identifier:
// eats the current identifier and related type stuff as a whole,
D = ConsumeAndResolveIdentifier();
if (Actions.isTypeName(D))
... it's a decl spec ...
else
.. it's an 'identifier expression' ..
Is that the problem? It also means that isDeclarationSpecifier would
need backtracking and significant other stuff to work as a predicate.
In practice, this means we'd want to refactor all callers to not use
it like it does.
>> Is this approach achievable? I'd prefer to avoid parsing and
>> rewinding unless absolutely necessary.
>
> The advantage of the rewinding part is that there's no need for the
> Parser or Sema to keep and look after a 'C++ scoping' state (a state
> that is not just local to functions, but part of the Parser/Sema
> class).
I'm not sure I believe that. How do you propose to handle more
complex things (like the T<int>::x case) in the future? It seems that
we're going to need to invoke sema to do template instantiation and
other stuff at some point: passing a pre-parsed thing like this to
sema as one big action call seems difficult.
>
> This also means that we can drop the rewinding part if the Parser or
> Sema keep and look after a 'C++ scoping' state.
> If the Parser keeps the state, it will build and pass 'CXXScopeSpec'
> objects to action methods, (like in my patch).
> If the Sema keeps the state, it will make error recovery awkward.
I actually think that things are more awkward with having Sema work
this out. Specifically, if you have: "A :: B :: C" you might have one
reference or it might be "A::B ::C" which can occur in a few places in
the grammar:
http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elsa/doc/coloncolon.txt
If the parser just passed all of A/B/C to Sema, I'm not sure how it
would handle this. It seems really easy if you invoke actions for
"A::" then "B" (which returns a type). When the parser got a type, it
would naturally ratchet on and parse ::C as a qualified name.
-Chris
More information about the cfe-dev
mailing list