[cfe-dev] [PATCH] C++ nested-name-specifier (Parser)

Sat Aug 9 18:29:02 PDT 2008

Chris Lattner wrote:
>
> However, I think I am starting to understand what you mean.  
> isDeclarationSpecifier() (for example) contains this code right now:
>
>   switch (Tok.getKind()) {
>   ...
>     // typedef-name
>   case tok::identifier:
>     return Actions.isTypeName(*Tok.getIdentifierInfo(), CurScope) != 0;
>
> This means that following my approach would take us back down the 
> route of having a "parse expression with leading identifier expression 
> already eaten" method, and things like that.  This is because you'd 
> have to do something like:
>
>
> case tok::identifier:
>   // eats the current identifier and related type stuff as a whole,
>   D = ConsumeAndResolveIdentifier();
>   if (Actions.isTypeName(D))
>     ... it's a decl spec ...
>   else
>     .. it's an 'identifier expression' ..
>
> Is that the problem?

Yes, exactly! I want to avoid the "leading stuff.." route.

>   It also means that isDeclarationSpecifier would need backtracking 
> and significant other stuff to work as a predicate.  In practice, this 
> means we'd want to refactor all callers to not use it like it does.

Here's another way to make it work, without any backtracking:
Either the Parser or Sema keeps a scope-spec state (just as the Parser 
keeps a 'current scope' state and Sema keeps a 'current DeclContext' 
state), which indicates the Decl that names need to be looked up in.
As you suggested, the Parser calls actions while parsing scope 
specifiers.
For "A::T<int>::":
- The Parser parses 'A::' and calls ActOnNestedNameSpecifier for 'A' 
  (updates the scope-spec).
- The Parser parses, instantiates, etc. 'T<int>' through action calls 
  (updates the scope-spec).

Sema actions like isTypeName and ActOnIdentifier check the scope-spec 
to see whether they need to look up a name inside the scope-spec decl or 
do a normal lookup.
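
To make the idea concrete, here is a minimal sketch of the variant where 
Sema keeps the scope-spec. Everything in it (ToyDecl, ToySema, the 
lower-case action names) is a hypothetical toy model, not Clang's real 
classes or signatures, and "normal lookup" is simplified to a lookup in 
the global scope only.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical toy declaration: a name, a "is this a type?" flag,
// and nested members. Not a Clang class.
struct ToyDecl {
  std::string Name;
  bool IsType = false;
  std::map<std::string, std::unique_ptr<ToyDecl>> Members;

  // Add a nested declaration and return a non-owning pointer to it.
  ToyDecl *add(const std::string &N, bool Type) {
    std::unique_ptr<ToyDecl> D(new ToyDecl());
    D->Name = N;
    D->IsType = Type;
    ToyDecl *Raw = D.get();
    Members[N] = std::move(D);
    return Raw;
  }

  // Look up a direct member; returns null if there is none.
  ToyDecl *find(const std::string &N) const {
    auto It = Members.find(N);
    return It == Members.end() ? nullptr : It->second.get();
  }
};

// Hypothetical toy Sema that owns the scope-spec state.
class ToySema {
  ToyDecl Global;               // stands in for the translation unit
  ToyDecl *ScopeSpec = nullptr; // null means "do a normal lookup"

public:
  ToyDecl &global() { return Global; }

  // Called by the parser after it has consumed "Name ::".
  // Updates the scope-spec; clears it and returns false on failure.
  bool actOnNestedNameSpecifier(const std::string &Name) {
    ToyDecl *Ctx = ScopeSpec ? ScopeSpec : &Global;
    ToyDecl *D = Ctx->find(Name);
    ScopeSpec = D;
    return D != nullptr;
  }

  // Called once the qualified name is finished (or abandoned).
  void clearScopeSpec() { ScopeSpec = nullptr; }

  // Looks inside the scope-spec decl if one is set, else in Global.
  bool isTypeName(const std::string &Name) const {
    const ToyDecl *Ctx = ScopeSpec ? ScopeSpec : &Global;
    const ToyDecl *D = Ctx->find(Name);
    return D && D->IsType;
  }
};
```

With a namespace A that contains a type T, isTypeName("T") is false 
until actOnNestedNameSpecifier("A") has run, true afterwards, and false 
again after clearScopeSpec().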

Now, the question is whether Sema should keep the scope-spec state, or 
whether the Parser should keep it and pass it along to the actions 
(isTypeName, etc.).
When I say passing the scope-spec to actions, I don't mean something 
like the CXXScopeSpec in my patch, which contained only parsed tokens; I 
mean passing the decl that represents the scope in which names should be 
looked up.

If Sema keeps the scope-spec state, it is cleaner, but how will the 
scope-spec state be cleared in case of an error like "A:: ;"?
Should the parser call an ActOnErrorAfterNestedName or something?
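
For comparison, here is a rough sketch of the other option, where the 
Parser owns the scope-spec and passes the resolved scope to each action. 
All names (ToyScope, parseQualifiedTypeName, the free-function actions) 
are hypothetical and only illustrate the shape of the calls; they are 
not Clang's API. Tokens are plain strings to keep the example small.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy scope: the type names it declares plus nested scopes.
struct ToyScope {
  std::set<std::string> TypeNames;
  std::map<std::string, const ToyScope *> Nested; // non-owning
};

// Stateless action: the caller says where to look.
// A null ScopeSpec means "normal lookup" (here: just the current scope).
bool isTypeName(const std::string &Name, const ToyScope &Current,
                const ToyScope *ScopeSpec) {
  const ToyScope &Ctx = ScopeSpec ? *ScopeSpec : Current;
  return Ctx.TypeNames.count(Name) != 0;
}

// Resolves one "Name::" component; returns null if Name is not a scope.
const ToyScope *actOnNestedNameSpecifier(const std::string &Name,
                                         const ToyScope &Current,
                                         const ToyScope *ScopeSpec) {
  const ToyScope &Ctx = ScopeSpec ? *ScopeSpec : Current;
  auto It = Ctx.Nested.find(Name);
  return It == Ctx.Nested.end() ? nullptr : It->second;
}

// Toy parser routine. The scope-spec is a local variable, so an early
// return on error leaves no state behind that would need clearing.
bool parseQualifiedTypeName(const std::vector<std::string> &Toks,
                            const ToyScope &Global) {
  const ToyScope *SS = nullptr;
  std::size_t I = 0;
  while (I + 1 < Toks.size() && Toks[I + 1] == "::") {
    SS = actOnNestedNameSpecifier(Toks[I], Global, SS);
    if (!SS)
      return false; // unknown scope name
    I += 2;
  }
  if (I >= Toks.size())
    return false; // "A::" with nothing after it
  return isTypeName(Toks[I], Global, SS);
}
```

In this sketch the input "A:: ;" simply makes parseQualifiedTypeName 
return false, and the local scope-spec disappears with the stack frame. 
The cost is that every action that does lookup needs the extra argument.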

>
>>> Is this approach achievable?  I'd prefer to avoid parsing and 
>>> rewinding unless absolutely necessary.
>>
>> The advantage of the rewinding part is that there's no need for the 
>> Parser or Sema to keep and look after a 'C++ scoping' state (a state 
>> that is not just local to functions, but part of the Parser/Sema class).
>
> I'm not sure I believe that.  How do you propose to handle more 
> complex things (like the T<int>::x case) in the future?  It seems that 
> we're going to need to invoke sema to do template instantiation and 
> other stuff at some point: passing a pre-parsed thing like this to 
> sema as one big action call seems difficult.

Yes, you are right that for templates, passing them as parsed tokens to 
one action is not a good idea.

>>
>> This also means that we can drop the rewinding part if the Parser or 
>> Sema keep and look after a 'C++ scoping' state.
>> If the Parser keeps the state, it will build and pass 'CXXScopeSpec' 
>> objects to action methods, (like in my patch).
>> If the Sema keeps the state, it will make error recovery awkward.
>
> I actually think that things are more awkward with having Sema work 
> this out.  Specifically, if you have: "A :: B :: C" you might have one 
> reference or it might be "A::B ::C" which can occur in a few places in 
> the grammar:
> http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elsa/doc/coloncolon.txt 
>
>
> If the parser just passed all of A/B/C to Sema, I'm not sure how it 
> would handle this.  It seems really easy if you invoke actions for 
> "A::" then "B" (which returns a type).  When the parser got a type, it 
> would naturally ratchet on and parse ::C as a qualified name.

But the type "B" could contain "C" as a nested type, in which case you 
may want "A::B::C".
Anyway, I don't think we need to worry about whether it should be 
"A::B   ::C" or "A  ::B::C" or whatever.
GCC always treats it as "A::B::C":

struct S {};
namespace A { S f(); }
S ::A::f();  // error: 'S::A' has not been declared

I think this is correct since the spec says nothing about "'::' 
associativity".

-Argiris