[cfe-dev] Hello

Argiris Kirtzidis akyrtzi at gmail.com
Sun Oct 5 06:43:51 PDT 2008

Hi Sebastian,

Sebastian Redl wrote:
> I had some thoughts of parsing the entire thing ahead of time and then 
> having the sema resolve all issues at once, but there is one test case 
> that really makes this impractical:
> typedef int foo;
> namespace abc { foo bar(); }
> foo::abc::bar()
> {
>  // ...
> }
> Thus, there is one statement of yours in the discussion that is wrong:
>> Anyway, I don't think we need to worry about whether it should be 
>> "A::B   ::C" or  "A  ::B::C" or whatever.
> We do have to worry. GCC parses the above example correctly as foo 
> ::abc::bar. It doesn't stumble if foo is in its own namespace or even 
> template class either. (In my opinion, that's really a defect in the 
> standard. Nor do I think that there is any program out there that 
> really relies on this.)

Interesting find.

> Consider also:
>> I think this is correct since the spec says nothing about "'::' 
>> associativity".
> :: is part of productions further down the tree than anything else, so 
> it binds more strongly than anything else. However, it only binds to 
> left-hand identifiers that actually name a namespace or class. 
> Otherwise, it's the global scope.

Hmm, the standard says at 3.4.3p1: "During the lookup for a name 
preceding the '::' scope resolution operator, object, function, and 
enumerator names are ignored. If the name found is not a class-name or 
namespace-name, the program is ill-formed".
It seems to me that '::' always binds to the identifier on its left, and 
if that identifier does not name a namespace or class, we can treat it as 
an error. I can't find anything about falling back to the global scope 
when the identifier exists but is not a class or namespace.
For comparison, both MSVC and Comeau report something like "error: name 
followed by '::' must be a class or namespace".
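
To make my reading of 3.4.3p1 concrete, here is a minimal sketch of the 
rule as a toy lookup function. Everything in it (DeclKind, PrefixLookup, 
SymbolTable, lookupScopePrefix) is a hypothetical name invented for 
illustration; it is not Clang code and it only models the two quoted 
sentences, not the full lookup rules.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical declaration kinds for a toy symbol table.
// NonClassTypedef stands for something like 'typedef int foo;'.
enum class DeclKind {
  Namespace, Class, NonClassTypedef, Object, Function, Enumerator
};

// Hypothetical outcomes of looking up the name written before '::'.
enum class PrefixLookup { Scope, IllFormed, NotFound };

// One name may have several declarations (e.g. a class and a function).
using SymbolTable = std::multimap<std::string, DeclKind>;

// Models the quoted wording: object, function and enumerator names are
// ignored; if what remains is not a class or namespace, the program is
// ill-formed. There is no fallback to the global scope.
PrefixLookup lookupScopePrefix(const SymbolTable &table,
                               const std::string &name) {
  assert(!name.empty() && "the name before '::' must not be empty");
  bool foundNonScope = false;
  auto range = table.equal_range(name);
  for (auto it = range.first; it != range.second; ++it) {
    switch (it->second) {
    case DeclKind::Object:
    case DeclKind::Function:
    case DeclKind::Enumerator:
      continue; // skipped during this lookup (continues the for loop)
    case DeclKind::Namespace:
    case DeclKind::Class:
      return PrefixLookup::Scope;
    case DeclKind::NonClassTypedef:
      foundNonScope = true;
      break;
    }
  }
  return foundNonScope ? PrefixLookup::IllFormed : PrefixLookup::NotFound;
}
```

Under this reading, with 'typedef int foo;' in the table, looking up 
'foo' as the name before '::' gives IllFormed, which matches what MSVC 
and Comeau report.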

> It probably could still be done, but it would be very complex code - 
> the sema would have to report that there are two identifiers, it would 
> have to report the split position, and the parser would have to adjust 
> its state according to this new revelation. It would shift a job to 
> the sema that really is the parser's problem, and all that for 
> probably no performance gain at all.

I completely agree, I don't think it's worth it.

> Here's a nice pathological case. GCC is so confused by it that it 
> aborts processing. It doesn't even take any instantiations - the 
> definition is enough. (Of course, to actually be a definition of the 
> second template function, there would have to be a typename before the 
> B::A.)
> namespace A { template <typename B> B f(); }
> template <typename B> typename B::A f();
> template <typename B>
> B::A::f()
> {
>  return 0;
> }
> struct s { typedef int A; };
> void foo()
> {
>        f<int>();
>        f<s>();
> }
> It's not surprising that this confuses GCC. This matter is actually 
> unspecified in the 2003 standard, see DR215. DR215 is in fact very 
> important to this whole matter, as is DR125, because the resolutions 
> to these issues render the first example I've given invalid. The 
> resolution to 125 has been voted into the C++0x paper in 2004, the one 
> to 215 in 2007. Under these rules, the class-or-namespace-name 
> production disappears and is replaced by the more sensible 
> type-or-namespace-name, with qualified name lookup amended to fail if 
> the type name doesn't denote a class. (A type template parameter is a 
> type-name.)
> The question is whether we want to follow the broken rules of C++03, 
> as GCC does, or the updated rules of C++0x, which introduces a tiny 
> incompatibility to GCC.

As I already mentioned, it's not clear that even C++03 allows this.

> I think that we should forbid using anything but Sema as the C++ 
> Action. At least when we get to implementing templates, implementing 
> isTypeName and similar functions is such a burden that it just doesn't 
> make sense to use a different action, and there is no heuristic that 
> can guess at types vs non-types with any reasonable accuracy - at 
> least not without parsing ahead, and an Action can't do that.
> The alternative is moving all type analysis, including that from Sema, 
> to MinimalAction, and deriving Sema from MinimalAction. That would 
> seriously slow down MinimalAction for C, though, I think.
> Parser::isTokenStreamTypeName() - if I understand its purpose 
> correctly, eventually this function will have to distinguish between 
> types, templates, perhaps even concepts, and objects/functions. This 
> is a lot for a function whose name suggests a boolean distinction.

Yeah, Parser::isTokenStreamTypeName() wasn't such a good approach. 
Currently I'm leaning towards "annotation tokens". The way I see them 
working is like this:
-At various points in the parser, when a nested name is encountered, 
Parser::ParseCXXScopeSpec will parse it and leave a "scope spec token" 
in the token stream.
-This "scope spec token" can later be used by passing the "CXXScopeTy*" 
object (which the "scope spec token" will contain) to various Sema actions.

For example:
  namespace foo { unsigned bar(); }
  unsigned foo::bar(); // #1

At #1, Parser::ParseDeclarationSpecifiers will:
-parse 'unsigned'
-parse 'foo::' (using Parser::ParseCXXScopeSpec), leaving a "scope spec 
token" in the token stream
-check whether "foo::bar" is a typename by calling Action.isTypeName, 
passing the 'bar' identifier along with a CXXScopeTy* object (which the 
"scope spec token" can provide)
-finish parsing declaration specifiers, since "foo::bar" is not a type 
(the "scope spec token" is still the current token).

Later, Parser::ParseDeclarator can see that there's a "scope spec token" 
in the token stream and take it into account when considering the 
declarator identifier.
The net result will be that each ParseXXX function will view a token 
stream appropriate to the parsing context, and without having to 
're-parse' stuff (like re-parsing nested names).
The code will be simpler and more maintainable this way.
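
A rough sketch of the idea, to show the flow described above on a toy 
token stream. All names here (ToyScope, Token, ToyAction, 
parseScopeSpec, parseDeclSpecifiers, parseDeclaratorName) are 
hypothetical stand-ins and not the real Parser/Action interfaces; 
ToyScope plays the role of CXXScopeTy, and only a single-level 'name::' 
prefix naming a namespace is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ToyScope { std::string name; }; // stand-in for CXXScopeTy

enum class TokKind { Identifier, ColonColon, Keyword, Punct, AnnotScopeSpec };

struct Token {
  TokKind kind;
  std::string text;
  const ToyScope *scope; // payload, only set for AnnotScopeSpec
  Token(TokKind k, std::string t, const ToyScope *s = nullptr)
      : kind(k), text(std::move(t)), scope(s) {}
};

// Stand-in for the Action (Sema) side: knows namespaces and type names.
struct ToyAction {
  std::vector<ToyScope> namespaces;
  std::vector<std::pair<std::string, std::string>> types; // (scope, type)

  const ToyScope *lookupNamespace(const std::string &name) const {
    for (const ToyScope &ns : namespaces)
      if (ns.name == name) return &ns;
    return nullptr;
  }
  bool isTypeName(const std::string &ident, const ToyScope *scope) const {
    const std::string scopeName = scope ? scope->name : "";
    for (const auto &t : types)
      if (t.first == scopeName && t.second == ident) return true;
    return false;
  }
};

// Replaces 'ident ::' at pos with one annotation token carrying the scope.
bool parseScopeSpec(std::vector<Token> &toks, std::size_t pos,
                    const ToyAction &action) {
  if (pos + 1 >= toks.size()) return false;
  if (toks[pos].kind != TokKind::Identifier ||
      toks[pos + 1].kind != TokKind::ColonColon)
    return false;
  const ToyScope *scope = action.lookupNamespace(toks[pos].text);
  if (!scope) return false;
  toks[pos] = Token(TokKind::AnnotScopeSpec, toks[pos].text + "::", scope);
  toks.erase(toks.begin() + static_cast<std::ptrdiff_t>(pos + 1));
  return true;
}

// Consumes keywords and (possibly qualified) type names; returns the index
// of the first token that is not part of the declaration specifiers.
std::size_t parseDeclSpecifiers(std::vector<Token> &toks, std::size_t pos,
                                const ToyAction &action) {
  while (pos < toks.size()) {
    if (toks[pos].kind == TokKind::Keyword) { ++pos; continue; }
    parseScopeSpec(toks, pos, action); // may turn toks[pos] into an annotation
    const ToyScope *scope = nullptr;
    std::size_t identPos = pos;
    if (toks[pos].kind == TokKind::AnnotScopeSpec) {
      scope = toks[pos].scope;
      identPos = pos + 1;
    }
    if (identPos < toks.size() &&
        toks[identPos].kind == TokKind::Identifier &&
        action.isTypeName(toks[identPos].text, scope)) {
      pos = identPos + 1;
      continue;
    }
    break; // not a type: the annotation token stays the current token
  }
  return pos;
}

// Reads the declarator name, reusing the annotation instead of re-parsing.
std::string parseDeclaratorName(const std::vector<Token> &toks,
                                std::size_t pos) {
  std::string qualified;
  if (pos < toks.size() && toks[pos].kind == TokKind::AnnotScopeSpec) {
    assert(toks[pos].scope != nullptr);
    qualified = toks[pos].scope->name + "::";
    ++pos;
  }
  if (pos < toks.size() && toks[pos].kind == TokKind::Identifier)
    qualified += toks[pos].text;
  return qualified;
}
```

For the 'unsigned foo::bar();' example, parseDeclSpecifiers in this 
sketch stops on the annotation token for 'foo::', and 
parseDeclaratorName then picks the scope up from that token rather than 
parsing 'foo' and '::' a second time.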

Let me know what you think.

> Does whatever final resolution to the name lookup issue has been 
> chosen handle reentrancy? Does the inner lookup work here?
> ns1::templname<ns2::typename>::objname

I don't know how exactly templates will work, but I think there won't be 
reentrancy issues.

