[cfe-dev] [PATCH] C++ decl/expr ambiguity resolution

Sat Aug 23 16:33:59 PDT 2008

On Sat, Aug 23, 2008 at 3:01 AM, Argiris Kirtzidis <akyrtzi at gmail.com> wrote:
> Eli Friedman wrote:
>>
>> On Fri, Aug 22, 2008 at 8:16 PM, Argiris Kirtzidis <akyrtzi at gmail.com>
>> wrote:
>>
>>>
>>> But exactly which parts of the statement should we examine to determine
>>> the
>>> validity of a declaration.. this is left undefined.
>>>
>>
>> Hmm... I see the following: "Note: To disambiguate, the whole
>> statement might have to be examined to determine if it is an
>> expression-statement or a declaration."  I think this is pretty clear;
>> if there were some allowable shortcuts, they would be documented in
>> the standard.
>>
>
> Yes but the "might have..." is pretty... ambiguous :)
> One might say,
> "by examining these parts of the statement I can conclude whether it's a
> declaration or expression"
> another one might say
> "these parts are not enough, I'll continue to examine the rest of the
> statement".
> The standard says "statement disambiguated as a declaration may be an
> ill-formed declaration", so the first one can say "with these parts of the
> statement I concluded it's a declaration, if the next parts do not compile
> as a declaration then it's an ill-formed declaration".
>
> If the standard said "the whole statement *should* be examined to
> determine.." it would be more clear. But this is very subtle and I may be
> wrong.

I agree that part of the standard isn't very well written.  That said,
for the cases that are supposed to parse as expressions, the ambiguity
resolution doesn't actually have any effect; the grammar isn't
ambiguous for those cases in the first place.  The grammar is just
very annoying for the parser. :)

Comeau seems to agree with my interpretation; it accepts all the
examples that you've given.

That said, as long as we accept everything that gcc accepts, it
probably doesn't particularly matter for anything besides
standards-compliance testsuites, so I wouldn't worry about it too
much.

> The issue is mostly how to deal with comma separated
> declarators/expressions:
>
> int(x), ++x;
>
> if we examine "++x" we will conclude that this is an expression. If we only
> examine "int(x)" we will regard this as declaration (note that the C++
> standard does not have such an example).
> GCC examines only the first declarator, while MSVC examines all of them.
> Given that comma separated declarators are *much* more common than comma
> separated expressions, and that a function-style cast followed by a comma is
> mostly useless, I chose to go along with GCC and only examine the first
> declarator.
>
> Another one:
>
> T(x) = 1, ++x;
>
> Again, if we examine "++x" we will conclude that this is an expression. Now
> both GCC and MSVC only take into account "T(x) = 1" and consider the above
> example as declaration; GCC is more consistent here.
>
> In general, one might say that it's a bit more consistent to treat
> statements as declarations by the first declarator:
>
> T x, ++x;   // this is malformed declaration
> T(x), ++x; // so is this
>
> What do you think about comma-separated declarators vs. comma-separated
> expressions ?

Hmm, well, I agree it's very unlikely that real-world code uses a
comma after a function-style constructor...

In any case, it's probably a good idea to emit a warning for code that
ends up falling into ambiguity resolution, i.e. anything that gets
past the call to TryParseDeclarationSpecifier from
TryParseSimpleDeclaration in your patch.  It's very likely that such
code doesn't do what the author intends, and it's usually very easy to
make the intention explicit (either by removing parentheses around the
variable declaration, or putting parentheses around the function-style
cast).

Comments on the patch follow:

+      while (Tok.is(tok::kw_const) || Tok.is(tok::kw_volatile))
+        ConsumeToken();

Doesn't the presence of const or volatile prove that this is a
declaration?  I don't think this actually affects the semantics, but
it seems cleaner to exit once we can prove something.

Also, shouldn't we be handling restrict here as well?  It's not
standard C++, but I think people use it.

+  if (Tok.is(tok::identifier)) {
+    // declarator-id
+    ConsumeToken();

Don't you need some sort of check to see whether this is a type?
Consider something like the following:
class T {public: T(int a) {} T() {} int operator*() {return 0;}};
void a() {T(*T());} // Must be expression

+  case tok::kw_const:
+  case tok::kw_volatile:

Again, handle restrict?

+Parser::TentativeParsingResult
+Parser::TryParseParameterDeclarationClause(SourceLocation LParenLoc) {
+  TentativeParsingResult TPR = TryParseDeclarationSpecifier();
+
+  if (TPR != TPR_false) {

What the heck is this if statement supposed to be testing for?  Cases
like empty parameter declarations?  This whole area looks rather
strange.  I haven't worked out whether this has any visible effects,
but it seems like it would be a lot clearer to explicitly check for
the cases where TryParseDeclarationSpecifier returns false, but there
is in fact a valid parameter list (the only cases I can think of are
an empty parameter list and a list with only "...").

Otherwise, I think it looks fine; that said, someone who knows the
parser better should review it as well.

-Eli