[cfe-dev] Token lookahead without the preprocessor

Jordan Rose jordan_rose at apple.com
Tue Jun 26 09:00:31 PDT 2012


On Jun 25, 2012, at 9:01 PM, Chandler Carruth wrote:

> On Mon, Jun 25, 2012 at 8:51 PM, Jordan Rose <jordan_rose at apple.com> wrote:
>> Hi, all. I've been trying to come up with a useful recovery for this case (<rdar://problem/11602405> for Apple folks):
>> 
>> void foo();
>> {
>>        // note the spurious semicolon above
>> }
>> 
>> The trouble is, having a semicolon there is a perfectly good way to end a declaration. It's clear that if there's a brace on the next line, it was actually supposed to be a definition (because C/C++ don't have top-level braces). But we get in trouble in this case (from test/CodeGen/pragma-weak.c):
>> 
> So, forgive me as it is quite likely I'm missing something. I'm hoping you can explain, and help me better understand recovering in the parser (an area I've avoided, to my misfortune, I fear).
> 
> Why can't we parse the first line as a declaration, but keep track in the parser of the last declaration completed? Then, when we parse the '{', which we know to be an error case, look to see if:
> 
> - We just parsed a complete declaration, and
> - That declaration was a function declaration, and
> - It is a viable candidate for defining here
> 
> If these all hold, couldn't we suggest deleting the ';', and then spin up the parser logic to parse a definition, potentially attaching it to a synthesized extra declaration of the function? My hope would be that Clang could essentially reconstruct the necessary parser state after fully parsing the declaration and hitting the open curly in the normal parsing logic, but that may be unrealistic. =]

This seems fairly reasonable, but it would require picking apart the current logic for parsing function definitions. That actually doesn't look so bad, at least not for top-level definitions, but it sounds like Richard's refactoring would be a better solution anyway. (It's fairly orthogonal, but it would make this unnecessary. The slight difference is that it wouldn't allow recovery in the #pragma weak case, but that seems fair to me.)
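
For concreteness, the three-part check described above might be sketched like this. This is only a toy illustration, not Clang code: Tok, LastDecl, Recovery, and recoverAtTopLevelBrace are all hypothetical names, and "viable candidate for defining here" is approximated by "has no body yet", which is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy token kinds for illustration; these are NOT Clang's tok:: kinds.
enum class Tok { Ident, LParen, RParen, Semi, LBrace, RBrace, Eof };

// Hypothetical record of the last top-level declaration the parser completed.
struct LastDecl {
  bool valid = false;         // a declaration has been completed
  bool isFunction = false;    // it was a function declaration
  bool hasBody = false;       // it already has a definition (assumed viability test)
  std::size_t semiIndex = 0;  // token index of its terminating ';'
};

// Hypothetical result: whether to suggest deleting the ';', and which one.
struct Recovery {
  bool suggestRemoveSemi = false;
  std::size_t semiIndex = 0;
};

// Called when a '{' shows up at the top level, which is always an error.
// Applies the three conditions from the list above, reading "just parsed"
// as "the ';' is the token immediately before the '{'".
Recovery recoverAtTopLevelBrace(const std::vector<Tok> &toks,
                                std::size_t braceIndex,
                                const LastDecl &last) {
  assert(braceIndex < toks.size() && toks[braceIndex] == Tok::LBrace);
  Recovery r;
  bool justParsed = last.valid && last.semiIndex + 1 == braceIndex;
  if (justParsed && last.isFunction && !last.hasBody) {
    r.suggestRemoveSemi = true;
    r.semiIndex = last.semiIndex;
  }
  return r;
}
```

A real implementation would then have to re-enter the function-definition parsing path with the existing declarator, which is the part that requires picking apart the current logic; the sketch only covers the decision of whether to attempt the recovery.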

Of course, I also still don't have too much experience with Lex/Parse, so I might be missing something as well. :-)

Jordan


