[PATCH] D156277: [Parser][ObjC] Stop parsing on eof

John McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 26 12:31:18 PDT 2023


rjmccall added inline comments.


================
Comment at: clang/lib/Parse/ParseObjc.cpp:749
+      if (!Tok.is(tok::eof))
+        ConsumeToken();
       break;
----------------
aaron.ballman wrote:
> rjmccall wrote:
> > aaron.ballman wrote:
> > > aaron.ballman wrote:
> > > > danix800 wrote:
> > > > > tbaeder wrote:
> > > > > > Why is there a `ConsumeToken()` call at all here? The token is already being consumed in line 729.
> > > > > Didn't notice this, thanks for reminding!
> > > > I have the same question as @tbaeder -- what token is this intending to consume? CC @rjmccall for Obj-C expertise
> > > OH! This is consuming the identifier for the implementation/interface name itself. e.g.,
> > > ```
> > > @interface Frobble
> > > ```
> > > The consume on line 709 gets the `@`, the consume on line 729 gets the `interface`, and the consume on line 749 is getting the `Frobble`. That makes sense to me now.
> > > 
> > I don't think any language expertise is required here — just seems like a straightforward bug on an error path that's probably not exercised all that often.  Maybe somebody moved the `ConsumeToken` and forgot to fix this case or something.
> What concerns me about this fix is that we don't typically check whether the token is EOF or not before consuming; that's usually an anti-pattern, isn't it? Wouldn't it make sense for this to use `SkipUntil(tok::identifier)` instead?
Okay, so now I can bring a little language expertise to bear. :)

We're in the middle of parsing an ObjC block (e.g. `@interface`), and we see `@interface` or `@implementation`, which starts a new block.  You can never nest these ObjC blocks, so the parser is reasonably assuming that the second `@keyword` is an attempt to start a new block and the user just forgot to terminate the last block with `@end`.  Unfortunately, the actual recovery done by the parser doesn't seem to match the diagnostic and the fixit — it's trying to swallow `@interface Foo` (or whatever) and then continue the loop as if it were part of the current block, which is definitely not the right thing to do.

The right way to recover here is to act like we actually saw `@end` and break out of the loop, leaving `Tok` on the `@` so that the parser will pick up parsing `@interface` normally after we return.  To do that, we just need to get the ObjC keyword by peeking at the next token instead of consuming.

Also, we should take this recovery path on every `@` keyword that's only allowed at the top level (so `@class`, `@compatibility_alias`, `@interface`, `@implementation`, and `@protocol`).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156277/new/

https://reviews.llvm.org/D156277



More information about the cfe-commits mailing list