[PATCH] D156277: [Parser][ObjC] Stop parsing on eof

Ding Fei via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 27 10:23:28 PDT 2023


danix800 added inline comments.


================
Comment at: clang/lib/Parse/ParseObjc.cpp:749
+      if (!Tok.is(tok::eof))
+        ConsumeToken();
       break;
----------------
rjmccall wrote:
> aaron.ballman wrote:
> > rjmccall wrote:
> > > aaron.ballman wrote:
> > > > aaron.ballman wrote:
> > > > > danix800 wrote:
> > > > > > tbaeder wrote:
> > > > > > > Why is there a `ConsumeToken()` call at all here? The token is already being consumed in line 729.
> > > > > > Didn't notice this, thanks for reminding!
> > > > > I have the same question as @tbaeder -- what token is this intending to consume? CC @rjmccall for Obj-C expertise
> > > > OH! This is consuming the identifier for the implementation/interface name itself. e.g.,
> > > > ```
> > > > @interface Frobble
> > > > ```
> > > > The consume on line 709 gets the `@`, the consume on line 729 gets the `interface`, and the consume on line 749 is getting the `Frobble`. That makes sense to me now.
> > > > 
> > > I don't think any language expertise is required here — just seems like a straightforward bug on an error path that's probably not exercised all that often.  Maybe somebody moved the `ConsumeToken` and forgot to fix this case or something.
> > What concerns me about this fix is that we don't typically check whether the token is EOF or not before consuming; that's usually an anti-pattern, isn't it? Wouldn't it make sense for this to use `SkipUntil(tok::identifier)` instead?
> Okay, so now I can bring a little language expertise to bear. :)
> 
> We're in the middle of parsing an ObjC block (e.g. `@interface`), and we see `@interface` or `@implementation`, which starts a new block.  You can never nest these ObjC blocks, so the parser is reasonably assuming that the second `@keyword` is an attempt to start a new block and the user just forgot to terminate the last block with `@end`.  Unfortunately, the actual recovery done by the parser doesn't seem to match the diagnostic and the fixit — it's trying to swallow `@interface Foo` (or whatever) and then continue the loop as if it were part of the current block, which is definitely not the right thing to do.
> 
> The right way to recover here is to act like we actually saw `@end` and break out of the loop, leaving `Tok` on the `@` so that the parser will pick up parsing `@interface` normally after we return.  To do that, we just need to get the ObjC keyword by peeking at the next token instead of consuming.
> 
> Also, we should take this recovery path on every `@` keyword that's only allowed at the top level (so `@class`, `@compatibility_alias`, `@interface`, `@implementation`, and `@protocol`).
It's really great to learn things here! I don't know two much about ObjC. I seached google trying to find some standard or specs for ObjC but only docs like tutorials teaching how to use it can be found, so I might not be able to give a good enough fix for this issue. I'll give it a try though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156277/new/

https://reviews.llvm.org/D156277



More information about the cfe-commits mailing list