[PATCH] D130550: [pseudo] Start rules are `_ := start-symbol EOF`, improve recovery.

Tue Jul 26 07:21:50 PDT 2022

sammccall added inline comments.

================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/GLR.h:74
     bool GCParity;
+    // Have we already used this node for error recovery? (prevents loops)
+    mutable bool Recovered = false;
----------------
hokein wrote:
> haven't look at it deeply -- is this bug related to this eof change? This looks like a different bug in recovery.
They're "related" in that they both fix repeated-recovery scenarios.
This change fixes that we can hit an infinite loop when applying recovery repeatedly.
The eof change fixes that recovery is (erroneously) only applied once at eof.

I hoped to cover them with the same testcase, which tests repeated recovery at EOF. I can extract this change with a separate test if you like, though it will be very similar to the one I have here.

================
Comment at: clang-tools-extra/pseudo/lib/Forest.cpp:191
+  // This is important to drive the final shift/recover/reduce loop.
+  new (&Terminals[Index])
+      ForestNode(ForestNode::Terminal, tokenSymbol(tok::eof),
----------------
hokein wrote:
> nit: in the underlying TokenStream implementation, `tokens()` has a trailing eof token, I think we can fold this into the above loop (if we expose a `token_eof()` method in TokenStream). Not sure we should do this. 
I think this doesn't generalize well... at the moment we're parsing the whole stream, but in future we likely want to parse a subrange (pp-disabled regions?). In such a case we would still want the terminating EOF terminal as a device for parsing, even though there's no corresponding token.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130550/new/

https://reviews.llvm.org/D130550