[PATCH] D130550: [pseudo] Start rules are `_ := start-symbol EOF`, improve recovery.
Haojian Wu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jul 29 01:27:31 PDT 2022
hokein added inline comments.
================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/GLR.h:74
bool GCParity;
+ // Have we already used this node for error recovery? (prevents loops)
+ mutable bool Recovered = false;
----------------
sammccall wrote:
> hokein wrote:
> > haven't look at it deeply -- is this bug related to this eof change? This looks like a different bug in recovery.
> They're "related" in that they both fix repeated-recovery scenarios.
> This change fixes that we can hit an infinite loop when applying recovery repeatedly.
> The eof change fixes that recovery is (erroneously) only applied once at eof.
>
> I hoped to cover them with the same testcase, which tests repeated recovery at EOF. I can extract this change with a separate test if you like, though it will be very similar to the one I have here.
> This change fixes that we can hit an infinite loop when applying recovery repeatedly.
I'm more worried about this bug, I think this is an important bug, worth a separate patch to fix it, right now it looks like a join-effort in the eof change.
> The eof change fixes that recovery is (erroneously) only applied once at eof.
Not sure I follow this. I think the eof change is basically to remove a technical debt (avoid the special case and repeated code after main parsing loop). Am I missing something?
================
Comment at: clang-tools-extra/pseudo/lib/Forest.cpp:191
+ // This is important to drive the final shift/recover/reduce loop.
+ new (&Terminals[Index])
+ ForestNode(ForestNode::Terminal, tokenSymbol(tok::eof),
----------------
sammccall wrote:
> hokein wrote:
> > nit: in the underlying TokenStream implementation, `tokens()` has a trailing eof token, I think we can fold this into the above loop (if we expose a `token_eof()` method in TokenStream). Not sure we should do this.
> I think this doesn't generalize well... at the moment we're parsing the whole stream, but in future we likely want to parse a subrange (pp-disabled regions?). In such a case we would still want the terminating EOF terminal as a device for parsing, even though there's no corresponding token.
oh, ok, that's fair enough.
================
Comment at: clang-tools-extra/pseudo/test/lr-build-basic.test:16
# GRAPH-NEXT: State 3
# GRAPH-NEXT: id := IDENTIFIER •
----------------
there should be a State 4 (with a `_ := expr EOF •` item)
================
Comment at: clang-tools-extra/pseudo/unittests/GLRTest.cpp:622
+ TestLang.Table = LRTable::buildSLR(TestLang.G);
+ llvm::errs() << TestLang.Table.dumpForTests(TestLang.G);
+ TestLang.RecoveryStrategies.try_emplace(
----------------
remove the debugging statement.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130550/new/
https://reviews.llvm.org/D130550
More information about the cfe-commits
mailing list