[all-commits] [llvm/llvm-project] 312116: [pseudo] Add error-recovery framework & brace-base...

Sam McCall via All-commits all-commits at lists.llvm.org
Tue Jul 5 11:50:00 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 31211674889095299eb817d7be034fde23f7b00a
      https://github.com/llvm/llvm-project/commit/31211674889095299eb817d7be034fde23f7b00a
  Author: Sam McCall <sam.mccall at gmail.com>
  Date:   2022-07-05 (Tue, 05 Jul 2022)

  Changed paths:
    M clang-tools-extra/pseudo/include/clang-pseudo/GLR.h
    M clang-tools-extra/pseudo/include/clang-pseudo/grammar/Grammar.h
    M clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h
    M clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h
    M clang-tools-extra/pseudo/lib/GLR.cpp
    M clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
    M clang-tools-extra/pseudo/lib/grammar/GrammarBNF.cpp
    M clang-tools-extra/pseudo/lib/grammar/LRGraph.cpp
    M clang-tools-extra/pseudo/lib/grammar/LRTableBuild.cpp
    M clang-tools-extra/pseudo/test/cxx/empty-member-spec.cpp
    A clang-tools-extra/pseudo/test/cxx/recovery-init-list.cpp
    M clang-tools-extra/pseudo/unittests/GLRTest.cpp

  Log Message:
  -----------
  [pseudo] Add error-recovery framework & brace-based recovery

The idea is:

- a parse failure is detected when all heads die while trying to shift the next token
- we can recover by choosing a nonterminal we're partway through parsing, and
  determining where it ends through nonlocal means (e.g. matching brackets)
- we can find candidates by walking up the stack from the (ex-)heads
- the token range is defined using heuristics attached to grammar rules
- the unparsed region is represented in the forest by an Opaque node
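
The "walk up the stack from the (ex-)heads" step can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the code
in GLR.cpp: `Node`, `findRecoveryCandidates` and `RecoverableStates` are
hypothetical names, and a plain set of state IDs stands in for whatever the
parse table actually records about recovery.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified graph-structured-stack node: an LR state plus
// links to the nodes below it on the stack(s).
struct Node {
  int State;
  std::vector<const Node *> Parents;
};

// Walk up from the heads that died and collect every node whose state is
// marked as a place where recovery may start. Each node is visited once,
// even if several heads share it as an ancestor.
std::vector<const Node *>
findRecoveryCandidates(const std::vector<const Node *> &DeadHeads,
                       const std::set<int> &RecoverableStates) {
  std::vector<const Node *> Candidates;
  std::set<const Node *> Seen;
  std::vector<const Node *> Worklist(DeadHeads.begin(), DeadHeads.end());
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    assert(N && "null stack node");
    if (!Seen.insert(N).second)
      continue; // already visited via another head
    if (RecoverableStates.count(N->State))
      Candidates.push_back(N);
    for (const Node *P : N->Parents)
      Worklist.push_back(P);
  }
  return Candidates;
}
```

With two dead heads that share one recoverable ancestor, the sketch reports
that ancestor once; choosing among several candidates is a separate step
that this example does not model.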

This patch contains the core GLR functionality.
It does not yet allow recovery heuristics to be attached as extensions to
the grammar; instead, it infers a brace-based heuristic.
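
As a rough illustration of what a brace-based heuristic involves, the sketch
below finds the end of a braced region by counting nesting depth. It is an
assumption-laden stand-in rather than the patch's implementation:
`findMatchingBrace` is a hypothetical helper, and tokens are modelled as
plain strings instead of the real token stream.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: given the index of an opening brace, return the index
// of its matching closing brace, or std::nullopt if the braces are unbalanced.
// Tokens are plain strings here purely for illustration.
std::optional<std::size_t>
findMatchingBrace(const std::vector<std::string> &Tokens, std::size_t Open) {
  assert(Open < Tokens.size() && Tokens[Open] == "{" &&
         "Open must index an opening brace");
  int Depth = 0;
  for (std::size_t I = Open; I < Tokens.size(); ++I) {
    if (Tokens[I] == "{")
      ++Depth;
    else if (Tokens[I] == "}" && --Depth == 0)
      return I; // back at the nesting level where we started
  }
  return std::nullopt; // ran out of tokens before the brace was closed
}
```

In a recovery setting, the tokens strictly between the two braces would be
the region that the parser skips and represents opaquely, after which
parsing resumes at the closing brace.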

Expected followups:

- make recovery heuristics grammar extensions (depends on D127448)
- add recovery to our grammar for bracketed constructs and sequence nodes
- change the structure of our augmented `_ := start` rules to eliminate some
  special cases in glrParse.
- (if I can work out how): avoid some spurious recovery cases described in comments

(Previously mistakenly committed as a0f4c10ae227a62c2a63611e64eba83f0ff0f577)

Differential Revision: https://reviews.llvm.org/D128486



