[PATCH] D118990: [pseudo] Add first and follow set computation in Grammar.
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Feb 4 08:46:58 PST 2022
sammccall accepted this revision.
sammccall added inline comments.
This revision is now accepted and ready to land.
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:146
+// derived from X. (Known as FIRST sets in grammar-based parsers).
+std::vector<llvm::DenseSet<SymbolID>> firstSets(const Grammar &);
+// For each nonterminal X, computes the set of terminals that could immediately
----------------
AFAICS, firstSets are iterated over in followSets, and followSets are iterated over when building SLR.
Seems like exposing `vector<vector<SymbolID>>` would be enough. Up to you.
(Or the `Epochs<SymbolID>` data structure from the prototype - there must be a name for that data structure)
================
Comment at: clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp:132
+ for (const auto &R : G.table().Rules) {
+ // Rule 2: for a rule X := ...YZ, we add all symbols from FIRST(Z) to
+ // FOLLOW(Y).
----------------
nit: `X := ... Y Z`
(no confusion that the ... is associated with Y or that YZ is one symbol)
================
Comment at: clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp:146
+ }
+ // Rule 3: for a rule X := ...Z, we add all symbols from FOLLOW(X) to
+ // FOLLOW(Z).
----------------
`X := ... Z`
================
Comment at: clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp:156
+
+SymbolID Grammar::startSymbol() const {
+ // start symbol is named _, binary search it.
----------------
Seems more natural to catch this problem in the constructor?
Then we'd stash the SymbolID in a member variable and just return it here.
================
Comment at: clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp:25
MATCHER_P(TargetID, SID, "") { return arg.Target == SID; }
template <typename... T> testing::Matcher<const Rule &> Sequence(T... IDs) {
----------------
whoops, I missed these in the previous review, these should really be `targetID` and `sequence` as they're functions.
================
Comment at: clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp:110
+ ASSERT_TRUE(Diags.empty());
+ auto ToPair = [](std::vector<llvm::DenseSet<SymbolID>> Input) {
+ std::vector<std::pair<SymbolID, llvm::DenseSet<SymbolID>>> Sets;
----------------
nit: ToPairs, plural
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118990/new/
https://reviews.llvm.org/D118990
More information about the cfe-commits
mailing list