[PATCH] D118990: [pseudo] Add first and follow set computation in Grammar.

Wed Feb 9 00:15:29 PST 2022

hokein added inline comments.

================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:146
+// derived from X. (Known as FIRST sets in grammar-based parsers).
+std::vector<llvm::DenseSet<SymbolID>> firstSets(const Grammar &);
+// For each nonterminal X, computes the set of terminals that could immediately
----------------
sammccall wrote:
> AFAICS, firstSets are iterated over in followSets, and followSets are iterated over when building SLR.
> 
> Seems like exposing `vector<vector<SymbolID>>` would be enough. Up to you.
> (Or the `Epochs<SymbolID>` data structure from the prototype - there must be a name for that data structure)
that's right, vector<vector<...>> is enough, but it adds an extra step in the implementation (we need the set anyway for deduplication, and then convert to a vector), and it introduces some complexity in the followSet computation, we can't use a unified ExpandFollowSet (as the input could be a vector or llvm::Denseset) sigh.

================
Comment at: clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp:25

 MATCHER_P(TargetID, SID, "") { return arg.Target == SID; }
 template <typename... T> testing::Matcher<const Rule &> Sequence(T... IDs) {
----------------
sammccall wrote:
> whoops, I missed these in the previous review, these should really be `targetID` and `sequence` as they're functions.
they're public member variables :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118990/new/

https://reviews.llvm.org/D118990