[PATCH] D125006: [pseudo] Support parsing variant target symbols.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon May 9 12:16:18 PDT 2022


sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Very nice!
And change to compiled grammar size is small.
Would including *every* nonterminal as a start symbol would blow the size up a bit?
This would eliminate some complexity in the interface.



================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/GLR.h:125
 };
-// Parse the given token stream with the GLR algorithm, and return a forest node
-// of the start symbol.
----------------
>From the caller's perspective, is "target symbol" really a clearer name than "start symbol"? The latter seems like more common terminology.


================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/GLR.h:126
+// Parses the given token stream as the Target symbol with the GLR algorithm,
+// and returns a forest node of the target symbol.
 //
----------------
// A rule `_ := Target` must exist for the chosen target symbol.


================
Comment at: clang-tools-extra/pseudo/lib/LRGraph.cpp:163
 LRGraph LRGraph::buildLR0(const Grammar &G) {
+  std::vector<std::pair<SymbolID, StateID>> StartStates;
   class Builder {
----------------
might be clearer just to make this a member of the builder, and add a `addStartState(SymbolID, StateID)` method?


================
Comment at: clang-tools-extra/pseudo/lib/LRTableBuild.cpp:43
 public:
+  Builder(std::vector<std::pair<SymbolID, StateID>> StartStates)
+      : StartStates(std::move(StartStates)) {}
----------------
nit: since we're ultimately copying this array to call the constructor, maybe just take ArrayRef here?


================
Comment at: clang-tools-extra/pseudo/lib/cxx.bnf:28
 #
-# _ serves as a "fake" start symbol, coming with real grammar symbols.
+# _ serves as a "fake" start symbol, coming with real grammar symbols which we
+# target to parse.
----------------
Maybe "_ lists all the start-symbols which we support parsing".


================
Comment at: clang-tools-extra/pseudo/tool/ClangPseudo.cpp:41
 static opt<bool> PrintForest("print-forest", desc("Print parse forest"));
+static opt<std::string> ParseSymbol("parse-symbol",
+                                    desc("specify the target symbol to parse"),
----------------
-start-symbol might be more common


================
Comment at: clang-tools-extra/pseudo/tool/ClangPseudo.cpp:57
+static clang::pseudo::SymbolID
+findNonterminalID(llvm::StringRef Name, const clang::pseudo::Grammar &G) {
+  for (clang::pseudo::SymbolID ID = 0; ID < G.table().Nonterminals.size(); ++ID)
----------------
nit: just findNonterminal()? rather than echoing the type


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125006/new/

https://reviews.llvm.org/D125006



More information about the cfe-commits mailing list