[PATCH] D114790: [syntax][pseudo] Add grammar facilities for the pseudo-parser
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Dec 21 02:58:49 PST 2021
sammccall added inline comments.
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:52
+// Terminal IDs correspond to the clang TokenKind enum.
+using SymbolID = uint16_t;
+// SymbolID is only 12 bits wide.
----------------
If we want strong types, we could use `enum class SymbolID : uint16_t {};`
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:74
+using RuleID = uint16_t;
+// RuleID is only 12 bits wide.
+static constexpr unsigned RuleBits = 12;
----------------
Maybe rather "there are at most 2^12 rules"
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:83
+
+ static constexpr unsigned SizeBits = 4;
+ static_assert(SizeBits + SymbolBits <= 16,
----------------
// Sequence can be at most 2^4 tokens long
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:110
+// Grammar that describes a programming language, C++ etc.
+// It parses BNF grammar rules, and is a building brick for constructing a
+// grammar-based parser.
----------------
nit: building block is more idiomatic
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:110
+// Grammar that describes a programming language, C++ etc.
+// It parses BNF grammar rules, and is a building brick for constructing a
+// grammar-based parser.
----------------
sammccall wrote:
> nit: building block is more idiomatic
Nit: it's rather the RuleSpec factories and Grammar::build (which are static) that parses BNF grammar.
The Grammar object itself doesn't have this responsibility.
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:133
+ static llvm::Expected<RuleSpec> parseRaw(llvm::StringRef Line);
+ // Exposed for testing.
+ // Inline all _opt symbols.
----------------
I think this should be explicit rather than implicit, and so not just "exposed for testing" but explicitly called
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:144
+
+ // Lookup non-terminal by name.
+ SymbolID lookupSymbol(llvm::StringRef Name) const;
----------------
The implementation also (inefficiently) supports looking up terminals. Intended?
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:154
+ // Dump symbol (terminal or non-terminal) name.
+ llvm::StringRef dumpSymbol(SymbolID) const;
+ std::string dumpRule(RuleID) const;
----------------
dump suggests an arbitrary debugging representation, but we rely on this being exactly the symbol name. `symbolName(SymbolID)`?
Maybe also doc what the names are (for terminals).
e.g. `Terminals have names like "," (kw_comma) or "OPERATOR" (kw_operator)`.
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:156
+ std::string dumpRule(RuleID) const;
+ std::string dumpRuleRecursively(RuleID) const;
+ // Dump all rules of the given nonterminal symbol.
----------------
in the prototype the "recursively" versions ended up being much less useful than I thought. You may want to drop the complexity
================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:171
+
+// Underlying storage of the Grammar.
+struct Table {
----------------
Are compiled transition tables etc expected to live here, or is this a purely descriptive version of teh grammar?
================
Comment at: clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp:48
+
+void computeNullability(Table &T) {
+ // A nonterminal is nullable if some rule producing it is nullable.
----------------
I don't think it makes sense to compute and store nullability if our plan is to build a parser that doesn't support nullable symbols.
Rather we would warn in eliminateOptional() or diagnose() if there are nullable rules. (This doesn't require computing the nullability of every rule)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114790/new/
https://reviews.llvm.org/D114790
More information about the cfe-commits
mailing list