[PATCH] D118196: [syntax][pseudo] Implement LR parsing table.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 4 05:59:55 PST 2022


hokein marked 7 inline comments as done.
hokein added inline comments.


================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:130
+
+  // Returns the SymbolID of the augmented symbol '_'
+  SymbolID augmentedSymbol() const { return 0; }
----------------
sammccall wrote:
> this "augmented symbol" name doesn't seem widely used. (search for `lr "augmented symbol"` produces few results, mostly unrelated). It's also a confusing name, because it's IIUC it's not the symbol that's augmented, but rather the grammar has been augmented with the symbol...
> 
> I'd suggest just calling it the start symbol, unless the distinction is critical (I don't yet understand why it's important - I added it in the prototype just to avoid having the language-specific name "translation-unit" hardcoded).
> 
> Otherwise it's hard to suggest a good name without understanding what it's for, but it should be descriptive...
yeah, it is about augmented grammar. Strictly speaking, it is a start symbol of augmented symbol. Maybe we should not speak augmented stuff in the code, it is not a well-known term, and causes confusion. rename it to start symbol instead.

It looks like using augmented grammar is a "standard" LR technique,  it introduces a converged start state and accepting state in the LR graph, which makes it easier for the implementation. And it is a good fit for support multiple start symbols.


================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h:133
+
+  // Computes the FIRST(X) for all non-terminal symbol X.
+  std::vector<llvm::DenseSet<SymbolID>> firstSets() const;
----------------
sammccall wrote:
> sammccall wrote:
> > In about the same number of words you could explain what this actually is :-)
> > 
> > // Computes the set of terminals that could appear at the start of each non-terminal.
> is there some reason these functions need to be part of the grammar class?
> 
> They seem like part of the LR building to me, it could be local to LRBuilder.cpp, or exposed from LRTable.h or so for testing
as discussed, made them as free functions in Grammar.h. Separate the first and follow set changes to https://reviews.llvm.org/D118990, comments around them should be addressed there.


================
Comment at: clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp:88
+    for (const auto &R : T->Rules) {
+      // Rule 2: for a rule X := ...Yz, we add all symbols from FIRST(z) to
+      // FOLLOW(Y).
----------------
sammccall wrote:
> nit: `... Y Z`, with consistent spaces/capitalization
> Unless case is meant to imply something here?
Y was for nonterminal while z was for nonterminal or terminals. They are not super important, changed all to Uppercase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118196/new/

https://reviews.llvm.org/D118196



More information about the cfe-commits mailing list