[PATCH] D122303: [pseudo] Sort nonterminals based on their reduction order.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 23 06:34:19 PDT 2022


sammccall added a comment.

I think my main suggestion is to sort the symbols, not just the rules.
Having the rules in blocks grouped by the symbol they produce, but not in symbol order, seems confusing to reason about.

The extraction of TestGrammar seems unrelated to the rest of the patch and it *isn't* actually shared between tests yet, which makes it a bit hard to review.



================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/Grammar.h:168
 
-  // The rules are sorted (and thus grouped) by target symbol.
+  // The rules are topological sorted (and thus grouped) by the target symbol.
+  //
----------------
I think it's easier to understand if we sort the *symbols* and leave this comment as-is.
It feels a bit strange to sort the symbols alphabetically but sort the rules by some other property of the symbols.

It's also important to describe what relation we're sorting by!

I'd say "The symbols are topologically sorted: if `S := T` then S has a higher SymbolID than T."


================
Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:55
     };
     for (const auto &Spec : Specs) {
       Consider(Spec.Target);
----------------
You could identify dependencies here, and then use them to sort the uniquenonterminals before  allocating SymbolIDs


================
Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:122
+  std::vector<unsigned> getTopologicalOrder(GrammarTable *T) {
+    llvm::DenseMap<SymbolID, llvm::DenseSet<SymbolID>> DependencyGraph;
+    for (const auto &Rule : T->Rules) {
----------------
The map seems unneccesary - just a vector<pair<SymbolID, SymbolID>> Dependencies, and sort them?


================
Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:135
+    };
+    std::vector<SymbolID> VisitStates(T->Nonterminals.size(), NotVisited);
+    std::function<void(SymbolID)> DFS = [&](SymbolID SID) -> void {
----------------
(is a vector of symbol id, or of states?)


================
Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:18
+
+struct TestGrammar {
+  static TestGrammar build(llvm::StringRef BNF);
----------------
doc


================
Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:26
+  // The nonterminal symbo is expected to have a single rule in the grammar.
+  RuleID singleRuleFor(llvm::StringRef NonterminalName) const;
+
----------------
I think this puts too much to much emphasis on the "single" and breaks the symmetry with "symbol" - i'd prefer just "ruleFor".

(The failure mode here is just that the test crashes 100% of the time, so the risk seems low)


================
Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:28
+
+  std::unique_ptr<Grammar> G;
+  std::vector<std::string> Diags;
----------------
shouldn't these be private, or have better names, and docs when they're set etc?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122303/new/

https://reviews.llvm.org/D122303



More information about the cfe-commits mailing list