[PATCH] D119172: [pseudo] Implement LRGraph

Wed Feb 9 01:59:32 PST 2022

hokein added inline comments.

================
Comment at: clang/include/clang/Tooling/Syntax/Pseudo/LRGraph.h:131
+
+  // Constructs an LR(0) automaton.
+  static LRGraph buildLR0(const Grammar &);
----------------
sammccall wrote:
> IIUC the data structure used here is inherently LR0, rather than the choice of how we build it.
> (Vs in the LRTable where we could come up with data representing a full LR(1) parser without changing the structure).
> 
> So I think this function doesn't need LR0 in its name (but the class could be LR0Graph if you like)?
At the moment we build an LR0, but the LRGraph itself is a generic data-structure, which could be used to model the LR(1) automaton (it is a matter how we define the states/nodes in the graph). Let's say in the future, we want to build a full LR(1), then we could easily add another method LRGraph::buildLR(1), so the current names  seem more reasonable to me.

================
Comment at: clang/lib/Tooling/Syntax/Pseudo/LRGraph.cpp:168
+  private:
+    // We store the hash codes, as collisions is rare and ItemSet is heavy.
+    // key is the hash value of **kernel** item sets.
----------------
sammccall wrote:
> This really feels like a dubious optimization to me as it's *incorrect* if we ever get a hash collision, and the kernel ItemSet is much smaller than `States` which we're already storing. This seems like it must only save 20% or something of overall size?
> 
> In the other review thread you compared the size of DenseMap<hash_code, size_t> vs DenseMap<ItemSet, size_t>, but really the total size including `States` seems like a more useful comparison.
ok, changed to `DenseMap<ItemSet, size_t>` instead. The size of `States` is ~215KB, so 15% `DenseMap<hash_code>` vs 28% for `DenseMap<ItemSet>`, not huge.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119172/new/

https://reviews.llvm.org/D119172