[PATCH] D127357: [pseudo] wip/prototype: use LR0 instead of SLR1 table

Thu Jun 9 02:01:27 PDT 2022

hokein added a comment.

Thanks for experimenting this!

> the size of the LR table is much smaller (In this patch we go from 340kB => 120kB, and the encoding isn't efficient)

This is interesting. I'd expect we now add more reduce actions into the LRTable, thus the table size should grow (I guess the saving is because we don't need to store different token kinds, instead just a single `eod` token)?

> we explore more dead-ends due to lack of lookahead, and parser runs slower
> with this patch, the LR0 table parses ~1/3 slower than the SLR1. (LR0 allows code simplifications so we may win some of this back)

(I also did some hack experiment yesterday, got similar results)
1/3 performance downgrade seems a lot... It is ok for error-recovery, but for correct code, it is an expensive and unnecessary pay.

Since this is a large change, before moving forward, maybe we should consider some alternatives:

1. use a hybrid solution (this also depends on how many interesting states we have)
  - for interesting states (where it has the sequence-element rule like `stmt := expression <dot>`), we build LR(0) for these states (like this patch);
  - for other non-interesting states, we keep building the SLR;
2. as we discussed yesterday, we find all available reduce actions during the error recovery by iterating all tokens. This is slower (but only on the error-recovery path), but I think there are ways to speed that up, e.g. using separate storage in the LRTable

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127357/new/

https://reviews.llvm.org/D127357