[PATCH] D128297: [pseudo] Track heads as GSS nodes, rather than as "pending actions".
Haojian Wu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 22 04:28:31 PDT 2022
hokein added a comment.
Thanks, this looks a reasonable change to me.
================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/GLR.h:140
+ const ForestNode &NextTok, const ParseParams &Params,
+ std::vector<const GSS::Node *> &NewHeads);
+// Applies available reductions on Heads, appending resulting heads to the list.
----------------
nit: change NewHeads to a pointer? it seems clearer that NewHeads is the output of the function from the caller side `glrShift(OldHeads, ..., &NewHeads)`.
I think it would be clearer if glrShift just returns NewHeads, but I understand we want to avoid temporary object for performance reasons.
================
Comment at: clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h:132
StateID getGoToState(StateID State, SymbolID Nonterminal) const;
+ llvm::Optional<StateID> getShiftState(StateID State, SymbolID Terminal) const;
----------------
the function signature (return type) of `getGotoState` and `getShiftState` is not symmetry, it is fine and by design (we never all getGoToState on an invalid Nonterminal, it is guaranteed by the LR parser, but we might call getShiftState on a dead head). It is probably worth a comment explicitly specifying the difference.
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:67
+ G.symbolName(Terminals[I].symbol()), Terminals[I].symbol()));
+ glrShift(Heads, Terminals[I], Params, NextHeads);
+ std::swap(Heads, NextHeads);
----------------
I can see the benefit of keep them tight in the loop (and doing a shift before reduce), but I found the code is a bit confusing, it took me a while to understand.
- the concept `Heads` and `NextHeads` are different for glrShift, and glrReduce -- The NextHeads returned by the glrShift should be the Heads used in glrReduce, so I was confused when reading `glrReduce(Heads)` the code at first glance -- until I realized that there is a `swap(Heads, NextHeads)` on L68...
- it seems a little weird that at the end of the loop, we do a cleanup on NextHeads directly (I would image there is a `swap(Head, NextHead)` and then cleanup)
Instead of naming the two lists as `Heads` and `NewHeads`, how about naming them `HeadsForShift` (or `ShiftHeads`) and `HeadsForReduce`? the code would look like
```
for ( ... ) {
glrShift(HeadsForShift, ..., &HeadsForReduce);
....
glrReduce(HeadsForReduce, ..., &HeadsForShift);
}
```
It looks clearer to me.
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:69
+ std::swap(Heads, NextHeads);
+ SymbolID Lookahead = I + 1 == Terminals.size() ? tokenSymbol(tok::eof)
+ : Terminals[I + 1].symbol();
----------------
I'd probably leave a blank line deliberately after glrShift, because the glrReduce work on the *next* I+1 token.
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:111
+ const ForestNode &NewTok, const ParseParams &Params,
+ std::vector<const GSS::Node *> &NewHeads) {
assert(NewTok.kind() == ForestNode::Terminal);
----------------
nit: add a `NewHeads.empty` assertion?
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:200
// 0--5(pointer) // 5 is goto(0, pointer)
-void glrReduce(std::vector<ParseStep> &PendingReduce, const ParseParams &Params,
- NewHeadCallback NewHeadCB) {
+void glrReduce(std::vector<const GSS::Node *> &Heads, SymbolID Lookahead,
+ const ParseParams &Params) {
----------------
IIRC, in glrReduce, we only append newly-created GSS nodes in Heads, and never to do deleting, so the Heads will end up with lots of unnecessary heads (heads created for reduces), and we will call `getShiftState` on them.
We know that after glrReduce, active heads are heads with available shift actions, so we can optimize it further, we could just use a vector<GSS::Nodes> which just contains active heads , this could be done in the popPending. I think it will increase the performance by reducing the number of unnecessary heads.
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:264
+ // PoppedHeads is our position within it.
+ unsigned PoppedHeads = 0;
// Pop walks up the parent chain(s) for a reduction from Head by to Rule.
----------------
might be name it PoppedHeadIndex?
================
Comment at: clang-tools-extra/pseudo/lib/GLR.cpp:360
// Invoking the callback for new heads, a real GLR parser may add new
// reduces to the PendingReduce queue!
----------------
the comment is stale now.
================
Comment at: clang-tools-extra/pseudo/lib/grammar/LRTable.cpp:78
+ assert(pseudo::isToken(Terminal) && "expected terminal symbol!");
+ for (const auto &Result : find(State, Terminal))
+ if (Result.kind() == Action::Shift)
----------------
instead of using find directly, just use `getActions()`.
================
Comment at: clang-tools-extra/pseudo/lib/grammar/LRTable.cpp:79
+ for (const auto &Result : find(State, Terminal))
+ if (Result.kind() == Action::Shift)
+ return Result.getShiftState();
----------------
nit: it is worth a comment saying that if there is a shift action, it must be exactly 1, this is guaranteed by the LR parser (no shift-shift conflict)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128297/new/
https://reviews.llvm.org/D128297
More information about the cfe-commits
mailing list