[PATCH] D130460: [pseudo] Add recovery for declarations
Haojian Wu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jul 28 06:37:19 PDT 2022
hokein added inline comments.
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:267
+Token::Index recoverNextDeclaration(const RecoveryParams &P) {
+ if (P.Begin == P.Tokens.tokens().size())
----------------
I think this excludes the parameter-declaration (from the grammar spec, it is a different nonterminal).
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:267
+Token::Index recoverNextDeclaration(const RecoveryParams &P) {
+ if (P.Begin == P.Tokens.tokens().size())
----------------
hokein wrote:
> I think this excludes the parameter-declaration (from the grammar spec, it is a different nonterminal).
The logic of implementations looks very solid. I think it would be good to have some unittests for it.
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:282
+ // - if the braces are missing, we don't know where to recover
+ // So We only try to detect the others.
+ bool AcceptBraces = false;
----------------
if I read the code correctly, the following case should work like
```
void foo() {}
LLVM_DISCARD int bar() {} // this is an opaque `declaration` node?
void foo2() {} // it won't work if we change void -> Foo.
```
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:291
+ // they appear at the beginning of a line.
+ auto LikelyStartsDeclaration = [&](tok::TokenKind K) -> bool {
+ switch (K) {
----------------
the list looks reasonable -- I think we can lookup the `Follow(declaration)` set to find more.
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:301
+ case tok::kw_union:
+ // Types & other decl-specifiers
+ case tok::kw_void:
----------------
maybe consider `tok::identifier` (if the identifier is a type specifier, it is a good indicator of a declaration starting) as well? And yeah, we're less certain about it, but since we use the indentation information, I guess it might probably work well?
================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:352
+ }
+
+ // Semicolons terminate statements.
----------------
I think we miss to check the return index >= Cursor here.
================
Comment at: clang-tools-extra/pseudo/lib/cxx/cxx.bnf:313
-declaration-seq := declaration
-declaration-seq := declaration-seq declaration
declaration := block-declaration
----------------
Just want to spell out, this causes a performance hit :(
(a pure change of this has a ~6% slowdown, 7.17 MB/s -> 6.7MB/s on SemaCodeComplete.cpp).
left-cursive grammar is more friendly to LR-like parsers, because the parser can do the reduction earlier (after a parsed a `declaration`, we can perform a `declaration-seq` reduce); while right-cursive grammar, the reduction starts from the last declaration (this means our GSS is more deeper).
> change decl-sequence to right-recursive to simplify this
I remembered we discussed this bit before, but I don't really remember the details now :(
================
Comment at: clang-tools-extra/pseudo/test/cxx/recovery-declarations.cpp:6
+
+int x[0];
+// CHECK: simple-declaration := decl-specifier-seq init-declarator-list ;
----------------
nit: add some tests to cover the initializer, e.g.
```
auto lambda = [] xxxx () xxx {};
```
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130460/new/
https://reviews.llvm.org/D130460
More information about the cfe-commits
mailing list