[PATCH] D127284: [clang-repl] Support statements on global scope in incremental mode.
Richard Smith - zygoloid via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Nov 15 23:40:07 PST 2022
rsmith added inline comments.
================
Comment at: clang/lib/AST/Decl.cpp:5264
+
+FunctionDecl *TopLevelStmtDecl::getOrConvertToFunction() {
+ if (FD)
----------------
I would hope that we can remove this. Instead, I think we can teach `CodeGen` to emit a sequence of `TopLevelStmtDecl`s directly as an LLVM IR function -- if it's not emitted anything else nor flushed its IR output since it last emitted a `TopLevelStmtDecl`, then reuse and extend the previous `Function`, otherwise create a new one. That would also allow us to make `TopLevelStmtDecl` model exactly one `Stmt`, which seems cleaner.
================
Comment at: clang/lib/Parse/ParseTentative.cpp:52-53
+ assert(getLangOpts().CPlusPlus && "Must be called for C++ only.");
+ if (DisambiguatingWithExpression) {
+ if (Tok.is(tok::identifier)) {
+ RevertingTentativeParsingAction TPA(*this);
----------------
Can we sink this into the `switch` on the token kind below?
================
Comment at: clang/lib/Parse/Parser.cpp:1033
+ !isDeclarationStatement(/*DisambiguatingWithExpression=*/true))
+ SingleDecl = ParseTopLevelStmtDecl();
+
----------------
v.g.vassilev wrote:
> There is a remaining challenge which probably could be addressed outside of this patch.
>
> Consider this statement block:
> ```
> int i = 12;
> ++i;
> i--;
>
> template<typename T> struct A { };
> ```
>
> Ideally we should model `++i; i--;` as a single `TopLevelStmtDecl` as the statement block is contiguous. That would require the creation of 2 AST nodes per block (one for the `TopLevelStmtDecl` and one for its conversion to `FunctionDecl`). This will give us also a nice property on the REPL side where the user could decide to squash multiple statements into a statement block to save on memory.
>
> To do so, we will need to use `isDeclarationStatement` as a stop rule in `ParseTopLevelDecl`. In turn, this would mean that we should duplicate all of the switch cases described in the `ParseExternalDeclaration` function here. [We need teach `isDeclarationStatement` everything we know about declarations, eg. it must tell us to stop when we see definition `struct A`].
>
> The last version of this patch goes in the opposite direction, trying to minimize the code duplication (bloat?) by wrapping each global statement into a `TopLevelStmtDecl`, reusing the logic in `ParseExternalDeclaration`. However, we pay the price for 2 AST node allocations per global statement. That is a serious hit for people that want to control the parsing granularity of an interpreter.
>
> I wonder if we can do something better hitting both requirements in some smart way I cannot see...
It seems to me that the big cost here is creating a `FunctionDecl` and all of its ancillary components; a `TopLevelStmtDecl` is pretty cheap. I don't think it should be necessary to create that `FunctionDecl` at all -- we should be able to go straight from `TopLevelStmtDecl` to an IR function like we go straight from a `VarDecl` for a global function to its initializer IR function without creating a `FunctionDecl`.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127284/new/
https://reviews.llvm.org/D127284
More information about the cfe-commits
mailing list