[PATCH] D127284: [clang-repl] Support statements on global scope in incremental mode.

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 6 08:09:41 PDT 2022


aaron.ballman added inline comments.


================
Comment at: clang/include/clang/AST/ASTConsumer.h:54-59
+  /// HandleTopLevelStmts - Handle the specified top-level statements. This is
+  /// called by the parser to process every top-level Stmt* in incremental
+  /// compilation mode.
+  ///
+  /// \returns true to continue parsing, or false to abort parsing.
+  virtual bool HandleTopLevelStmts(const llvm::SmallVectorImpl<Stmt *> &Stmts) {
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > In C and C++, the only thing that's at file scope (that the parser sees) are declarations, not statements, so the phrasing here seems off. I suspect this is because you don't *know* you're only going to get declarations (because it could be that we're trying to parse in the context of an imaginary function body)? If so, perhaps the comments could be updated to clarify that.
> I am a bit confused. Could you clarify what wording changes you had in mind?
It wasn't clear to me whether this was misnamed  (`HandleTopLevelDecls()`) or whether the comments didn't make it clear that this is actually "statements and/or declarations and we don't know whether they're at file scope or not."


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5107-5111
+  static unsigned id = 0;
+  ASTContext &C = getContext();
+  TranslationUnitDecl *TUDecl = C.getTranslationUnitDecl();
+
+  IdentifierInfo *name = &C.Idents.get("_stmts" + std::to_string(id++));
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > You should fix the other identifiers in this function to match the usual coding style.
> I have not paid too much attention in the coding style here as it was mostly for initiating a design discussion.
> 
> The body of this function breaks the design of clang which assumes AST to be created by the frontend and be essentially immutable in `CodeGen`. This `FunctionDecl` needs to be treated as a global initializer. That is, it needs to be run in a particular order to the other initializer following the lexical occurrence of the statement block. That leaves us with two options, either to expose the registration of global initializers outside of CodeGen, or we need to create that here.
> 
> Alternatively, we could make `HandleTopLevelStmts` take a `FunctionDecl` and we can create it in the `IncrementalParser`. That blurs the responsibility and coherence of the interfaces. We will have public interface `HandleTopLevelStmts` (or similar) which handles top-level statements but take a `FunctionDecl` as a parameter which is created by the user...
> I have not paid too much attention in the coding style here as it was mostly for initiating a design discussion.

No worries, so long as you run clang-format over the patch before landing, it'll be fine.

> either to expose the registration of global initializers outside of CodeGen, or we need to create that here.

Ideally, this should be generated outside of CodeGen. It seems to me that this could perhaps be done once the TU has finished but before codegen has started (in `ActOnEndOfTranslationUnit()` or `ActOnEndOfTranslationUnitFragment()`)?


================
Comment at: clang/lib/Parse/ParseTentative.cpp:54
+    case tok::semi:
+      return true; // Not a stmt but can be extra terminator in `namespace{};`
+    case tok::kw_template:
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > http://eel.is/c++draft/dcl.dcl#nt:empty-declaration
> IIUC, clang models the `;` as a `NullStmt`.
That depends on where the `;` is: https://godbolt.org/z/Kz9zjjKoe


================
Comment at: clang/lib/Parse/Parser.cpp:729-730
 
+  // FIXME: Remove the incremental processing pre-condition and verify clang
+  // still can pass its test suite, which will harden `isDeclarationStatement`.
+  // Parse a block of top-level-stmts.
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > How close are we to having this fixme fixed?
> 2000 tests away. The tests are mostly in OpenCL, OpenMP, etc, which just need adding more entries in places such as `isCXXDeclarationSpecifier`. I've decided to keep the patch small and then add that once we merge this one.
LOL, okay, so a ways away still. :-D


================
Comment at: clang/test/Interpreter/disambiguate-decl-stmt.cpp:28-33
+namespace Ns {namespace Ns { void Ns(); void Fs();}}
+void Ns::Ns::Ns() { printf("void Ns::Ns::Ns()\n"); }
+void Ns::Ns::Fs() {}
+
+Ns::Ns::Fs();
+Ns::Ns::Ns();
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > This behavior is very confusing to me -- you cannot declare a namespace at function scope and you cannot call a function at file scope, so one of these two seems like it should not work.
> > 
> > (Same kinds of confusion happen elsewhere in the tests.)
> The mental model has been that we have a mix of statements and declarations on the global scope. The sequence of statements are only considered to be as if they have been wrapped in a function body. For example:
> 
> ```
> namespace Ns {namespace Ns { void Ns(); void Fs();}}
> void Ns::Ns::Ns() { printf("void Ns::Ns::Ns()\n"); }
> void Ns::Ns::Fs() {}
> Ns::Ns::Fs();
> Ns::Ns::Ns();
> ```
> 
> should be semantically equivalent to:
> 
> ```
> namespace Ns {namespace Ns { void Ns(); void Fs();}}
> void Ns::Ns::Ns() { printf("void Ns::Ns::Ns()\n"); }
> void Ns::Ns::Fs() {}
> 
> void stmt_block1() {
>   Ns::Ns::Fs();
>   Ns::Ns::Ns();
> }
> auto _ = (stmt_block1(), 1);
> ```
> 
> Does that help in making it less confusing when thinking about it?
> The mental model has been that we have a mix of statements and declarations on the global scope.

I don't understand that mental model because neither C nor C++ allow you to have statements at the global scope. So I'm not certain how to think about statements showing up there (and more importantly, the languages and the compiler wouldn't expect that anyway). For example, consider this:
```
int n = 12;
int array[n];
```
This is valid at block scope and invalid at global scope: https://godbolt.org/z/8zxn3h79E

Similarly, consider this code:
```
template <typename Ty>
void func();
```
This is valid at file scope but invalid at block scope.

My concern here is that it will be a game of whack-a-mole to identify all of these circumstances and try to get the semantics somewhat close to correct. What is the harm with the REPL always being a full TU, so if the user wants a block scope, they're responsible for writing a function and calling it from `main()`?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127284/new/

https://reviews.llvm.org/D127284



More information about the cfe-commits mailing list