[PATCH] D96033: [clang-repl] Land initial infrastructure for incremental parsing

Vassil Vassilev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Feb 13 00:25:23 PST 2021


v.g.vassilev added inline comments.


================
Comment at: clang/include/clang/Frontend/FrontendAction.h:238
   /// objects, and run statistics and output file cleanup code.
-  void EndSourceFile();
+  virtual void EndSourceFile();
 
----------------
teemperor wrote:
> I think this and the change above requires updating `WrapperFrontendAction` to also forward these functions to the wrapped action. Otherwise wrapping actions would change their behaviour. See the other virtual functions there.
Done. Ideally I would like to avoid making these routines virtual. However, here we call `EndSourceFile` https://github.com/llvm/llvm-project/blob/1f6ec3d08f75dba6c93c291bd92552b807736eb3/clang/lib/Frontend/CompilerInstance.cpp#L952 and this shuts down some of the objects required by the IncrementalAction.

In fact, after some checking, it does not make sense to allow overriding `BeginSourceFile` -- it initializes state of the `Action` class and does not make sense to forward it for instance in the `WrapperFrontendAction`. In principle, clients may be interested in initialize differently but for clang-repl we only need the `EndSourceFile` to be virtual.

Alternatively, we could somehow make the Inputs to take an "external" iterator (https://github.com/llvm/llvm-project/blob/1f6ec3d08f75dba6c93c291bd92552b807736eb3/clang/lib/Frontend/CompilerInstance.cpp#L942) which will provide us with a hook to treat each input line as a separate Input file. AFAICT, that has two disadvantages: 1) we will initialize/finalize more state of various objects which can hinder performance; 2) each input file is considered as a separate translation unit (TU) which clashes with the concept we introduce of an ever growing single TU.


================
Comment at: clang/include/clang/Interpreter/Transaction.h:1
+//===--- Transaction.h - Incremental Compilation and Execution---*- C++ -*-===//
+//
----------------
teemperor wrote:
> v.g.vassilev wrote:
> > teemperor wrote:
> > > Could this whole file just be part of `IncrementalParser.h` which is the only user ? `clang::Transaction` seems anyway a bit of a generic name, so maybe this could become `clang::IncrementalParser::Transaction` then.
> > The intent is to expose the Transaction class and if I move it in the `IncrementalParser` I will have to expose the `IncrementalParser.h`. Would it make sense to move it as part of the Interpreter object? Eg. `clang::Interpreter::Transaction`?
> Makes sense. I guess moving this into the Interpreter would create some strange dependencies, so I would say we keep it in its own file. Maybe we could put it into a `interpreter` namespace or give it a more unique name `InterpreterTransaction`?
> 
> I also think this should have some documentation.
I added some documentation.

Having a long name for the Transaction class will make the future code clunky. This class will be extensively used throughout the interpreter codebase and api. I'd be in favor of going for a namespace but then I think we will have confusing naming for say `clang::interpreter::Interpreter`...


================
Comment at: clang/lib/Interpreter/IncrementalExecutor.h:36
+  llvm::Error addModule(std::unique_ptr<llvm::Module> M);
+  llvm::Error runCtors() const;
+};
----------------
teemperor wrote:
> Should we maybe merge `runCtors` and `addModule`? Not sure if there is a use case for adding a Module but not running Ctors. Also documentation.
The case we have is when there is no JIT -- currently we have such a case in IncrementalProcessingTest I think. Another example, which will show up in future patches, is the registration of atexit handlers. That is, before we `runCtors` we make a pass over the LLVM IR and collect some specific details and (possibly change the IR and then run).

I'd rather keep it separate for now if that's okay.


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:63
+    if (CI.hasCodeCompletionConsumer())
+      CompletionConsumer = &CI.getCodeCompletionConsumer();
+
----------------
teemperor wrote:
> Can this completion code even be used? It doesn't look like it can (and I'm not sure if using the `CodeCompletionAt` flag is really useful in a REPL as you can only specify it once during startup). IMHO this can be left out until we actually can hook up this into the LineEditor (and we have a way to test this).
Cling, on which this patch is based on, has a CodeCompleteConsumer and it seems to work.

I can leave it out but then we will have to remember where to put it. I have a slight preference to leave it as it is.


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:69
+    Preprocessor &PP = CI.getPreprocessor();
+    PP.enableIncrementalProcessing();
+    PP.EnterMainSourceFile();
----------------
teemperor wrote:
> Could we do that before creating the Sema? Sema's constructor is doing quite a few things and some of them might depend on this setting in the future.
Indeed, should be fixed.


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:101
+  // later errors use the default handling behavior instead.
+  llvm::remove_fatal_error_handler();
+}
----------------
teemperor wrote:
> I think this should be in ClangRepl.cpp where we actually install the handler (I wouldn't expect that creating an `IncrementalParser` object would touch the global error handler).
Nice catch! Thanks!


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:136
+    Consumer->HandleTopLevelDecl(DGR);
+  }
+
----------------
teemperor wrote:
> Could we test this?
Not directly. A test case should be
```
#pragma weak f
void f() {}
```

This is tested by standard, we "just" record some decls in the vector. Do you have a specific test in mind?


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:165
+  std::ostringstream SourceName;
+  SourceName << "input_line_" << InputCount++;
+
----------------
teemperor wrote:
> I think this is OK for this patch, but if the file is called "input_line_" then Clang would still emit the actual line information (which is 1) and we get `input_line_1:1:9: error: ...` diagnostics. Since D83038 Clang can hide line numbers in diagnostics, so we could selectively turn this setting on when we generate diagnostics for `input_line` files.
> 
> (But as said, this is fine for this patch).
That is good to know. I will try to remember this and use it if needed.


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:189
+  // NewLoc only used for diags.
+  PP.EnterSourceFile(FID, /*DirLookup*/ 0, NewLoc);
+
----------------
teemperor wrote:
> I think we could assert that this succeeds.
assert on the FID you mean?


================
Comment at: clang/lib/Interpreter/IncrementalParser.cpp:195
+
+  if (PP.getLangOpts().DelayedTemplateParsing) {
+    // Microsoft-specific:
----------------
teemperor wrote:
> Could we throw in a test that at least covers this code path?
I can work on that but I'd be in favor of not holding up that patch due to this issue. I see two options:

  * remove the code -- that'd make it harder for cling to reuse this piece of code.
  * keep it as is -- the risk is that we have an untested codepath. 

Either way, that may be some bug in clang that we need to fix and drop these lines altogether...

That particular issue is tracked here https://github.com/root-project/root/pull/7078 (some of the related context is here https://paste.ubuntu.com/p/bM2VRgmqSG/ in a very obscure way :) )


================
Comment at: clang/lib/Interpreter/IncrementalParser.h:50
+  /// Incremental input counter.
+  unsigned InputCount = 0;
+
----------------
teemperor wrote:
> Maybe `Counts the number of direct user input lines that have been parsed.` or something like that. That InputCount counts inputs seems like redundant information.
Yeah, much nicer.


================
Comment at: clang/lib/Interpreter/Interpreter.cpp:82
+  PCHOps->registerWriter(std::make_unique<ObjectFilePCHContainerWriter>());
+  PCHOps->registerReader(std::make_unique<ObjectFilePCHContainerReader>());
+
----------------
teemperor wrote:
> Can you add a FIXME that this should be removed (as I think we keep adding these two lines to all kind of Clang tools)? I assume this is only needed because of Clang errors when you encounter -gmodules produced pcms ?
I do not understand the problem entirely, could you propose wording for the FIXME?


================
Comment at: clang/lib/Interpreter/Interpreter.cpp:43
+#include "llvm/Support/Host.h"
+//
+
----------------
teemperor wrote:
> v.g.vassilev wrote:
> > teemperor wrote:
> > > I think the comment here/above and some of the empty lines aren't really needed here.
> > Yeah, the include section surrounded with `//` is to keep track of the includes required by `IncrementalCompilerBuilder::create`. I am not sure where this code should live... 
> > 
> > Ideally, I'd like to have it in a place where it can be included by other clients which require compiler instance set up for incremental compilation.
> I see. I don't really see a better place for this (I guess this depends on what these other clients actually need such a CompilerInstance for). IMHO just having this in the normal includes is fine for now.
> 
> FWIW, I think some of these includes can go:
> 
> VerifyDiagnosticConsumer.h -> not used
> TextDiagnosticPrinter.h -> not used
> ObjectFilePCHContainerOperations.h -> only needed for the mysterious code below
> ErrorHandling.h -> not used
Let's keep it here for now and we can always move somewhere else. Fixing the includes. Thanks!


================
Comment at: clang/lib/Interpreter/Interpreter.cpp:140
+
+  if (std::find(ClangArgv.begin(), ClangArgv.end(), " -x") == ClangArgv.end()) {
+    // We do C++ by default; append right after argv[0] if no "-x" given
----------------
teemperor wrote:
> teemperor wrote:
> > `llvm::find(ClangArgv, " -x")`
> Sorry, I actually wanted to recommend `llvm::is_contained`, my bad!
No worries, fixed.


================
Comment at: clang/tools/clang-repl/CMakeLists.txt:3
+#  ${LLVM_TARGETS_TO_BUILD}
+#  Option
+  Support
----------------
teemperor wrote:
> v.g.vassilev wrote:
> > teemperor wrote:
> > > Commented out by mistake?
> > Does not seem to be required. I will remove it for now...
> I meant that shouldn't have been commented out (as I think we don't get the targets symbols from any other library so this fails to link)
It compiles just fine for me. Should we add it back once we find the setup that requires it?


================
Comment at: clang/tools/clang-repl/ClangRepl.cpp:59
+    ClangArgs.push_back("-Xclang");
+    ClangArgs.push_back("-emit-llvm-only");
+  }
----------------
teemperor wrote:
> This is probably worth a comment why this is needed. Also because it seems important and only gets added when the flags are empty, this probably means that setting*any* Clang arg (`-std=XYZ`) would cause troubles? I think this could just be unconditionally added (as the last flag has the highest priority over whatever the user passed in before).
Good point!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96033/new/

https://reviews.llvm.org/D96033



More information about the cfe-commits mailing list