[PATCH] D77392: [WIP][clangd] Make signatureHelp work with stale preambles

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 7 18:00:21 PDT 2020


sammccall added a comment.

This is the perfect feature to start with (doesn't actually need any location transforms, lets us drop TUScheduler features), so well done for finding that.

That said, high-level comments mostly about preamble patching in general rather than this particular case.



================
Comment at: clang-tools-extra/clangd/CodeComplete.cpp:1076
+  if (PatchAdditionalIncludes) {
+    for (const auto &Inc : newIncludes(
+             Input.Preamble.LexedIncludes,
----------------
This is going to be used in other places where we parse too, we'll want some abstraction to encapsulate the analysis, the tweaks to CompilerInvocation, and (for other uses) mapping of locations from the preamble file and injected regions.

As discussed offline, I think injecting by synthesizing a header snippet which is mapped into the VFS and then added to PreprocessorOpts.Includes is preferable to tweaking the CompilerInvocation directly:
 - it extends to arbitrary directives, e.g. there's no way today to inject an `#import` directive via CompilerInvocation, no control over `<>` vs `""`, ...
 - it's easier to debug as the injected region can easily be dumped as text
 - it's easier to for code to recognize locations in our injected region if we're not sharing it with e.g. `-D`s coming from the commandline
 - (I think) it lets us use the PresumedLocation mechanism to record the association of each injected line with the corresponding line in the mainfile (not sure if this is valuable if we also need to translate the other way)

So I'd probably suggest something like

```
class PreamblePatch { // ends up stored in ParsedAST
public:
  PreamblePatch(); // empty, used when no patching is done
  static PreamblePatch compute(const ParseInputs&);

  void apply(CompilerInvocation&); // adjusts PPOptions.{RemappedFileBuffers, Includes} to include injected header

  friend operator<<(raw_ostream&, const PreamblePatch &); // probably just dumps InjectedHeaderContent

  // later... not sure about signature/names here
  SourceLocation parsedToUserLoc(SourceLocation);
  SourceLocation userToParsedLoc(SourceLocation); // is this also needed?

private:
  std::string InjectedHeaderContent; // with #line directives, or at least comments 
  // some data structure to map between locations in the stale preamble and their user locations
  // some data structure to map between locations in the injected header and their user locations
};
```


================
Comment at: clang-tools-extra/clangd/Headers.cpp:237
 
+std::vector<Inclusion> getPreambleIncludes(llvm::StringRef Contents,
+                                           const LangOptions &LangOpts) {
----------------
The tradeoff between raw-lexing and running the actual preprocessor in SingleFileParseMode seems pretty interesting.
(I'm not suggesting any particular change here, but I think it's worth exploring at some point. If the raw lexing is "for now" let's call that out, if it's a decision let's justify it. Also just curious what you think at this point - I know we've discussed this in the past and I pushed for raw-lexing)

---

The SingleFileParseMode is more accurate:
 - it's probably much easier to handle more directives correctly
 - it will handle ifdefs correctly where possible without reading files
 - it will tell us what includes actually resolve to (I'm not sure if this is useful though, when you can inject an #include with the right spelling and check that later)
 - ??? maybe we can recognize bad directives that we want to avoid injecting for better results ???
 - we don't have to do weird stuff like track the raw-lexed includes in preambledata

But potentially expensive with IO:
 - it will stat all the files along the search path.
 - it will ::realpath all the files that were found

Possible workaround for stat along the search path: most of the time, the #include directives are going to refer to files that were already included in the previous preamble (i.e. the preamble we're trying to patch) so we (heuristically) know what they resolve to.
If we used the HeaderSearchInfo's alias map to redirect <foo.h> to </absolute/path/to/foo.h> then `#include <foo.h>` will just cost a single stat. And when our heuristic is wrong and <foo.h> resolves to something else, we'll just get an include-not-found, we can still add the #include.

More radical workaround: give it a completely null VFS so it can't do any IO. All includes will fail to resolve, but if we don't need their resolved path at this point we don't care, right?


================
Comment at: clang-tools-extra/clangd/Headers.h:198
 
+/// Gets the includes in the preamble section of the file by running lexer over
+/// \p Contents. Returned inclusions are sorted according to written filename.
----------------
I'm guessing the APIs we end up with probably belong more in Preamble.h rather than here, both thematically and in terms of layering


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77392/new/

https://reviews.llvm.org/D77392





More information about the cfe-commits mailing list