[PATCH] D77392: [WIP][clangd] Make signatureHelp work with stale preambles

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 9 13:07:38 PDT 2020


kadircet marked 3 inline comments as done.
kadircet added inline comments.


================
Comment at: clang-tools-extra/clangd/CodeComplete.cpp:1076
+  if (PatchAdditionalIncludes) {
+    for (const auto &Inc : newIncludes(
+             Input.Preamble.LexedIncludes,
----------------
sammccall wrote:
> This is going to be used in other places where we parse too, we'll want some abstraction to encapsulate the analysis, the tweaks to CompilerInvocation, and (for other uses) mapping of locations from the preamble file and injected regions.
> 
> As discussed offline, I think injecting by synthesizing a header snippet which is mapped into the VFS and then added to PreprocessorOpts.Includes is preferable to tweaking the CompilerInvocation directly:
>  - it extends to arbitrary directives, e.g. there's no way today to inject an `#import` directive via CompilerInvocation, no control over `<>` vs `""`, ...
>  - it's easier to debug as the injected region can easily be dumped as text
>  - it's easier to for code to recognize locations in our injected region if we're not sharing it with e.g. `-D`s coming from the commandline
>  - (I think) it lets us use the PresumedLocation mechanism to record the association of each injected line with the corresponding line in the mainfile (not sure if this is valuable if we also need to translate the other way)
> 
> So I'd probably suggest something like
> 
> ```
> class PreamblePatch { // ends up stored in ParsedAST
> public:
>   PreamblePatch(); // empty, used when no patching is done
>   static PreamblePatch compute(const ParseInputs&);
> 
>   void apply(CompilerInvocation&); // adjusts PPOptions.{RemappedFileBuffers, Includes} to include injected header
> 
>   friend operator<<(raw_ostream&, const PreamblePatch &); // probably just dumps InjectedHeaderContent
> 
>   // later... not sure about signature/names here
>   SourceLocation parsedToUserLoc(SourceLocation);
>   SourceLocation userToParsedLoc(SourceLocation); // is this also needed?
> 
> private:
>   std::string InjectedHeaderContent; // with #line directives, or at least comments 
>   // some data structure to map between locations in the stale preamble and their user locations
>   // some data structure to map between locations in the injected header and their user locations
> };
> ```
implemented an initial version of this, no sourcelocation handling yet.

compute also takes the preamble we are patching though.


================
Comment at: clang-tools-extra/clangd/Headers.cpp:237
 
+std::vector<Inclusion> getPreambleIncludes(llvm::StringRef Contents,
+                                           const LangOptions &LangOpts) {
----------------
sammccall wrote:
> The tradeoff between raw-lexing and running the actual preprocessor in SingleFileParseMode seems pretty interesting.
> (I'm not suggesting any particular change here, but I think it's worth exploring at some point. If the raw lexing is "for now" let's call that out, if it's a decision let's justify it. Also just curious what you think at this point - I know we've discussed this in the past and I pushed for raw-lexing)
> 
> ---
> 
> The SingleFileParseMode is more accurate:
>  - it's probably much easier to handle more directives correctly
>  - it will handle ifdefs correctly where possible without reading files
>  - it will tell us what includes actually resolve to (I'm not sure if this is useful though, when you can inject an #include with the right spelling and check that later)
>  - ??? maybe we can recognize bad directives that we want to avoid injecting for better results ???
>  - we don't have to do weird stuff like track the raw-lexed includes in preambledata
> 
> But potentially expensive with IO:
>  - it will stat all the files along the search path.
>  - it will ::realpath all the files that were found
> 
> Possible workaround for stat along the search path: most of the time, the #include directives are going to refer to files that were already included in the previous preamble (i.e. the preamble we're trying to patch) so we (heuristically) know what they resolve to.
> If we used the HeaderSearchInfo's alias map to redirect <foo.h> to </absolute/path/to/foo.h> then `#include <foo.h>` will just cost a single stat. And when our heuristic is wrong and <foo.h> resolves to something else, we'll just get an include-not-found, we can still add the #include.
> 
> More radical workaround: give it a completely null VFS so it can't do any IO. All includes will fail to resolve, but if we don't need their resolved path at this point we don't care, right?
the choice of raw lexer was more of a shortcut. i was just trying to get an initial version, tried to put some ideas in comments.

Will experiment with couple of options, but using alias mapping to map written includes to resolved absolute paths and also using the statcached vfs sounds like the best thing we can do for now.

We should also put this behind a "raw string equality" check to not perform any IO in the most common case(preamble is same).

Moreover, we can also put this logic into TUScheduler to not even perform string equality checks in case the contents haven't been modified. Not sure how useful that'll be though, as we usually receive a signaturehelp request after a modification to the source code.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77392/new/

https://reviews.llvm.org/D77392





More information about the cfe-commits mailing list