[PATCH] D42640: [clangd] Prototype: collect symbol #include & insert #include in global code completion.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 2 10:08:59 PST 2018


sammccall added a comment.

TL;DR:

- I think we can expose fewer types/abstractions in the header mapping, and put it all in one header. If we want some functionality to be global-index only, we should make it an option to the symbolcollector, not stash the logic somewhere else I think.
- regex seems like a big hammer, but we can revisit if this is on the hot path
- the overridden path should probably be an "include string" rather than a URI
- lots of naming nits as usual :-)
- I haven't reviewed the actual preprocessor/include-path madness yet :-)



================
Comment at: clangd/ClangdLSPServer.cpp:161
+    ApplyEdit.edit = WE;
+    reply(C, "Inserted header " + Params.includeInsertion->headerUri);
+    call(C, "workspace/applyEdit", ApplyEdit);
----------------
The caveat from apply fix also applies here. Maybe we should combine the tail of this case and the tail of apply-fix (the part from WorkspaceEdit)? They're simple (and wrong) now, but will need the same changes.


================
Comment at: clangd/ClangdServer.cpp:411
 
+Expected<tooling::Replacements>
+ClangdServer::insertInclude(const Context &Ctx, PathRef File, StringRef Code,
----------------
(I still want to look at  this bit... on monday :-)


================
Comment at: clangd/CodeComplete.cpp:254
   // Builds an LSP completion item.
-  CompletionItem build(const CompletionItemScores &Scores,
+  CompletionItem build(llvm::StringRef FileName, llvm::StringRef Contents,
+                       const CompletionItemScores &Scores,
----------------
contents is unused


================
Comment at: clangd/CodeComplete.cpp:289-290
+        if (!D->HeaderUri.empty()) {
+          Command Cmd;
+          Cmd.title = "Insert #include";
+          Cmd.command = ExecuteCommandParams::CLANGD_INSERT_HEADER_INCLUDE;
----------------
not sure where this is shown, but maybe add "if needed"?


================
Comment at: clangd/CodeComplete.cpp:296
+          Cmd.includeInsertion = std::move(Insertion);
+          I.command = std::move(Cmd);
+        }
----------------
LSP says we should use additionalTextEdits instead.
I think we have OK reasons, but you probably want to mention it here.


================
Comment at: clangd/CodeComplete.cpp:836
                        if (Recorder.CCSema)
-                         Output = runWithSema();
+                         Output = runWithSema(SemaCCInput.FileName,
+                                              SemaCCInput.Contents);
----------------
What do you think about (redundantly) passing filename to the constructor, and stashing it in a member, instead of passing to all the methods here?
I think main purpose of having this class is not having the *primary* data flow obscured by all the various context variables.

If you don't like that, we could also pass the SemaCCInput struct to the constructor itself. The distinction between constrcutor params and run() params is pretty artificial.

(as noted above, I think/hope Contents is unused in any case)


================
Comment at: clangd/Protocol.cpp:370
     Result["additionalTextEdits"] = json::ary(CI.additionalTextEdits);
+  if (CI.command) {
+    Result["command"] = *CI.command;
----------------
nit: drop braces consistent with the other code around here


================
Comment at: clangd/Protocol.h:380
+
+  std::string headerUri;
+  /// Note: "documentChanges" is not currently used because currently there is
----------------
I would call this "header". At least in the current implementation, it's not a URI, and we we really don't want anyone trying to parse it.


================
Comment at: clangd/Protocol.h:381
+  std::string headerUri;
+  /// Note: "documentChanges" is not currently used because currently there is
+  /// no support for versioned edits.
----------------
I don't think this comment applies here?


================
Comment at: clangd/Protocol.h:398
   const static llvm::StringLiteral CLANGD_APPLY_FIX_COMMAND;
 
+  const static llvm::StringLiteral CLANGD_INSERT_HEADER_INCLUDE;
----------------
nit: remove blank line (and below). These form a list.


================
Comment at: clangd/Protocol.h:412
 
+struct Command {
+  std::string title;
----------------
Hmm, you could consider `struct Command : public ExecuteCommandParams { std::string title; }`

I guess it's a hack, but this repetition seems a little silly. You could then reuse some of the JSON stuff too.


================
Comment at: clangd/global-symbol-builder/PragmaCommentHandler.h:1
+//===-- PragmaCommentHandler.h - find all symbols----------------*- C++ -*-===//
+//
----------------
can this also be put in HeaderMap.h?


================
Comment at: clangd/global-symbol-builder/PragmaCommentHandler.h:28
+/// https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/IWYUPragmas.md#iwyu-pragma-private
+class PragmaCommentHandler : public clang::CommentHandler {
+public:
----------------
I'm not sure this class needs to be exposed, can we just expose a function `unique_ptr<CommentHandler> collectHeaderMaps(HeaderMapCollector*)`?


================
Comment at: clangd/global-symbol-builder/STLPostfixHeaderMap.h:1
+//===-- STLPostfixHeaderMap.h - hardcoded header map for STL ----*- C++ -*-===//
+//
----------------
STL isn't really a name we want to be using :-)

Why isn't this in HeaderMapCollector.h?


================
Comment at: clangd/index/HeaderMapCollector.h:9
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_HEADERMAPCOLLECTOR_H
----------------
context! what are header mappings, why do we use them (example)


================
Comment at: clangd/index/HeaderMapCollector.h:23
+/// complete header names or header name regex patterns to header names.
+class HeaderMapCollector {
+public:
----------------
I'm not sure "header mapping" is as descriptive as it should be, and "collector':
 - we're mapping *to a canonical* representation
 - the context is "how should we spell this in a #include"

"Collector" doesn't really describe the class - it both records and applies rules.

Maybe CanonicalIncludes?


================
Comment at: clangd/index/HeaderMapCollector.h:25
+public:
+  typedef llvm::StringMap<std::string> HeaderMap;
+  typedef std::vector<std::pair<const char *, const char *>> RegexHeaderMap;
----------------
these public typedefs and the explicit constructor seem leaky. Maybe just expose a public function `addStandardLibraryMapping(CanonicalIncludes*)`?


================
Comment at: clangd/index/HeaderMapCollector.h:31
+
+  void addHeaderMapping(llvm::StringRef OrignalHeaderPath,
+                        llvm::StringRef MappingHeaderPath) {
----------------
nit: just addMapping?


================
Comment at: clangd/index/HeaderMapCollector.h:41
+  /// a mapped header name.
+  llvm::StringRef getMappedHeader(llvm::StringRef Header) const;
+
----------------
nit: just mapHeader?


================
Comment at: clangd/index/HeaderMapCollector.h:41
+  /// a mapped header name.
+  llvm::StringRef getMappedHeader(llvm::StringRef Header) const;
+
----------------
sammccall wrote:
> nit: just mapHeader?
what's the input? absolute path? working-directory-relative? include-path relative?

Similarly, what's the output?

(This also relates to addHeaderMapping, so might belong in the class desc)


================
Comment at: clangd/index/HeaderMapCollector.h:48
+  // A map from header patterns to header names.
+  // The header names are not owned. This is only threadsafe because the regexes
+  // never fail.
----------------
This comment and the mutable are scary. We don't need to be threadsafe I think?


================
Comment at: clangd/index/HeaderMapCollector.h:50
+  // never fail.
+  mutable std::vector<std::pair<llvm::Regex, const char *>>
+      RegexHeaderMappingTable;
----------------
all the rules we have so far are suffix matches. It seems a little weird to allow arbitrary regexes (and prevents fast lookup by sorting the reversed strings).

But it's a small amount of code and I guess the table will stay small. Maybe a FIXME that performance could be improved here?


================
Comment at: clangd/index/Index.h:155
     llvm::StringRef CompletionDetail;
+    /// A URI for the header to be #include'd for this symbol, or a header like
+    /// "<...>" for system headers that are not repository independent (e.g. STL
----------------
I think we concluded this shouldn't be a URI but a literal include string. If absent, we should fall back to CanonicalDeclaration.

The reason is that we choose to interpret the target of IWYU directives as a literal string to include. Alternatively we could choose to resolve it somehow and store the URI to the resolved target, but that would be more work.


================
Comment at: clangd/index/SymbolCollector.cpp:113
+// to define.
+bool shouldCollectIncludePath(index::SymbolKind Kind) {
+  using SK = index::SymbolKind;
----------------
shouldn't this just be return SK != Namespace, to avoid duplication? Since we only get here if we're indexing this symbol?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D42640





More information about the cfe-commits mailing list