[PATCH] D52276: [clangd] Add type boosting in code completion

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 23 06:17:01 PST 2018


sammccall added inline comments.


================
Comment at: clangd/CodeComplete.cpp:1307
                   getCompletionKindString(Recorder->CCContext.getKind()));
       log("Code complete: sema context {0}, query scopes [{1}] (AnyScope={2})",
           getCompletionKindString(Recorder->CCContext.getKind()),
----------------
Here would be a good place to include the expected type in a log message


================
Comment at: clangd/CodeComplete.cpp:1519
+            PreferredType->raw() == Candidate.IndexResult->Type) {
+          vlog("Types matched in index for {0}", Candidate.Name);
+          Relevance.TypeMatchesPreferred = true;
----------------
This is too verbose for vlog... could be dlog but I'm not sure it's actually needed in the end, we have code completion signal dumps. You should update Quality.cpp so the new signal is included :-)


================
Comment at: clangd/Quality.h:98
+  /// Whether the item matches the type expected in the completion context.
+  bool TypeMatchesPreferred = false;
   /// FIXME: unify with index proximity score - signals should be
----------------
ilya-biryukov wrote:
> sammccall wrote:
> > sammccall wrote:
> > > you've inserted in the middle of the file proximity stuff :-)
> > Generally we'd put both context/symbol types as the signal here, rather than just whether they match, unless it's prohibitive. They'd get populated manually, and by merge() overloads, respectively.
> I did it to avoid inefficiencies of:
> 1. copying the context type for each completion item,
> 2. copying the symbol type for each of the indexed items.
> 
> My understanding is that (1) can be avoided by storing a reference to a type, since it outlives the signals.
> But we'll still have to do a copy for (2), right?
> 
> That's probably not a bottleneck anyway, but keeping it a bool flag for now and happy to change it  to your liking.
Agree with your analysis.

I think our goal here should be to have a model with small # of per-symbol types, and small size of each type, so the per-symbol copy would be cheap. Sequencing is tricky - we don't have that optimization yet. Can we replay a session with some code completion and see if it shows up in a profiler?

(Perfect would be one type per symbol, types are 8-byte hashes. Unfortunately I think pointer->base conversions mean symbols must admit multiple tokens. But I have some ideas...)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D52276





More information about the cfe-commits mailing list