[PATCH] D54799: [clangd][WIP] textDocument/CursorInfo method

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 23 01:01:28 PST 2018


sammccall added a subscriber: hokein.
sammccall added a comment.

In https://reviews.llvm.org/D54799#1306488, @jkorous wrote:

> >> - conditional return in `getCursorInfo` - Should we return for example data with empty `USR`?
> > 
> > Please return a symbol unless it has no SymbolID (we don't treat those as symbols in clangd).
>
> Ok.


Sorry, this was a mistake - we do in fact use decls that don't have symbol IDs (we just don't look them up in the index).
So I think both SymbolID and USR are optional.



================
Comment at: ClangdServer.h:205
+  /// Get cursor info for given position.
+  void getCursorInfo(PathRef File, Position Pos,
+                     Callback<llvm::Optional<IdentifierData>> CB);
----------------
sammccall wrote:
> nit: just `cursorInfo` (or `symbolInfo` etc) for consistency with other where there isn't a specific verb.
(not done I think? Still says `getSymbolInfo`)


================
Comment at: Protocol.h:684
+
+  /// The USR of this symbol.
+  llvm::SmallString<128> USR;
----------------
jkorous wrote:
> sammccall wrote:
> > USR could use some definition/references.
> True. Would you go further than expanding the acronym and copying description from doxygen annotation of `SymbolID`?
> I haven't actually found any better description in clang repository.
> 
> Shall I note `include/clang/Index/USRGeneration.h`?
I'd add something like:

```
This is an opaque string uniquely identifying a symbol.
Unlike SymbolID, it is variable-length and somewhat human-readable.
It is a common representation across several clang tools (See USRGeneration.h)
```


================
Comment at: Protocol.h:689
+  /// Size is dependent on clang::clangd::SymbolID::RawValue.
+  llvm::SmallString<32> ID;
+};
----------------
jkorous wrote:
> sammccall wrote:
> > can this just be SymbolID?
> I didn't want to pull `SymbolID` from `index` originally but sure.
> 
> Are you fine with it living in `Protocol.h` or shall I move it/factor it out?
SymbolID shouldn't live in `Protocol.h`.

It hadn't occurred to me, but including all of `Index/Index.h` is risky here - it's pretty broad in scope and we run the risk of a circular dependency down the line. Sorry for not noticing!

I'm fine with any of:
 - pull out SymbolID into `Index/SymbolID.h` which would have few dependencies
 - revert to using std::string here as you originally did
 - depend on index.h for now, and solve this when it becomes a problem


================
Comment at: Protocol.h:690
+  llvm::SmallString<32> ID;
+};
+llvm::json::Value toJSON(const IdentifierData &);
----------------
jkorous wrote:
> sammccall wrote:
> > I think SymbolKind would be useful (in particular distinguishing macros might be important?)
> > 
> > This does point towards maybe adding to SymbolInformation rather than a new struct, but I'm not sure which is best.
> AFAIK for our use-case the information in USR is good enough (e. g. "c:foo.cpp at 24@macro at FOO") so we might wait until we have a real need. WDYT?
I would personally advise against parsing information out of USRs, but up to you.
(On the other hand if you're just using them as keys to look up information from elsewhere, that makes sense)


================
Comment at: XRefs.cpp:95
   Preprocessor &PP;
+  const bool StopOnFirstDeclFound;
 
----------------
jkorous wrote:
> sammccall wrote:
> > Please don't do this - it's inconsistent with the other XRefs features.
> > (If for some reason you *really* want just one result, then this is easy to do on the client side if we get the API right)
> Sure, let's discuss. I probably got confused by your suggestion of using `getSymbolAtPosition()` in https://reviews.llvm.org/D54529#1299531.
> 
> Do I understand it correctly that you mean visiting all declarations and selecting the single explicit one in `getSymbolInfo()`?
> Or do you actually mean change of interface such as this? `llvm::Optional<SymbolDetails> getSymbolInfo()` -> `std::vector<SymbolDetails> getSymbolInfo()`.
> Is my assumption that there could be only single explicit declaration for any given position correct in the first place?
I looked into this, and the multiple-symbols is more of a mess than I thought it was.

At a high level: I think it's important that various features that use "symbol under cursor" are consistent. If you do defs or refs or highlights at a point, you should be querying the same set of symbols. And the same goes for anything that starts with a symbolInfo call.

Unfortunately the way the sorting was introduced in rL341463 meant the existing things are not quite consistent, which confuses things a bit.

But at a high level:
 - the status quo is that findDefinition/findRefs/highlights deal with multiple symbols. So yes, `symbolInfo` should return a `vector<SymbolDetails>` as things stand.
 - the multiple-symbols situation is a bit surprising and maybe not that useful. It might be possible to change this behavior. It should be changed for all methods (i.e. either we land this patch with `vector<SymbolDetails>` and then (breaking) change it to `Optional<SymbolDetails>` later, or we remove the multiple-symbols behavior first.
 - I don't know what the deal is with "explicit". The cases with multiple symbols are already obscure, cases with multiple explicit symbols are probably even rarer if they exist. But I don't think it's particularly safe to assume they don't. And getSymbolInfo really shouldn't be adding its own unique logic around this.

@hokein @ilya-biryukov We should chat about the historical context here.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D54799





More information about the cfe-commits mailing list