[PATCH] D54799: [clangd][WIP] textDocument/CursorInfo method

Jan Korous via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 23 04:57:27 PST 2018


jkorous marked 7 inline comments as done.
jkorous added a comment.

In https://reviews.llvm.org/D54799#1306585, @sammccall wrote:

> So I think both SymbolID and USR are optional.


No problem.
I am just wondering if it make sense to include any symbol with empty name, empty USR and no ID in LSP response. I assume either clangd or our LSP client (and hypothetically others too) will have to filter these out. But it might violate consistency with other LSP methods. Anyway, if we agree on filtering such symbols either in `getSymbolInfo()` or `ClangdServer::getSymbolInfo()` I am happy to do it, if not it's fine with me.



================
Comment at: ClangdServer.h:205
+  /// Get cursor info for given position.
+  void getCursorInfo(PathRef File, Position Pos,
+                     Callback<llvm::Optional<IdentifierData>> CB);
----------------
sammccall wrote:
> sammccall wrote:
> > nit: just `cursorInfo` (or `symbolInfo` etc) for consistency with other where there isn't a specific verb.
> (not done I think? Still says `getSymbolInfo`)
Sorry, my bad - I missed the verb dropping. Fixing now.
BTW if we want consistency here we should probably rename `findReferences()`, `findHover()` too.


================
Comment at: Protocol.h:689
+  /// Size is dependent on clang::clangd::SymbolID::RawValue.
+  llvm::SmallString<32> ID;
+};
----------------
sammccall wrote:
> jkorous wrote:
> > sammccall wrote:
> > > can this just be SymbolID?
> > I didn't want to pull `SymbolID` from `index` originally but sure.
> > 
> > Are you fine with it living in `Protocol.h` or shall I move it/factor it out?
> SymbolID shouldn't live in `Protocol.h`.
> 
> It hadn't occurred to me, but including all of `Index/Index.h` is risky here - it's pretty broad in scope and we run the risk of a circular dependency down the line. Sorry for not noticing!
> 
> I'm fine with any of:
>  - pull out SymbolID into `Index/SymbolID.h` which would have few dependencies
>  - revert to using std::string here as you originally did
>  - depend on index.h for now, and solve this when it becomes a problem
To me the cleanest solution seems to be `Index/SymbolID.h`.

Generally if we ever get to a point where we'd hit this kind of issues more often it might be worthwhile to keep data structures for LSP protocol and for server implementation separate. But I don't think we're anywhere close yet.


================
Comment at: Protocol.h:690
+  llvm::SmallString<32> ID;
+};
+llvm::json::Value toJSON(const IdentifierData &);
----------------
sammccall wrote:
> jkorous wrote:
> > sammccall wrote:
> > > I think SymbolKind would be useful (in particular distinguishing macros might be important?)
> > > 
> > > This does point towards maybe adding to SymbolInformation rather than a new struct, but I'm not sure which is best.
> > AFAIK for our use-case the information in USR is good enough (e. g. "c:foo.cpp at 24@macro at FOO") so we might wait until we have a real need. WDYT?
> I would personally advise against parsing information out of USRs, but up to you.
> (On the other hand if you're just using them as keys to look up information from elsewhere, that makes sense)
I assume that's what we do but I'll let @benlangmuir or @arphaman confirm this.


================
Comment at: XRefs.cpp:95
   Preprocessor &PP;
+  const bool StopOnFirstDeclFound;
 
----------------
hokein wrote:
> sammccall wrote:
> > jkorous wrote:
> > > sammccall wrote:
> > > > Please don't do this - it's inconsistent with the other XRefs features.
> > > > (If for some reason you *really* want just one result, then this is easy to do on the client side if we get the API right)
> > > Sure, let's discuss. I probably got confused by your suggestion of using `getSymbolAtPosition()` in https://reviews.llvm.org/D54529#1299531.
> > > 
> > > Do I understand it correctly that you mean visiting all declarations and selecting the single explicit one in `getSymbolInfo()`?
> > > Or do you actually mean change of interface such as this? `llvm::Optional<SymbolDetails> getSymbolInfo()` -> `std::vector<SymbolDetails> getSymbolInfo()`.
> > > Is my assumption that there could be only single explicit declaration for any given position correct in the first place?
> > I looked into this, and the multiple-symbols is more of a mess than I thought it was.
> > 
> > At a high level: I think it's important that various features that use "symbol under cursor" are consistent. If you do defs or refs or highlights at a point, you should be querying the same set of symbols. And the same goes for anything that starts with a symbolInfo call.
> > 
> > Unfortunately the way the sorting was introduced in rL341463 meant the existing things are not quite consistent, which confuses things a bit.
> > 
> > But at a high level:
> >  - the status quo is that findDefinition/findRefs/highlights deal with multiple symbols. So yes, `symbolInfo` should return a `vector<SymbolDetails>` as things stand.
> >  - the multiple-symbols situation is a bit surprising and maybe not that useful. It might be possible to change this behavior. It should be changed for all methods (i.e. either we land this patch with `vector<SymbolDetails>` and then (breaking) change it to `Optional<SymbolDetails>` later, or we remove the multiple-symbols behavior first.
> >  - I don't know what the deal is with "explicit". The cases with multiple symbols are already obscure, cases with multiple explicit symbols are probably even rarer if they exist. But I don't think it's particularly safe to assume they don't. And getSymbolInfo really shouldn't be adding its own unique logic around this.
> > 
> > @hokein @ilya-biryukov We should chat about the historical context here.
> I have the same question when I first touched this code :) 
> IIRC, we want all symbols in GoToDef (as we can show multiple symbols to clients). Unfortunately, there are some inconsistencies between `def` (return all), `ref` (only return explicit symbols), `hover` (return the first one).
> 
> I agree that most cases are single-symbol, but the typical case with multiple-symbols is below. Personally, I'm +1 on removing the multiple-symbols behavior, it can simplify many things...
> 
> ```
> namespace ns {
> void foo(int);
> void foo(double);
> }
> using ns::f^oo;
> ```
Thanks a lot @hokein! This is really helpful because I was assuming there can be just a single explicit symbol the whole time. Probably because `getNamedDeclAt()` which was our starting point makes that assumption too.

I reread the discussion and agree with @sammccall about keeping the behavior consistent.

My thoughts are:
- I am going to use `vector<SymbolDetails>` as a return type.
- We need to land this patch soon-ish (ideally before the end of next week, hard deadline is mid-December) so we'd prefer not getting blocked by changing all XRefs functionality to single symbol.
- I assume our client can deal with any (rare) occurrence of multiple symbols in some reasonable fashion (use the first one, ignore all or similar). @benlangmuir could you please confirm this?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D54799





More information about the cfe-commits mailing list