[PATCH] D78521: [clangd] Extend dexp to support remote index

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 23 13:34:12 PDT 2020


sammccall added inline comments.


================
Comment at: clang-tools-extra/clangd/Features.inc.in:2
 #define CLANGD_BUILD_XPC @CLANGD_BUILD_XPC@
+#define CLANGD_REMOTE @CLANGD_ENABLE_REMOTE@
----------------
nit: can we use the same name for the cmake variable and the preprocessor define?

There's a lot of potential for conceptual confusion between them, but I don't think giving them subtly different names helps, I'd rather make it just work for confused people :-)


================
Comment at: clang-tools-extra/clangd/index/dex/dexp/Dexp.cpp:14
 
+#include "Features.inc.in"
 #include "SourceCode.h"
----------------
Features.inc - that's the version that's been cmake-substituted
(I'm not sure why this isn't a regular include-guarded header, let's fix that separately)


================
Comment at: clang-tools-extra/clangd/index/dex/dexp/Dexp.cpp:18
 #include "index/dex/Dex.h"
+#include "index/remote/Client.h"
 #include "llvm/ADT/ScopeExit.h"
----------------
Being able to include this header but not call the function in it doesn't make sense - linker errors aren't a friendly way to surface this constraint.

If the goal is to forbid use of the API without it compiled in, the header should `#ifndef #CLANGD_ENABLE_REMOTE #error "Client.h included but CLANGD_ENABLE_REMOTE is off" #endif`, and all the include-sites should be #ifdef-guarded.

(But please see other comments first)


================
Comment at: clang-tools-extra/clangd/index/dex/dexp/Dexp.cpp:43
+// FIXME(kirillbobyrev): This is not working properly for some reason.
+llvm::cl::opt<bool>
+    RemoteMode("remote",
----------------
I'd consider rather making this a string prefix on `IndexLocation`, so you'd invoke as `dexp remote:localhost:4004` or whatever.
This encapsulates more easily, e.g. we can eventually move this into `openIndex` so all the callers transparently support the remote index.
(Though we shouldn't push that dependency into clangd in *this* patch, it just makes it easier in future)


================
Comment at: clang-tools-extra/clangd/index/remote/Client.h:21
+///
+/// Caller is not blocked upon SymbolIndex request, the actual connection must
+/// happen upon the first request. RPCs have no timeout.
----------------
I can't follow this sentence - it's not clear what "SymbolIndex request" or "the actual connection" are.

I guess you mean something like "This method blocks to resolve the address, but does not wait for the channel to become ready".



================
Comment at: clang-tools-extra/clangd/index/remote/Client.h:21
+///
+/// Caller is not blocked upon SymbolIndex request, the actual connection must
+/// happen upon the first request. RPCs have no timeout.
----------------
sammccall wrote:
> I can't follow this sentence - it's not clear what "SymbolIndex request" or "the actual connection" are.
> 
> I guess you mean something like "This method blocks to resolve the address, but does not wait for the channel to become ready".
> 
The blocking behaviour is interesting:
 - clangd should keep working while offline or in bad connectivity (thus no timeout on RPCs is a bug, we shouldn't rely on the channel outright detecting failure)
 - this implies you don't want to return null if the connection can't be established, and so there's indeed not much point blocking. But blocking on resolving the name removes a lot of the benefit here.

I think GRPC actually has a pretty good model, it's documented here: https://github.com/grpc/grpc/blob/master/doc/connectivity-semantics-and-api.md
(may be worth adding a link).
I'm not sure if this is actually blocked on name resolution, but it seems like it's at most blocked on *one attempt* at name resolution: the channel can still be created if resolution hasn't succeeded yet.


================
Comment at: clang-tools-extra/clangd/index/remote/Client.h:22
+/// Caller is not blocked upon SymbolIndex request, the actual connection must
+/// happen upon the first request. RPCs have no timeout.
+///
----------------
RPCs have no timeout should be a FIXME


================
Comment at: clang-tools-extra/clangd/index/remote/Client.h:29
+///   function's implementation will not be compiled.
+/// * Even though the address is resolved during the function call, the server
+///   availability is only required during RPC calls.
----------------
I'm not sure what this is saying or why a caller would want to know it.


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:97
+std::unique_ptr<SymbolIndex> connect(llvm::StringRef Address) {
+#ifdef CLANGD_REMOTE
+  return std::unique_ptr<SymbolIndex>(new IndexClient(Address));
----------------
kbobyrev wrote:
> sammccall wrote:
> > if  remote is disabled we can't compile the rest of this file.
> > Rather than ifdefs here, I'd suggest doing it at the cmake level:
> > 
> > ```
> > if(CLANGD_ENABLE_REMOTE)
> >   add_clang_library(... Index.cpp)
> > else()
> >   add_clang_library(... IndexUnimplemented.cpp)
> > endif()
> > ```
> I would argue that including `index/remote/Client.h` without `CLANGD_ENABLE_REMOTE` is not correct. We would still have to put `#ifdefs` in the user code regardless of whether what you proposed is implemented. I don't see any benefits in allowing users to include `index/remote/Client.h`, link `clangdRemoteIndex` and getting a runtime error. All of those steps have _remote_ in them and if _remote mode_ is not enabled, something certainly went wrong.
> 
> Also, this will complicate CMake structure as I can't put files that are conditionally added/excluded from the clang library and I would either have to make a separate directory with an "empty" library, or put a bunch of `#ifdefs` in here. Either is not optimal and I think it'd be better to leave it like this. WDYT?
> I would argue that including index/remote/Client.h without CLANGD_ENABLE_REMOTE is not correct. We would still have to put #ifdefs in the user code regardless of whether what you proposed is implemented

This is begging the question - what I'm proposing is to give that header/library defined behavior when CLANGD_ENABLE_REMOTE is off, and so you wouldn't need #ifdefs.

> I don't see any benefits in allowing users to include index/remote/Client.h, link clangdRemoteIndex and getting a runtime error.

The benefits are:
 - no ifdefs and fewer conditional cmake rules, which are both hard to read and result in untested constructs without even basic syntax checking
 - fewer cases to consider in clients, as this falls into the regular "can't connect to index" codepath
 - readers trying to follow the possible behaviours end up at the header documentation of a function that is called, which is an easy and familiar workflow

> All of those steps have _remote_ in them and if _remote mode_ is not enabled, something certainly went wrong

Again, this is begging the question. If the header says "if GRPC mode is not enabled, always returns nullptr", then everything is working as designed.

> Also, this will complicate CMake structure as I can't put files that are conditionally added/excluded from the clang library

Why not? (This sounds like a fair technical objection, but may be surmountable)


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:1
+//===--- Index.cpp -----------------------------------------------*- C++-*-===//
+//
----------------
This implements Client.h, should be Client.cpp


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:14
+#include "Trace.h"
+#include "grpcpp/grpcpp.h"
+#include "index/Serialization.h"
----------------
I'd consider using `<> for this include, or at least splitting it into a different section - non-LLVM non-stdlib includes are really rare.


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:26
+void streamRPC(
+    llvm::StringRef SpanName, SymbolIndex::Stub *Stub, ClangdRequestT Request,
+    std::function<std::unique_ptr<grpc::ClientReader<ReplyT>>(
----------------
no need to pass the stub if you make this a member func


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:27
+    llvm::StringRef SpanName, SymbolIndex::Stub *Stub, ClangdRequestT Request,
+    std::function<std::unique_ptr<grpc::ClientReader<ReplyT>>(
+        SymbolIndex::Stub *, grpc::ClientContext *, const RPCRequestT &)>
----------------
Hmm, I think member pointers are totally the right thing here. And all the params should be deducible.

Have a look at https://godbolt.org/z/AnprJ- (simplified but self-contained and I think shows all the bits)

The thing I can't work out how to do is make the member pointer a template param *and* have it be deduced...


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:31
+    llvm::function_ref<void(ReplyT, llvm::UniqueStringSaver *)> Callback) {
+  trace::Span Tracer(SpanName);
+  const auto RPCRequest = toProtobuf(Request);
----------------
SpanName can be RPCRequestT::descriptor()->name() (e.g. "LookupRequest") rather than having to pass it explicitly


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:46
+  IndexClient(llvm::StringRef Address)
+      : Stub(remote::SymbolIndex::NewStub(grpc::CreateChannel(
+            Address.str(), grpc::InsecureChannelCredentials()))) {}
----------------
not even attempting to open the connection on startup seems to introduce more latency on first request than necessary. Call Channel->GetState(true)?
That won't block, but it will cause the connection to establish in the background.


================
Comment at: clang-tools-extra/clangd/index/remote/Index.cpp:68
+        &remote::SymbolIndex::Stub::FuzzyFind,
+        [&](const FuzzyFindReply &Message, llvm::UniqueStringSaver *Strings) {
+          if (!Message.has_stream_result()) {
----------------
this callback is almost the same - if you added a dummy stream result to Lookup you could move it into streamrpc :-)


================
Comment at: clang-tools-extra/clangd/index/remote/Marshalling.h:25
+clangd::FuzzyFindRequest fromProtobuf(const FuzzyFindRequest *Request);
+llvm::Expected<clangd::Symbol> fromProtobuf(const Symbol &Message,
+                                            llvm::UniqueStringSaver *Strings);
----------------
You could also just log the error and return Optional at this level... there's no recovery other than logging anyway. I kind of hate Expected/Error because of their asserting and weird non-constness so I'd be tempted by that.
Up to you though


================
Comment at: clang-tools-extra/clangd/index/remote/server/CMakeLists.txt:17
 
-  protobuf
-  grpc++
-  clangDaemon
+  clangdRemoteIndex
   )
----------------
why does the server depend on the client?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78521/new/

https://reviews.llvm.org/D78521





More information about the cfe-commits mailing list