[PATCH] D77811: [clangd] Implement semanticTokens modifiers

Sun Jan 31 23:56:00 PST 2021

nridge added a comment.

Thanks a lot for working on this! I agree this is well aligned with the goals of D77702 <https://reviews.llvm.org/D77702> (and in fact goes even further and meets the goals of D66990 <https://reviews.llvm.org/D66990> via the `declaration` modifier). The other modifiers are neat as well; I like the expressiveness of the kind + modifier model in general.

> This addresses some of the goals of D77702 <https://reviews.llvm.org/D77702>, but leaves some things undone.
> Mostly because I think these will want some discussion.
>
> - no split between dependent type/name. (We may want to model this as a modifier, type+dependent vs ???+dependent)

+1

> - no split between primitive/typedef. (Is introducing a nonstandard kind is worth this distinction?)

Here's my general take on this:

- There is too much variety between programming languages to expect that a standard list of highlighting kinds is going to satisfactorily cover all languages.
- If LSP aims to be the basis for tools that are competitive with purpose-built language-specific tools, it's reasonable to allow for clients willing to go the extra mile in terms of customization (e.g. use a language-specific theme which provides colors for language-specific token kinds).
- Therefore, I would say yes, we should take the liberty of adding nonstandard highlighting kinds and modifiers where it makes sense from a language POV.

That said... for typedef specifically, I wonder if it actually makes more sense as a modifier than a kind. That is, have the kind be the target type (falling back to Unknown/Dependent if we don't know), and have a modifier flag for "the type is referred to via an alias" (as opposed to "the type is referred to directly"). WDYT?

> - no nonstandard local attribute This probably makes sense, I'm wondering if we want others and how they fit together.

Are you referring to local-scope variables (i.e. `FunctionScope` in D95701 <https://reviews.llvm.org/D95701>)? If so, I think the approach in that patch is promising.

================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:132
+    return isConst(AT->getElementType());
+  return isConst(T->getPointeeType());
+}
----------------
It seems a bit unintuitive that the path which non-pointer types (e.g. unqualified class or builtin types) take is to return false in this recursive call because `getPointeeType()` returns null... but I guess that works.

================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:136
+// Whether D is const in a loose sense (should it be highlighted as such?)
+bool isConst(const Decl *D) {
+  if (llvm::isa<EnumConstantDecl>(D) || llvm::isa<NonTypeTemplateParmDecl>(D))
----------------
Do you think in the future it might make sense to have the `readonly` modifier reflect, at least for variables, whether //the particular reference// to the variable is a readonly reference (e.g. binding the variable to a `const X&` vs. an `X&`)?

================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:184
+  if (const auto *FD = llvm::dyn_cast<FieldDecl>(D))
+    return FD->isCXXClassMember() && !FD->isCXXInstanceMember();
+  if (const auto *OMD = llvm::dyn_cast<ObjCMethodDecl>(D))
----------------
Will anything take this path (rather than being covered by `VD->isStaticDataMember()` above)?

================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.cpp:476
+      if (!Kind || (Result && Kind != Result))
+        continue;
+      Result = Kind;
----------------
This is a change of behaviour from before, in the case where the `ReferenceLoc` has multiple targets.

Before, we would produce at most one highlighting token for the `ReferenceLoc`. In the case where different target decls had different highlighting kinds, we wouldn't produce any.

Now, it looks like we produce a separate token for every target whose kind matches the kind of the first target (and skip targets with a conflicting kind).

Is that the intention?

It seems a bit strange: if we allow multiple tokens, why couldn't they have different kinds?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/AnnotateHighlightings.cpp:11
 #include "llvm/ADT/StringRef.h"
+#include "llvm/Support/ScopedPrinter.h"

----------------
Maybe add `// for llvm::to_string` at it's not obvious

================
Comment at: clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp:264
+        $Class[[D]] $Field_decl[[E]];
+        static double $StaticField_decl_static[[S]];
+        static void $StaticMethod_decl_static[[bar]]() {}
----------------
Presumably, the highlighting kinds `StaticField` and `StaticMethod` are going to be collapsed into `Field` and `Method` in a future change (after the removal of TheiaSemanticHighlighting, I guess)?

================
Comment at: clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp:718
+      };
+      )cpp",
+  };
----------------
Should we add a test case for `_deprecated` as well?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77811/new/

https://reviews.llvm.org/D77811