[PATCH] D136594: [clangd] Add support for semantic token type "operator"

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 25 02:57:22 PST 2022


sammccall added a comment.

Regarding AugmentsSyntaxTokens:

- in general this seems like a nice thing to support, though I'm not aware of "big" `!AugmentsSyntaxTokens` clients so it's not urgent
- I think the right design for that is to (conditionally) run the lexer and mark only tokens that haven't been marked yet. If the lexer doesn't provide enough info, then it's maybe not "syntactic".
- I think it's OK to "accidentally" produce some "syntactic" tokens in semantic-only mode - we shouldn't go out of our way to avoid doing so.

Since `&`, `&&`, `*` aren't always operators, and this is not syntactically distinguishable, I think we should just always emit all operator tokens, even in the standard `AugmentsSyntaxTokens` mode, and even for tokens that can only be operators. (In practice many other "obvious" operators aren't always either: `~` (destructor), `^` (objc blocks), `=` (lambda capture), `#` (directive vs pasting), `+-` (objc method specifiers)...)

This means AugmentSyntaxTokens doesn't belong in this patch, I think - as Nathan said we want to emit these tokens regardless. (Really sorry for the churn...)



================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.h:76
   ConstructorOrDestructor,
+  UserProvided,
 
----------------
nridge wrote:
> ckandeler wrote:
> > ckandeler wrote:
> > > sammccall wrote:
> > > > sammccall wrote:
> > > > > sammccall wrote:
> > > > > > Can you give some background here or on the bug tracker about what kind of distinction you're trying to draw here and why it's important?
> > > > > > (Most clients are never going to benefit from nonstandard modifiers so they should be pretty compelling)
> > > > > as well as being jargony, "user-provided" has a specific technical meaning that I don't think you intend here. For example, `auto operator<=>(const S&) const = default` is not user-provided (defaulted on first declaration). https://eel.is/c++draft/dcl.fct.def.default#5
> > > > > 
> > > > > If we need this and can't get away with reusing `defaultLibrary` (which would include `std::`) then maybe we should add something like `builtin` which seems quite reusable.
> > > > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > > > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> > > > as well as being jargony, "user-provided" has a specific technical meaning that I don't think you intend here. For example, `auto operator<=>(const S&) const = default` is not user-provided (defaulted on first declaration). https://eel.is/c++draft/dcl.fct.def.default#5
> > > > 
> > > > If we need this and can't get away with reusing `defaultLibrary` (which would include `std::`) then maybe we should add something like `builtin` which seems quite reusable.
> > > 
> > > I use "userProvided" here in the sense of "not built-in" or "overloaded". I do not cling to the term, and have also suggested the opposite default of using "builtin" in another comment.
> > > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > 
> > True, I have not considered this.
> > 
> > > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> > 
> > Intuitively, it seems we should be conservative and not claim the operator is overloaded unless we know it is. So "built-in" might then mean "we can't prove it's not a built-in". It's probably not worth to introduce a modifier CouldBeEitherWay just to explicitly express ambiguity ;)
> > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> 
> In my mind, "go-to-definition on this operator symbol will take me to a function declaration/definition" is a good match for "I want this colored differently". (Which would imply treating dependent operator calls where we can't figure out an overloaded operator target even heuristically, as "built-in".)
> Can you give some background here or on the bug tracker about what kind of distinction you're trying to draw here and why it's important?
> (Most clients are never going to benefit from nonstandard modifiers so they should be pretty compelling)

This was one of the biggest questions I had about this patch - just hoping it doesn't get missed.


================
Comment at: clang-tools-extra/clangd/SemanticHighlighting.h:76
   ConstructorOrDestructor,
+  UserProvided,
 
----------------
sammccall wrote:
> nridge wrote:
> > ckandeler wrote:
> > > ckandeler wrote:
> > > > sammccall wrote:
> > > > > sammccall wrote:
> > > > > > sammccall wrote:
> > > > > > > Can you give some background here or on the bug tracker about what kind of distinction you're trying to draw here and why it's important?
> > > > > > > (Most clients are never going to benefit from nonstandard modifiers so they should be pretty compelling)
> > > > > > as well as being jargony, "user-provided" has a specific technical meaning that I don't think you intend here. For example, `auto operator<=>(const S&) const = default` is not user-provided (defaulted on first declaration). https://eel.is/c++draft/dcl.fct.def.default#5
> > > > > > 
> > > > > > If we need this and can't get away with reusing `defaultLibrary` (which would include `std::`) then maybe we should add something like `builtin` which seems quite reusable.
> > > > > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > > > > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> > > > > as well as being jargony, "user-provided" has a specific technical meaning that I don't think you intend here. For example, `auto operator<=>(const S&) const = default` is not user-provided (defaulted on first declaration). https://eel.is/c++draft/dcl.fct.def.default#5
> > > > > 
> > > > > If we need this and can't get away with reusing `defaultLibrary` (which would include `std::`) then maybe we should add something like `builtin` which seems quite reusable.
> > > > 
> > > > I use "userProvided" here in the sense of "not built-in" or "overloaded". I do not cling to the term, and have also suggested the opposite default of using "builtin" in another comment.
> > > > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > > 
> > > True, I have not considered this.
> > > 
> > > > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> > > 
> > > Intuitively, it seems we should be conservative and not claim the operator is overloaded unless we know it is. So "built-in" might then mean "we can't prove it's not a built-in". It's probably not worth to introduce a modifier CouldBeEitherWay just to explicitly express ambiguity ;)
> > > Since we often can't say whether an operator is user-provided or not (in templates), we should consider what we want the highlighting to be in these cases.
> > > (If templates should be highlighted as built-in, we should prefer a modifier like `UserProvided`, if they should be highlighted as user-provided, we should prefer a modifier like `Builtin`)
> > 
> > In my mind, "go-to-definition on this operator symbol will take me to a function declaration/definition" is a good match for "I want this colored differently". (Which would imply treating dependent operator calls where we can't figure out an overloaded operator target even heuristically, as "built-in".)
> > Can you give some background here or on the bug tracker about what kind of distinction you're trying to draw here and why it's important?
> > (Most clients are never going to benefit from nonstandard modifiers so they should be pretty compelling)
> 
> This was one of the biggest questions I had about this patch - just hoping it doesn't get missed.
> Intuitively, it seems we should be conservative and not claim the operator is overloaded unless we know it is. 

This feels a bit circular, if we agree we're not going to introduce a `CouldBeEitherWay` then why is "built-in" a more conservative claim than "overloaded"?

I'm inclined towards `builtin` as a modifier because I think for language entities as a whole (types, functions etc, not just operators) it's the exception. It also seems easier to name and define.

> In my mind, "go-to-definition on this operator symbol will take me to a function declaration/definition" is a good match for "I want this colored differently".

This mostly makes sense to me, but:
 - I don't think we should actually run all the heuristics logic
 - if there's probably a definition available but we can't resolve it due to templates, I'd still like to know something's up

I think my internal question is more like "is this a trivial arithmetic shift, or something potentially complicated"? And I think depending on template resolution is "potentially complicated". (Maybe trivial in the end, but so might be an overloaded operator)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136594/new/

https://reviews.llvm.org/D136594



More information about the cfe-commits mailing list