[PATCH] D150635: [clangd] Implement end-definition-comment inlay hints

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 17 06:41:42 PDT 2023


sammccall added a comment.

In D150635#4348702 <https://reviews.llvm.org/D150635#4348702>, @daiyousei-qz wrote:

> Regarding to **Naming**, I agree that `EndDefinitionComment` sounds so tedious but I couldn't come up with a better name. However, I think `EndBlock` also feel a little strange to me as a public name. If I come across a setting key in `InlayHints` section named `EndBlock`, honestly I would be confused by what it would do ("end" could be a verb, and "block" usually make me to think of the local scope block). What about `AfterBlock` or `AfterDefinition`?

Agreed that "End" looks too much like a verb here. "After" isn't quite right, again because we're trying to describe the thing we're hinting, not where the hint is position.
I prefer "Block" over "Definition" exactly because this sort of hint should also be added to local scope blocks (e.g. `if` bodies) later.

Does `BlockEnd` address your concern? ("Block" can be a verb but I think there's no risk of confusion here)

> Regarding to **Configuration**, I suppose the minimum line number of definition is a very natural parameter for this feature. I doubt we could come up with some good heuristic to conditionally show this hint. Considering how straightforward this parameter its implementation is, I'd prefer to having this around if possible. But I'll let you to decide.

Right: It's tempting to make it an option, because we don't know what the value should be, and an option is easy to implement.

But these are not **good** reasons to make it an option:

- an option doesn't substantially change the importance of picking a good default heuristic, because 95+% of people will use it anyway. If we can't find a good default, we shouldn't implement the feature.
- the vast majority of the cost of an option is maintenance and user-facing complexity, rather than implementation



================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:278
+    if (Cfg.InlayHints.EndDefinitionComments &&
+        D->isThisDeclarationADefinition()) {
+      addEndDefinitionCommentHint(*D, "", false);
----------------
I think this is a wrong/incomplete check, `e.g. Foo(const Foo&) = default` is a definition but doesn't have a brace range.

Maybe it's less redundant just to try to get the brace range and then check if it's valid.


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:777
+  void addEndDefinitionCommentHint(const NamedDecl &D, StringRef DeclPrefix,
+                                   bool SkipSemicolon) {
+    size_t SkippedChars = 0;
----------------
SkipSemicolon doesn't need to be optional, a trailing semicolon never adds any ambiguity about what the hint is for


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:783
+    // Note this range doesn't include the trailing ';' in type definitions.
+    // So we have to add SkippedChars to the end character.
+    SourceRange R = D.getSourceRange();
----------------
This is too much arithmetic and fiddly coupling between this function and shouldHintEndDefinitionComment.

Among other things, it allows for confusion between unicode characters (UTF-32), clang bytes (UTF-8), and LSP characters (usually UTF-16). And we do have a bug here: shouldHintEndDefinition provides SkippedChars in clang bytes, but you add it to end.character below which is in LSP characters.

---

Let's redesign a little... 

We have a `}` on some line. We want to compute a sensible part of that line to attach to.
A suitable range may not exist, in which case we're going to omit the hint.

- The line consists of text we don't care about , the `}`, and then some mix of whitespace, "trivial" punctuation, and "nontrivial" chars.
- the range should always start at the `}`, since that's what we're really hinting
- to get the hint in the right place, the range should end after the trivial chars, but before any trailing whitespace
- if there are any nontrivial chars, there's no suitable range

so something like:

```
optional<Range> findBraceTargetRange(SourceLocation CloseBrace) {
  auto [File, Offset] = SM.getDecomposedLoc(SM.getFileLoc(CloseBrace));
  if (File != MainFileID) return std::nullopt;
  StringRef RestOfLine = MainFileBuf.substr(Offset).split('\n').first.rtrim();
  if (!RestOfLine.consume_front("{")) return;
  if (StringRef::npos != Punctuation.find_first_of(" ,;")) return std::nullopt;
  return {offsetToPosition(MainFileBuf, Offset), offsetToPosition(MainFileBuf, Result.bytes_end() - MainFileBuf.bytes_end()};
}
```

and then call from addEndDefinitionComment, the result is LSPRange already.


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:292
+
+  bool VisitRecordDecl(RecordDecl *D) {
+    if (Cfg.InlayHints.EndDefinitionComments &&
----------------
sammccall wrote:
> Not sure why if need to handle this + EnumDecl, can we just handle VisitTagDecl to cover both?
This is marked done but isn't done.
I'd suggest something like

```
VisitTagDecl(*D) {
  if (...) {
    string Kind = D->getKindName();
    if (auto *ED = dyn_cast<EnumDecl>(D); ED->isScoped())
     S += ED->isScopedUsingClassTag() ? " class" : " struct";
    addEndDefinitionCommentHint(*D, Kind);
  }
}
```


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:763
+    if (!LSPRange ||
+        static_cast<uint32_t>(LSPRange->end.line - LSPRange->start.line) + 1 <
+            Cfg.InlayHints.EndDefinitionCommentMinLines)
----------------
daiyousei-qz wrote:
> sammccall wrote:
> > I'm not sure we're using the right range here. Consider:
> > 
> > ```
> > const some::long<type, name>* function
> >    many params, therefore spanning, lines) {}
> > ```
> > 
> > I don't think hinting `}` here is useful: the relevant distance is from `{` rather than the start of the declaration.
> I agree. What do you think is a good way get the range of the definition block? What I can think of is to walking backwards the expanded token buffer from the ending '}' and searching for the balanced '}'.
> What do you think is a good way get the range of the definition block?

For class etc: TagDecl::getBraceRange()
For function: getBody()->get{L,R}BracLoc().
For namespace, I don't see a way in the AST. I think it's fine to use the full range, as the `namespace` and `{` will almost always be on the same line.

These ranges can be extracted from the specific types in the Visit* methods and passed into addEndDefinitionCommentHint



================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:771
+    if (const auto *FD = dyn_cast_or_null<FunctionDecl>(&D)) {
+      Label = printName(AST, D);
+    } else {
----------------
daiyousei-qz wrote:
> sammccall wrote:
> > why is printName OK here and not below?
> I'm using printName here to print names for operator overload. Do you have any advice?
only to be consistent: either print in the same way here vs below, or document why you're doing different things.

(From what's written in the code, it's not clear what the downside is to printName such that we don't use it everywhere)


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:771
+    if (const auto *FD = dyn_cast_or_null<FunctionDecl>(&D)) {
+      Label = printName(AST, D);
+    } else {
----------------
sammccall wrote:
> daiyousei-qz wrote:
> > sammccall wrote:
> > > why is printName OK here and not below?
> > I'm using printName here to print names for operator overload. Do you have any advice?
> only to be consistent: either print in the same way here vs below, or document why you're doing different things.
> 
> (From what's written in the code, it's not clear what the downside is to printName such that we don't use it everywhere)
I would print the DeclarationName only, probably
and try to avoid doing different things for different types of decl


================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:772
+      Label = printName(AST, D);
+    } else {
+      // We handle type and namespace decls together.
----------------
sammccall wrote:
> zyounan wrote:
> > Question: Should we restrict the type here? What will happen if `D` was something unexpected like a `TypedefDecl`? Would it fall into the logic that makes `Label` result to `<anonymous>`?
> Yeah, this case-enumeration feels a bit fuzzy to me, like we had more structured information earlier and are now trying to recover it.
> 
> I'd like the signature of this function to be a bit more concrete, e.g. taking parameters:
> 
>  - `SourceRange Braces`
>  - `StringRef Kind` (e.g. "class")
>  - `DeclarationName Name` (we can overload/change the signature of getSimpleName)
> 
> This is also more similar to the other functions in this family.
> 
this is not quite done: we're still passing in the NamedDecl which offers the opportunity to go digging back into the class hierarchy. I think it's better to pass the name and source range in directly to make it clear we're not doing that.


================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:1612
+                           ExpectedHint{" // method3", "method3"},
+                           ExpectedHint{" // Test::method2", "method2"},
+                           ExpectedHint{" // Test::method4", "method4"});
----------------
I think we *probably* just want method2 here: showing qualifiers seems inconsistent with how we otherwise prefer brevity over disambiguation (e.g. not showing params etc)


================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:1566
+
+TEST(EndDefinitionHints, Functions) {
+  assertEndDefinitionHints(R"cpp(
----------------
daiyousei-qz wrote:
> sammccall wrote:
> > can you add testcases where:
> >  - the function name is an operator
> >  - the function is templated
> Added template function. I don't quite understand what you mean by "the function name is an operator". I'm assuming you meant operator overload so closing this one as covered below.
oops, indeed it is - can you move one of the operator tests into "methods" and add a non-method one to "functions"?
(Operators aren't really different enough to be separated out, I think)


================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:1576
+  )cpp",
+                           0, ExpectedHint{" /* foo */ ", "foo"},
+                           ExpectedHint{" /* bar */ ", "bar"});
----------------
daiyousei-qz wrote:
> sammccall wrote:
> > we're not adding the kind here as in "function foo".
> > 
> > Maybe that's OK, "function" is long, doesn't appear in the C++ syntax, and is mostly obvious.
> > But in that case I'd like to have some hint, maybe `/* foo() */`? (Not the actual param list of course, the parens would always be empty)
> > 
> > If you don't think we need any hint, then I'd question whether we need "class" etc for class hints!
> I think just `// foo` is good enough. Having no "qualifier" could be a privilege for functions since they are usually the majority of the source code. When reading through the source code, people will get that if no prefix is attached, then it's a function. (OFC they'll also see `// namespace` and .etc, but rare)
> 
> Honestly, IMO `// foo()` is worse than just the name. It could mislead me to think that this may be a complete signature while it is definitely not.
I'm skeptical, but let's give this a try, we can always add it later.


================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:1580
+
+TEST(EndDefinitionHints, Methods) {
+  assertEndDefinitionHints(R"cpp(
----------------
daiyousei-qz wrote:
> sammccall wrote:
> > can you add a test with a lambda?
> > I think the hint should probably just be `lambda` or `(lambda)`, this probably needs special handling.
> > 
> > It definitely shouldn't be `operator()` or `(lambda at somefile.cc:47)`.
> I would argue we don't need hint for a lambda. This hint is only added to long definitions that may introduce a symbol. It is helpful to highlight the boundaries of different declarations. Because of the implementation assumes the hint is added to a `NamedDecl`, lambda should be safe.
Up to you, I think a hint would be helpful but no hint is also fine.

In any case, please add a test showing the behavior.

> This hint is only added to long definitions that may introduce a symbol

FWIW I don't think this is the greatest value from the hint - rather that *when the scope changes* we get some extra context that the local syntax doesn't provide. (This is why I think lambdas, for loops etc should eventually be hinted)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150635/new/

https://reviews.llvm.org/D150635



More information about the cfe-commits mailing list