[PATCH] D150635: [clangd] Implement end-definition-comment inlay hints

Wed May 24 20:17:52 PDT 2023

daiyousei-qz added a comment.

Addressed review comments except for those we don't have an agreement yet.

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:578
+    constexpr unsigned HintMinLineLimit = 2;
+    constexpr unsigned HintMaxLengthLimit = 50;
+
----------------
sammccall wrote:
> We actually already have a configurable inlay hint length limit.
> It has the wrong name (`TypeNameLimit`) , but I think we should use it anyway (and maybe try to fix the name later).
> 
> (I'm not sure making this configurable was a great idea, but given that it exists, people will expect it to effectively limit the length of hints)
`TypeNameLimit` was introduced to limit the length of "DeducedTypes" only. I don't think using that number for all hints is a good idea because different hints appear in different contexts. For deduced type hint, it is displayed in the middle of the actual code, so it must be concise to not overwhelm the actual code. However, this hint is usually displayed at the end of an almost empty line. A much larger number of characters should be allowed.

(I'm not arguing here, but IMO such options are definitely helpful. Not everybody would fork llvm-project and build their own version of clangd binary. Without options to configure length of hints, people would disable "DeducedTypes" entirely if they cannot tolerate the default length limit, while this feature is definitely a cool thing to turn on. Personally, I actually think clangd has too little option compared to say rust-analyzer. But it's just my understanding)

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:597
+    auto BlockBeginLine =
+        SM.getSpellingLineNumber(BraceRange.getBegin(), &Invalid);
+    if (Invalid)
----------------
sammccall wrote:
> this should be getFileLoc(BraceRange.getBegin()), I think?
Thanks for pointing out! As per comments below, calling `getLineNumber` instead of `getSpellingLineNumber` in the latest version.

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:783
+    // Note this range doesn't include the trailing ';' in type definitions.
+    // So we have to add SkippedChars to the end character.
+    SourceRange R = D.getSourceRange();
----------------
sammccall wrote:
> daiyousei-qz wrote:
> > sammccall wrote:
> > > This is too much arithmetic and fiddly coupling between this function and shouldHintEndDefinitionComment.
> > > 
> > > Among other things, it allows for confusion between unicode characters (UTF-32), clang bytes (UTF-8), and LSP characters (usually UTF-16). And we do have a bug here: shouldHintEndDefinition provides SkippedChars in clang bytes, but you add it to end.character below which is in LSP characters.
> > > 
> > > ---
> > > 
> > > Let's redesign a little... 
> > > 
> > > We have a `}` on some line. We want to compute a sensible part of that line to attach to.
> > > A suitable range may not exist, in which case we're going to omit the hint.
> > > 
> > > - The line consists of text we don't care about , the `}`, and then some mix of whitespace, "trivial" punctuation, and "nontrivial" chars.
> > > - the range should always start at the `}`, since that's what we're really hinting
> > > - to get the hint in the right place, the range should end after the trivial chars, but before any trailing whitespace
> > > - if there are any nontrivial chars, there's no suitable range
> > > 
> > > so something like:
> > > 
> > > ```
> > > optional<Range> findBraceTargetRange(SourceLocation CloseBrace) {
> > >   auto [File, Offset] = SM.getDecomposedLoc(SM.getFileLoc(CloseBrace));
> > >   if (File != MainFileID) return std::nullopt;
> > >   StringRef RestOfLine = MainFileBuf.substr(Offset).split('\n').first.rtrim();
> > >   if (!RestOfLine.consume_front("{")) return;
> > >   if (StringRef::npos != Punctuation.find_first_of(" ,;")) return std::nullopt;
> > >   return {offsetToPosition(MainFileBuf, Offset), offsetToPosition(MainFileBuf, Result.bytes_end() - MainFileBuf.bytes_end()};
> > > }
> > > ```
> > > 
> > > and then call from addEndDefinitionComment, the result is LSPRange already.
> > Done, also moved min-line and max-length logic to this function. Btw, I think we should avoid `offsetToPosition` as much as possible. It is a large overhead considering inlay hints are recomputed throughout the entire file for each edit. I frequently work with source code that's nearly 1MB large (Yes, I don't think we should have source file this large, but it is what it is).
> > also moved min-line and max-length logic to this function
> 
> I'd prefer you to move them back. As specified, that function is otherwise strictly about examining the textual source code around the closing brace. Now it's muddled: it also looks at the opening brace, and does lookups into line tables.
> 
> You seem to be aiming to optimize an operation that you think is expensive, but I don't see any evidence (measurements) that this implementation is actually faster or that the difference matters.
> 
> > Btw, I think we should avoid offsetToPosition as much as possible
> 
> I understand the concern: computing inlay hints is quadratic. However things are *universally* done this way in clangd, and diverging here won't significantly improve performance but makes the code much harder to understand in context.
> 
> In practice we haven't found places where overheads that are length(file)*length(LSP response) are significant compared to parse times. If you have such examples, it would be great to gather & share profiles and propose a solution.
> I suspect we could maintain some line lookup data structure that gets updated when source code updates come in over LSP, or something. But micro-optimizing individual codepaths isn't going to get us there: if two offsetToPosition() calls are bad, then one is still bad.
> I'd prefer you to move them back. As specified, that function is otherwise strictly about examining the textual source code around the closing brace. Now it's muddled: it also looks at the opening brace, and does lookups into line tables.

Moved max-length logic back, but kept the block line number check there because the line number of RBrace is readily available in this function. I don't see why we need to duplicate code to obtain that line number.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150635/new/

https://reviews.llvm.org/D150635