[PATCH] D76094: [clangd] Change line break behaviour for hoverinfo

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 23 09:15:13 PDT 2020


sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Thanks for this! Mostly just some cleanups/comment nits left.

The substantive change is around the `\` and `  ` at end-of-line. It looks like the goal is on improving the parsing of markdown-formatted comments, but we need to focus as well on non-markdown-formatted comments as these are the majority case. (Most of your patch improves these a lot!)

If you don't have commit access yet, I'm happy to land this for you, just let me know when ready.
(A couple of patches like this are good grounds to apply for commit access, you can just link to the review threads)



================
Comment at: clang-tools-extra/clangd/Hover.cpp:530
+
+  return NextNonSpaceCharIndex != llvm::StringRef::npos &&
+         Str[NextNonSpaceCharIndex] == '\n';
----------------
We can make use of some StringRef helpers to make this more readable.
I think this is equivalent (to the whole function body):

```
return Str.substr(LineBreakIndex + 1).drop_while(isWhitespace).startswith("\n");
```


================
Comment at: clang-tools-extra/clangd/Hover.cpp:534
+
+bool isMardkownLineBreak(llvm::StringRef Str, size_t LineBreakIndex) {
+  return LineBreakIndex > 1 &&
----------------
I'm not altogether happy with these rules: many comments are not markdown, and these don't seem very safe for non-markdown comments.

Comments ending in a trailing slash may be explicitly indicating a **soft** line break e.g.

```
// Example usage:
// my_prog --some_flag = foo \
//    --remote_database=/path/to/whatever \
//    input.txt
```

and trailing whitespace is often just sloppiness.

I did a search over a large corpus of open-source C++.
 - for `//.*\ \ $`: Of a random 10 results, none appear to be markdown. 
 - for `//.*\\$`: Of a random 10 results, none were markdown. (4 were explicit soft breaks, 4 were ascii art, 2 were soft breaks in commented-out macro definitions)


================
Comment at: clang-tools-extra/clangd/Hover.cpp:535
+bool isMardkownLineBreak(llvm::StringRef Str, size_t LineBreakIndex) {
+  return LineBreakIndex > 1 &&
+         (Str[LineBreakIndex - 1] == '\\' ||
----------------
What about:
```
StringRef Prev = Str.substr(0, LineBreakIndex);
return Prev.endswith("\\") || Prev.endswith("  ")
```

(this seems more readable, and fixes the bug if LineBreakIndex == 1 and the string is {backslash, newline, ...}.)


================
Comment at: clang-tools-extra/clangd/Hover.cpp:544
+  return LineBreakIndex > 0 &&
+         Punctuation.find(Str[LineBreakIndex - 1]) != llvm::StringRef::npos;
+};
----------------
nit: Punctuation.contains(...)


================
Comment at: clang-tools-extra/clangd/Hover.cpp:733
 
+// Tries to retain hard line breaks and drops soft line breaks.
+void parseLineBreaks(llvm::StringRef Input, markup::Document &Output) {
----------------
nit: please put the doc on the declaration rather than the definition.

(I think the previous name was more appropriate to the signature, but up to you)


================
Comment at: clang-tools-extra/clangd/Hover.cpp:750
+
+      if (isParagraphLineBreak(TrimmedInput, CharIndex)) {
+        // TODO this should add an actual paragraph (double linebreak)
----------------
rather than repeating the code twice, please just use if isParagraph() || isHard(), with the comment that we may treat paragraph differently


================
Comment at: clang-tools-extra/clangd/Hover.cpp:751
+      if (isParagraphLineBreak(TrimmedInput, CharIndex)) {
+        // TODO this should add an actual paragraph (double linebreak)
+        Output.addParagraph().appendText(CurrentLine);
----------------
nit: llvm tends to use "// FIXME: "


================
Comment at: clang-tools-extra/clangd/Hover.cpp:751
+      if (isParagraphLineBreak(TrimmedInput, CharIndex)) {
+        // TODO this should add an actual paragraph (double linebreak)
+        Output.addParagraph().appendText(CurrentLine);
----------------
sammccall wrote:
> nit: llvm tends to use "// FIXME: "
please soften this to "maybe".
(Because we need to consider the padding/css issues of doing so, whether a hard break vs double break reflects different intents or just different conventions in different codebases, and whether we have a reasonable way to implement this - paragraphs have leading as well as trailing padding, so it gets complicated fast).

and rephrase to "should distinguish between line breaks and paragraphs".
("Double linebreak" is confusing as it doesn't mean anything at the markup::Document level, and markdown paragraph isn't the same concept as a plaintext blank line, even if they are both encoded as "\n\n")


================
Comment at: clang-tools-extra/clangd/unittests/HoverTests.cpp:1891
+    llvm::StringRef ExpectedRenderPlainText;
+  } Cases[] = {{
+                   "foo\n\n\nbar",
----------------
please add tests starting/ending with newline as these are special cases in the code


================
Comment at: clang-tools-extra/clangd/unittests/HoverTests.cpp:1943
+                   "foo\n*bar",
+                   // TODO `*` should probably not be escaped after line break
+                   "foo  \n\\*bar",
----------------
please drop these comments, as they don't seem to be true (unless assuming the content is markdown, which it usually isn't)

I think we can drop most of the tests too (for specific characters after newline), as we already have escaping tests and these don't actually test different linebreaking behaviour.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76094/new/

https://reviews.llvm.org/D76094





More information about the cfe-commits mailing list