[PATCH] D88553: [clangd] Start using SyntaxTrees for folding ranges feature

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 20 05:48:41 PDT 2020


sammccall added inline comments.


================
Comment at: clang-tools-extra/clangd/SemanticSelection.cpp:39
   FoldingRange Range;
-  Range.startLine = Symbol.range.start.line;
-  Range.startCharacter = Symbol.range.start.character;
-  Range.endLine = Symbol.range.end.line;
-  Range.endCharacter = Symbol.range.end.character;
-  Result.push_back(Range);
-  for (const auto &Child : Symbol.children)
-    collectFoldingRanges(Child, Result);
+  Range.startCharacter = SM.getSpellingColumnNumber(SR.getBegin()) - 1;
+  Range.startLine = SM.getSpellingLineNumber(SR.getBegin()) - 1;
----------------
Have you considered how you want macro-expanded code to work?

As written, this code can produce ranges inside macro definitions, invalid ranges (if { and } aren't from the same macro expansion) or even crash (FirstToken->endLocation() may be an invalid one-past-the-end location).

My suggestion would be to have this return optional<FoldingRange> and simply bail out if the location are not file locations for now. (Later we can deal with macro args in some cases)


================
Comment at: clang-tools-extra/clangd/SemanticSelection.cpp:52-55
+    const syntax::Token *FirstToken = Tree->findFirstLeaf()->getToken(),
+                        *LastToken = Tree->findLastLeaf()->getToken();
+    assert(FirstToken->kind() == tok::TokenKind::l_brace);
+    assert(LastToken->kind() == tok::TokenKind::r_brace);
----------------
eduucaldas wrote:
> Take a look at `clang/include/clang/Tooling/Syntax/Nodes.h`, syntax constructs usually have nice classes with accessors.
> 
> For instance `CompoundStatement` has the accessors `getLbrace` and `getRbrace` that seem to be exactly what you want.
> 
> However these might not give exactly the first leaf and last leaf in the case of syntactically incorrect code.
I think we should treat all bracket-like things generically. Today this is just CompoundStmt, but we want to handle init lists, function calls, parens around for-loop conditions, template parameter and arg lists etc in the same way.

This sort of use is why the `OpenParen`/`CloseParen` NodeRoles are generic - we can have one set of logic to handle all of these. (No opinion on whether that should live here or in the syntax trees library, but putting it here for now seems fine).

So in the end I think checking the class name and then grabbing the braces by role (not kind) is the right thing here.
We definitely want to avoid asserting that the code looks the way we expect though.


================
Comment at: clang-tools-extra/clangd/SemanticSelection.cpp:56
+    assert(LastToken->kind() == tok::TokenKind::r_brace);
+    const SourceRange SR(FirstToken->endLocation(), LastToken->location());
+    FoldingRange Range = constructFoldingRange(SR, SM);
----------------
this is worthy of a comment (fold the entire range inside the brackets, including whitespace)


================
Comment at: clang-tools-extra/clangd/SemanticSelection.cpp:60
+    // nodes and newlines.
+    if (Tree->findFirstLeaf()->getNextSibling() != Tree->findLastLeaf() ||
+        Range.startLine != Range.endLine)
----------------
until we support FoldingRangeClientCapabilities, should we just have the line check?


================
Comment at: clang-tools-extra/clangd/unittests/SemanticSelectionTests.cpp:206
       R"cpp(
-        [[int global_variable]];
+        void func() {[[
+          int Variable = 100;
----------------
This is too many test cases for straightforward compoundstmt, I think. We have only once type of node handled for now, but we can't afford to write and maintain this many tests once we have lots.

I think a single if{}-elseif(nobraces)-else{} along with the existing function examples is probably enough. (Function examples show missing range for empty body, though it could use a comment)

We definitely need cases for macros:
 - entire {} within one macro arg
 - entire {} in the macro body
 - some combination (e.g. { in macro body, } outside the macro)
(we won't need these for every node type, assuming we have some consistent handling)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88553/new/

https://reviews.llvm.org/D88553



More information about the cfe-commits mailing list