[PATCH] D65526: [Clangd] Initial prototype version of ExtractFunction

Tue Aug 20 04:11:35 PDT 2019

sammccall added a comment.

Main themes are:

- there are a few places we can reduce scope for the first patch: e.g. template support
- internally, it'd be clearer to pass data around in structs and avoid complex classes unless the abstraction is important
- high-level documentation would aid in understanding: for each concept/function: what is the goal, how does it fit into higher-level goals, are there any non-obvious reasons it's done this way
- naming questions (this can solve some of the same problems as docs, but much more cheaply)

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:7
+//
+//===----------------------------------------------------------------------===//
+#include "ClangdUnit.h"
----------------
this file will need an overview describing
 - the refactoring, and (briefly) major user-visible design decisions (e.g. params, return types, hoisting)
 - structure/approach/major parts

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:37
+
+// Source is the part of code that is being extracted.
+// EnclosingFunction is the function/method inside which the source lies.
----------------
this name is *also* pretty confusing because of `SourceLocation`, `SourceRange` etc classes.

Best option might just be to make up something distinctive, like `Zone` :-/

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:39
+// EnclosingFunction is the function/method inside which the source lies.
+// PreSource is everything before source and part of EnclosingFunction.
+// PostSource is everything after source and part of EnclosingFunction.
----------------
nit: if you're going to document the values one-by-one, do it in a comment on each value instead

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:42
+// We split the file into 4 parts w.r.t. the Source.
+enum LocType { PRESOURCE, INSOURCE, POSTSOURCE, OUTSIDEFUNC };
+
----------------
nit: we don't use MACROCASE for enums in LLVM (I think). This of course makes namespacing more important, we should consider `enum class` here.

I'd probably suggest
```
enum class ZoneRelative { // or some name consistent with above
  Before,
  After,
  Inside,
  External
};

```

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:45
+// ExtractionSourceView forms a view of the code wrt to Source.
+class ExtractionSourceView {
+public:
----------------
This looks like basically a struct (mostly public data members) where all the logic is in the constructor.
This would be more obvious/more consistent with style elsewhere if it were literally a struct and the constructor was instead a function (which could return `Expected<ExtractionSourceView>` instead of setting flags on the result)

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:45
+// ExtractionSourceView forms a view of the code wrt to Source.
+class ExtractionSourceView {
+public:
----------------
sammccall wrote:
> This looks like basically a struct (mostly public data members) where all the logic is in the constructor.
> This would be more obvious/more consistent with style elsewhere if it were literally a struct and the constructor was instead a function (which could return `Expected<ExtractionSourceView>` instead of setting flags on the result)
nit: not sure `view` adds anything here. `ExtractionSource` or `ExtractionZone`?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:50
+  // Get the location type for a given location.
+  LocType getLocType(SourceLocation Loc) const;
+  // Parent of RootStatements being extracted.
----------------
classify?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:59
+  SourceRange EnclosingFunctionRng;
+  const Node *LastRootStmt = nullptr;
+  bool isEligibleForExtraction() const { return IsEligibleForExtraction; }
----------------
when is this not Parent->Children.back()?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:93
+  for (const Node *CurNode = CommonAnc; CurNode; CurNode = CurNode->Parent) {
+    if (CurNode->ASTNode.get<FunctionDecl>()) {
+      // FIXME: Support extraction from methods.
----------------
I think you probably want to avoid extracting from lambdas and templates for now, too.

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:104
+SourceRange
+ExtractionSourceView::computeEnclosingFunctionRng(const Node *EnclosingFunction,
+                                                  const SourceManager &SM,
----------------
nit: Please spell 'range' rather than 'rng' in function names, as it's not a standard abbreviation in clang[d]

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:107
+                                                  const LangOptions &LangOpts) {
+  const Node *N = EnclosingFunction->Parent->ASTNode.get<FunctionTemplateDecl>()
+                      ? EnclosingFunction->Parent
----------------
can we avoid adding template support in the first patch? It's complicated, and I don't think the code below handles it robustly.

For instance, this case will be extracted (despite being dependent), because the reference to T is not seen by the ast visitor, and the type of the decls that are seen are not dependent:
```
template <typename T>
int zero() { return std::vector<T>().size(); }
```

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:113
+
+const Node *ExtractionSourceView::getParentOfRootStmts(const Node *CommonAnc) {
+  if (!CommonAnc)
----------------
It's not clear what this function does and why. Can you add a high-level comment?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:124
+      return nullptr;
+    [[clang::fallthrough]];
+  case SelectionTree::Selection::Complete:
----------------
LLVM_FALLTHROUGH

(we support other host compilers: msvc, gcc)

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:144
+    return;
+  // FIXME: check if EnclosingFunction has any attributes
+  EnclosingFunction = computeEnclosingFunction(Parent);
----------------
what needs to be fixed, and why is it important?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:175
+
+bool ExtractionSourceView::hasOnlyRootStmtChildren() {
+  for (const Node *Child : Parent->Children) {
----------------
what is this function doing, and why?
why is it separated from getParentOfRootStmts?

it seems like we're trying to determine that the selection consists of either a single completely-selected stmt, or a bunch of completely-selected statements with a common parent, and then we represent them as "all the children of a certain parent node".

Splitting the logic into two distant functions, and not documenting this intent, makes it hard to follow.

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:206
+// rendering it.
+class NewFunction {
+private:
----------------
as above, this seems like it would be clearer as a struct.

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:214
+    bool IsConst;
+    Parameter(std::string Name, std::string Type, bool IsConst)
+        : Name(Name), Type(Type), IsConst(IsConst) {}
----------------
I'd suggest capturing the type as a QualType instead of a string + const + ref flag

When types may need to be re-spelled, we'll need that extra information - the context needed to re-spell is available at render() time, not addParameter() time.

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:225
+  std::string ReturnType = "void";
+  std::set<Parameter> Parameters;
+  std::vector<const Decl *> DeclsToHoist;
----------------
We tend to avoid std::set except in the rare cases where we need both sorting and on-the-fly deduplication.

Here, it seems std::vector would be enough, and explicitly sorting an the end would make the intent clearer.

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:226
+  std::set<Parameter> Parameters;
+  std::vector<const Decl *> DeclsToHoist;
+  SourceRange BodyRange;
----------------
Hoisting isn't supported, drop this?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:235
+        SemicolonPolicy(SemicolonPolicy){};
+  std::string render(bool IsDefinition) const;
+  std::string getFuncBody() const;
----------------
as previously discussed, there's not enough (conceptual or implementation) commonality between the call and the declaration, can we have separate renderDefinition() and renderCall() functions?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:262
+std::string NewFunction::getFuncBody() const {
+  // TODO: Hoist Decls
+  // TODO: Add return statement.
----------------
these TODOs don't really describe the structural change that's needed here: we need a way to apply a bunch of edits to the function text (in terms of the original coordinates).

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:400
+// to be hoisted. Returns true if able to find the parameters successfully.
+bool createParametersAndGetDeclsToHoist(
+    NewFunction &ExtractedFunc, const CapturedSourceInfo &CapturedInfo) {
----------------
I'm confused about the decls to host part - this isn't actually supported right? Take it out of the name?

Or are they detected in order to cause the refactoring to fail?

================
Comment at: clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp:454
+
+// Get semicolon extraction policy (may change SrcRange in SrcView)
+tooling::ExtractionSemicolonPolicy
----------------
you've got a return type, a comment, and a function name, and they all say the same thing. Can you explain (at a high level) what this is for?

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:603
+TWEAK_TEST(ExtractFunction);
+TEST_F(ExtractFunctionTest, PrepareFunctionTest) {
+  Header = R"cpp(
----------------
the test helpers are really aimed at testing prepare/apply together rather than separately.

These EXPECT_AVAILABLE could be EXPECT_EQ(apply(), ...). If prepare() returns false it'll return "prepare failed" or some string like that.

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:618
+  // TODO: Add tests for macros after selectionTree works properly for macros.
+  // EXPECT_AVAILABLE(R"cpp( F (int x = 0; [[x = 1;]])cpp");
+  // FIXME: This should be unavailable since partially selected but
----------------
rather than checking in commented-out tests, I'd prefer to check in the tests verifying current incorrect behavior with a FIXME.
Commented out code tends to rot (or often not quite work in the first place), and it's not discoverable

(or just leave the FIXME by itself)

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:639
+      void f() {
+        [[int x;]]
+      }
----------------
maybe add a comment saying why not?

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:649
+  // Checks const qualifier and extraction parameters.
+  Header = R"cpp(
+      )cpp";
----------------
this line does nothing?

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:651
+      )cpp";
+  EXPECT_EQ(apply(
+                R"cpp(
----------------
IIRC you can't use raw strings inside a macro (it's legal, but one of the supported compilers chokes on it)

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:654
+struct FOO {
+  int x;
+};
----------------
please avoid large tests that test lots of things, test them one-at-a-time instead. The intent is clearer, and it's clearer what failures mean.

================
Comment at: clang-tools-extra/clangd/unittests/TweakTests.cpp:683
+})cpp");
+  auto ShouldSucceed = [&](llvm::StringRef Code) {
+    EXPECT_THAT(apply(Code), HasSubstr("extracted"));
----------------
please write these out explicitly instead.
Among other things, the goal in the helpers was that assertion failures should preserve line numbers of examples

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65526/new/

https://reviews.llvm.org/D65526