[PATCH] D56611: [clangd] A code action to swap branches of an if statement

Tue Jan 22 07:13:39 PST 2019

ilya-biryukov marked an inline comment as done.
ilya-biryukov added inline comments.

================
Comment at: clangd/refactor/tweaks/SwapIfBranches.cpp:34
+/// After:
+///   if (foo) { continue; } else { return 10; }
+class SwapIfBranches : public Tweak {
----------------
sammccall wrote:
> The before/after is useful, we should probably have it for all tweaks if possible.
> It'd also be useful to notate where the cursor can be to trigger the action. Partly because it forces us to consider this!
> 
> e.g. (not sure if this actually matches the logic you want, just an example)
> ```
> Before:
>   if (foo) { return 10; } else { continue; }
>   ^^^^^^^^^^            ^^^^^^^^           ^
> After:
>   ...
> ```
LG, updated the comment and ranges per suggestion.
I still not sure if the action should span the whole condition, but going with the suggested ranges anyway. We can tweak them later if the resulting ranges turn out to be problematic.

However, note that I excluded `{` and `}` from available ranges, but only because I'm lazy and this requires declaring some more local vars, etc.

================
Comment at: clangd/refactor/tweaks/SwapIfBranches.cpp:46
+private:
+  tooling::Replacements Result;
+};
----------------
sammccall wrote:
> I think prepare() should just verify:
>  - cursor is in the right place
>  - else statement exists and isn't an else if
>  - both branches are compound statements (for now)
>  - (maybe) relevant locations (if keyword, else keyword, braces) are not macro expansions
> and then record the relevant source locations (or just the IfStmt*)
> 
> We may be able to get away with doing all the work in `prepare()`, but it's not a good model for future tweaks. (And it is at least somewhat wasteful on a hot path).
Done.
The only reason I thought applying replacements was a sensible idea is because it would fail in some weird macro cases (only when resulting ranges overlap, though). And it seemed better to avoid showing the action rather than showing it and failing later.

================
Comment at: clangd/refactor/tweaks/SwapIfBranches.cpp:50
+
+class FindIfUnderCursor : public RecursiveASTVisitor<FindIfUnderCursor> {
+public:
----------------
sammccall wrote:
> (Mostly this is food for thought for later - we shouldn't try to solve everything in this patch)
> 
> Two efficiency problems here:
>  - doing one traversal per tweak is wasteful (when we have >1 tweaks, but worth at least *thinking* about)
>  - traversing the entire AST rather than just the nodes over the cursor is bad news (though "over the cursor" may not be trivial to define)
> 
> Idea for how to "framework" this problem away:
> Add `vector<DynTypedNode> SelectedNodes` to the inputs, the idea being that this is the stack from the narrowest `Expr` under the cursor to the `TranslationUnitDecl` at the top.
> This can be produced by a RecursiveASTVisitor (that either traverses the whole AST, or just the bits under the cursor if that's simple enough).
> Tweaks would iterate over the nodes from narrowest to broadest, deciding whether to select this node for processing, continue iterating, or bail out.
> 
> Matching in checks are then pretty easy to write, we haven't removed too much flexibility in flow control, and it's pretty hard to write a slow check.
> 
> There are some complications:
>  - we need access to parents somehow (e.g. consider the case of adding NS qualifiers, we want to match on names but then traverse up to the containing declrefexpr to get the nested namespace loc)
>  - the interesting nodes may be a tree rather than a stack, because nodes overlap in the source. We could store a postorder traversal... but this makes e.g. "bail out when you hit a functiondecl" harder - you probably want to bail out of the current branch only.
>  - things get complicated when we think about macros - depending on the semantics we want, it may be hard for the framework to prune parts of the tree that aren't under the cursor withouth traversing the whole AST.
I mostly agree, however I still wonder whether that would be inefficient in some cases. E.g. consider a corner if the input selection from LSP contains the whole file? I see two options there: (1) putting all nodes of a file into `vector<DynTypedNode>`, (2) putting only the top-level `TranslationUnitDecl` there.

It seems easy to end up with (1) to make the interface useful, however we probably prefer (2) because otherwise we're back into the worst-case scenario, i.e. "every  tweak traverses all the nodes"

> we need access to parents somehow (e.g. consider the case of adding NS qualifiers, we want to match on names but then traverse up to the containing declrefexpr to get the nested namespace loc)

I wonder if it's possible to instead write the checks so that they only look at the children of the nodes in a vector? My bet is that almost all of the checks only need to look one or two levels down, so this shouldn't turn into inefficiency.

I initially thought about providing a very limited set of AST nodes as inputs for the actions that are computed efficiently. Specifically, I thought about having:

```
struct Selection {
  Decl *DeclUnderCursor; // the innermost declaration under CursorLoc.
  Stmt *StmtUnderCursor; // the innermost statement under CursorLoc.
  Expr *ExprUnderCursor; // the innermost expression under CursorLoc.

  // For some tweaks (extract var, extract method) we will also need these:
  Expr *SelectedExpression; // expression exactly matching selection.
  vector<Stmt*> SelectedStatements; // a list of statements exactly matching 
selection.
};
```

The checks would only look at those few statements, e.g. "Swap If" would try `dyn_cast_or_null<Stmt*>(S.StmtUnderCursor)` and do the same checks for cursor locations.

We could expand the set of inputs as needed. This would mean that we don't have to design the framework, but that also mean writing some checks would involve some work in the shared code that computes these inputs.

I bet that complexity-wise we would end up with simpler checks and more complicated shared code to compute the inputs. OTOH, we won't need to struggle with the limitations of the AST to design the framework that would properly handle all interesting cases without doing extra work.

================
Comment at: clangd/refactor/tweaks/SwapIfBranches.cpp:84
+  // to avoid changing semantics of the code (i.e. dangling else).
+  if (!llvm::dyn_cast_or_null<CompoundStmt>(If->getThen()) ||
+      !llvm::dyn_cast_or_null<CompoundStmt>(If->getElse()))
----------------
sammccall wrote:
> isa<CompoundStmt>
Done.
I used `dyn_cast_or_null`, because `isa` expectes the arguments to be non-null, therefore we need an extra check before calling it. So the former looked better to me, but I'm actually ok either way.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56611/new/

https://reviews.llvm.org/D56611