[PATCH] D59376: [LibTooling] Add Transformer, a library for source-to-source transformations.

Wed Mar 27 12:19:52 PDT 2019

ilya-biryukov marked an inline comment as done.
ilya-biryukov added a comment.

Mostly nits and small API questions form my side.
This looks very good overall! I'm planning to take a closer look at the handling of macros and various AST nodes in more detail this week, but the approach looks solid from a higher level.

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:135
+// \endcode
+class RewriteRule {
+public:
----------------
ymandel wrote:
> ilya-biryukov wrote:
> > ymandel wrote:
> > > ilya-biryukov wrote:
> > > > ymandel wrote:
> > > > > ilya-biryukov wrote:
> > > > > > Maybe consider separating the fluent API to build the rewrite rule from the rewrite rule itself?
> > > > > > 
> > > > > > Not opposed to having the fluent builder API, this does look nice and seems to nicely align with the matcher APIs.
> > > > > > However, it is somewhat hard to figure out what can `RewriteRule` do **after** it was built when looking at the code right now.
> > > > > > ```
> > > > > > class RewriteRule {
> > > > > > public:
> > > > > >   RewriteRule(DynTypedMatcher, TextGenerator Replacement, TextGenerator Explanation);
> > > > > > 
> > > > > >   DynTypedMatcher matcher() const;
> > > > > >   Expected<string> replacement() const;
> > > > > >   Expected<string> explanation() const;
> > > > > > };
> > > > > > 
> > > > > > struct RewriteRuleBuilder { // Having a better name than 'Builder' would be nice.
> > > > > >   RewriteRule finish() &&; // produce the final RewriteRule.
> > > > > > 
> > > > > >   template <typename T>
> > > > > >   RewriteRuleBuilder &change(const TypedNodeId<T> &Target,
> > > > > >                       NodePart Part = NodePart::Node) &;
> > > > > >   RewriteRuleBuilder &replaceWith(TextGenerator Replacement) &;
> > > > > >   RewriteRuleBuilder &because(TextGenerator Explanation) &;
> > > > > > private:
> > > > > >   RewriteRule RuleInProgress;
> > > > > > };
> > > > > > RewriteRuleBuilder makeRewriteRule();
> > > > > > ```
> > > > > I see your point, but do you think it might be enough to improve the comments on the class? My concern with a builder is the mental burden on the user of another concept (the builder) and the need for an extra `.build()` everywhere. To a lesser extent, I also don't love the cost of an extra copy, although I doubt it matters and I suppose we could support moves in the build method.
> > > > The issues with the builder interface is that it requires lots of boilerplate, which is hard to throw away when reading the code of the class. I agree that having a separate builder class is also annoying (more concepts, etc).
> > > > 
> > > > Keeping them separate would be my personal preference, but if you'd prefer to keep it in the same class than maybe move the non-builder pieces to the top of the class and separate the builder methods with a comment. 
> > > > WDYT? 
> > > > 
> > > > > To a lesser extent, I also don't love the cost of an extra copy, although I doubt it matters and I suppose we could support moves in the build method.
> > > > I believe we can be as efficient (up to an extra move) with builders as without them. If most usages are of the form `RewriteRule R = rewriteRule(...).change(...).replaceWith(...).because(...);`
> > > > Then we could make all builder functions return r-value reference to a builder and have an implicit conversion operator that would consume the builder, producing the final `RewriteRule`:
> > > > ```
> > > > class RewriteRuleBuilder {
> > > >   operator RewriteRule () &&;
> > > >   /// ...
> > > > };
> > > > RewriteRuleBuilder rewriteRule();
> > > > 
> > > > void addRule(RewriteRule R);
> > > > void clientCode() {
> > > >   addRule(rewriteRule().change(Matcher).replaceWith("foo").because("bar"));
> > > > }
> > > > ```
> > > I didn't realize that implicit conversion functions are allowed. With that, I'm fine w/ splitting.   Thanks!
> > Have you uploaded the new version? I don't seem to see the split.
> I have now.  FWIW, I've left both ref overloads, but what do you think of dropping the lvalue-ref overloads and only supporting rvalue refs?  Users can still pretty easily use an lvalue, just by inserting a trivial std::move() around the lvalue.  
Yeah, LG, having only a single overload means less boilerplate and the fluent-builder APIs are mostly used exclusively as r-values anyway.

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:38
+/// @{
+using ast_matchers::CXXCtorInitializerMatcher;
+using ast_matchers::DeclarationMatcher;
----------------
I'm not sure if this is in the LLVM style guide, but we might want to avoid introducing these names into `clang::tooling` namespaces in the headers.

My fear is that users will rely on those using without knowing that explicitly and won't add corresponding `using` directives or qualifiers to their `.cpp` files, making refactoring and moving the code around harder.

Could you fully-qualify those names in the header instead? There does not seem to be too many of them.

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:65
+
+using TextGenerator = std::function<llvm::Expected<std::string>(
+    const ast_matchers::MatchFinder::MatchResult &)>;
----------------
Why would a `TextGenerator`  fail?
I imagine all of the failure cases are programming errors (matchers in the rewrite rule were not aligned with the corresponding text generating function). For those cases, using the `assert` macro seems cleaner.

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:116
+  // never be the empty string.
+  std::string Target = RootId;
+  ast_type_traits::ASTNodeKind TargetKind;
----------------
NIT: maybe move all inits to the constructor?
To have all initializers in one place.

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:170
+
+  RewriteRuleBuilder(const RewriteRuleBuilder &) = default;
+  RewriteRuleBuilder(RewriteRuleBuilder &&) = default;
----------------
NIT: maybe remove the `=default` copy and move ctors and assignments?
They should be generated automatically anyway, right?

================
Comment at: clang/include/clang/Tooling/Refactoring/Transformer.h:191
+  RewriteRuleBuilder &&because(TextGenerator Explanation) &&;
+  RewriteRuleBuilder &&because(StringRef Explanation) && {
+    return std::move(std::move(*this).because(text(Explanation)));
----------------
NIT: maybe accept a `std::string` to let the clients pass a `std::string` if they already have it, avoiding extra copies in some cases?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:28
+
+namespace clang {
+namespace tooling {
----------------
Other files in the tooling library seem to be adding `using namespace clang` instead of putting the declaration into a namespace.
Could you please change the new code to do the same for consistency?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:30
+namespace tooling {
+namespace {
+using ::clang::ast_matchers::MatchFinder;
----------------
Why put using directives into an anonymous namespace?
I have not seen this pattern before, could you point me to explanations on why this is useful?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:44
+
+static bool isOriginMacroBody(const clang::SourceManager &source_manager,
+                              clang::SourceLocation loc) {
----------------
NIT: use `UpperCamelCase` for local variable names to conform to LLVM style.

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:110
+
+// Requires verifyTarget(node, target_part) == success.
+static CharSourceRange getTarget(const DynTypedNode &Node, ASTNodeKind Kind,
----------------
NIT: maybe add the corresponding assertion at the start of the function?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:110
+
+// Requires verifyTarget(node, target_part) == success.
+static CharSourceRange getTarget(const DynTypedNode &Node, ASTNodeKind Kind,
----------------
ilya-biryukov wrote:
> NIT: maybe add the corresponding assertion at the start of the function?
s/node/Node
s/target_part/TargetPart

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:111
+// Requires verifyTarget(node, target_part) == success.
+static CharSourceRange getTarget(const DynTypedNode &Node, ASTNodeKind Kind,
+                                 NodePart TargetPart, ASTContext &Context) {
----------------
NIT: consider merging `verifyTarget` into `getTarget` and making `getTarget` return `Expected<CharSourceRange>`.
Would allow avoiding to write one of the complicated switches and error-checking arguably looks just as natural in the `getTarget` as it is in `verifyTarget`.

Also, having the invariants closer to the code using them makes it easier to certify both are correct, e.g. seeing that `NamedDecl.isIdentifier()` was checked before accessing the `NamedDecl.getName()` in the same function is simpler.

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:134
+      auto R = CharSourceRange::getTokenRange(TokenLoc, TokenLoc);
+      // Verify that the range covers exactly the name. FIXME: extend this code
+      // to support cases like `operator +` or `foo<int>` for which this range
----------------
NIT: start a fixme at a new line to make it more visible.
Or is it `clang-format` merging it into the previous line?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:144
+    if (const auto *E = Node.get<clang::DeclRefExpr>()) {
+      TokenLoc = E->getLocation();
+      break;
----------------
This could be `operator+` too, right?
```
struct X {
  X operator+(int);

  void test() {
    X (X::*ptr)(int) = &X::operator+;
  }
};
```

Would probably want to bail out in this case as well.

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:160
+                                          const MatchResult &Result) {
+  // Ignore results in failing TUs or those rejected by the where clause.
+  if (Result.Context->getDiagnostics().hasErrorOccurred())
----------------
The comment still mentions the where clause. A leftover from the previous version?

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:199
+
+// `Explanation` is a `string&`, rather than a `string` or `StringRef` to save
+// an extra copy needed to intialize the captured lambda variable.  After C++14,
----------------
Maybe use `std::string` in the public interface anyway? 
- This function is definitely not a bottleneck, so an extra copy is not a big deal for performance.
- Using `std::string` would let the clients save a copy in their client code if they can (by using `std::move` or creating an r-value string in the first place).
- We will have a copy in the body of our function (we have it anyway). We'll eliminate it after switching to C++14 in the body of our function without changing the clients.

================
Comment at: clang/lib/Tooling/Refactoring/Transformer.cpp:228
+  if (auto Err = AC.replace(*Result.SourceManager, Range, Change.Replacement)) {
+    AC.setError(llvm::toString(std::move(Err)));
+  }
----------------
NIT: remove braces around a single statement.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59376/new/

https://reviews.llvm.org/D59376