[cfe-dev] [RFC] Adding refactoring support to Clang
Manuel Klimek via cfe-dev
cfe-dev at lists.llvm.org
Tue Jun 20 01:27:59 PDT 2017
First, thanks for proposing this - this fits pretty much exactly with what
we had envisioned / slowly started working towards, so big thumbs up :)
Others have already made a lot of interesting observations / points, adding
a couple of thoughts inline.
On Fri, Jun 16, 2017 at 1:57 AM Alex Lorenz <aleksei_lorenz at apple.com>
wrote:
>
> Hi,
>
> This is a proposal for a new Clang-based refactoring engine (as indicated
> by Duncan last week [1]). The new refactoring engine should be based on
> both the existing code from llvm.org and the code that I've just
> committed to github.com/apple/swift-clang [2] (Please note that it has
> some outdated designs and APIs that I'd like to replace). This email is
> split into three different sections:
> - The first section lists the refactoring actions supported by Xcode 9.
> - The second section describes the actual proposal.
> - The third section outlines an action plan that I would like to follow
> when submitting patches for the new refactoring engine.
>
> I'm looking forward to any feedback that you might have!
>
> # Background
>
> At first I would like to provide a little bit of background on the
> C++/Objective-C refactorings in Xcode. Xcode 9 contains a Clang-based
> refactoring engine that supports several refactoring operations. The
> primary operation that it supports is “Rename”. The refactoring engine
> gathers the renamed symbol occurrences using two distinct mechanisms:
>
> - Local rename. It gathers occurrences for declarations that are declared
> in a function's body. It is based on existing code from `clang-rename`
> that was moved over to Clang's `Tooling` library. It was then modified
> slightly to fit our goals. I also added support for some Objective-C
> declarations.
>
> - Global rename. It gathers occurrences for all other declarations. We
> decided to put the high-level logic for global rename into Xcode's indexer
> because Xcode had to support cross-language rename (between Swift and
> Objective-C). The indexer uses the core engine in Clang’s lib/Index, so
> Clang is still very much involved in this process. The indexer also
> delegates low-level “indexed-file” renames to Clang. This lets us gather
> occurrences for symbolic references that don't have the full location
> information (like Objective-C selector names), textual matches for
> occurrences of a matching string in comments, and textual matches for a
> matching filename in the `#include` directives.
>
> Xcode 9 also support some local refactoring actions:
> - Extract function/method.
> - Extract duplicate expression.
> - Add missing switch cases (GIF:
> https://reviews.llvm.org/file/data/q2qz53jd7ug4cpx4zlfx/PHID-FILE-aq75y26bgkfvm5e775jt/switchCases).
> It is also available as a fixit.
> - [C++] Fill-in method stubs from abstract base classes.
> - Convert if to switch.
> - [ObjC] Fill in protocol stubs. It is also available as a fixit.
> - [ObjC] Wrap in NSLocalizedString
>
> Another action that Xcode supports is called “Generate Missing Function
> Definitions” (GIF:
> https://reviews.llvm.org/file/data/u4kzt62wjdin5dpl65jl/PHID-FILE-w7jl2vz42b5aybd6oizj/generateFunctionDefinitions).
> This action actually depends on the indexer to determine the set of methods
> that have to be implemented and where should the generated code go. It’s
> also the one and only cross-TU refactoring action that is currently
> supported (without taking "rename" into account).
>
> # Refactoring Engine Proposal
>
> I would like to create a new refactoring engine that will go into Clang's
> `Tooling` library. The engine's API should be designed with different use
> cases in mind, and I would like to ensure that the engine accommodates the
> following three use cases:
>
> - Libclang. It should provide a C-based refactoring API that we can use
> in Xcode.
> - Clangd.
> - A standalone refactoring tool, `clang-refactor`. This tool will be used
> for testing. It will also allow users to perform refactoring actions
> without using other tools.
>
> The engine should try to follow the following high level goals:
> - It should be easy for others to create, test, and integrate new local
> refactoring operations.
> - Refactoring operations should be allowed to fail. The engine should
> have good error reporting mechanisms.
> - Initiation via source selection should be quick (if we assume that we
> have access to a parsed AST) as we'd like to let a user know which actions
> are available as soon as possible.
> - The engine should allow at least a basic level of configurability and
> customization (users should be able select options for a refactoring action
> if an action chooses to provide some). In the future we might also want to
> provide more powerful configuration mechanisms that will allow IDEs to
> generate custom UIs for different actions.
>
> ## Initiation
>
> I think that there are two equally important modes of initiation that
> should be supported by Clang's refactoring engine:
>
> - Initiation via source selection. This mode is important for IDEs.
> - Initiation via AST queries. This mode is important for standalone tools
> / scripts that want to refactor an entire project.
>
> Unfortunately our code doesn't provide a good enough abstraction that
> generalizes well for different actions. I'm still hashing out a better
> initiation scheme, and I will send a separate RFC with our proposal for
> initiation in the near future. Let me know if you have any
> ideas/suggestions!
>
> ## Refactoring
>
> I would like to propose keeping the two modes that were previously
> described for the "Rename" operation. This will probably mean that a
> standalone tool like `clang-refactor` won't support the global rename
> mode of rename. On the other hand, if Clangd will gain some form of
> indexing infrastructure (which seems likely given the recent interest in
> indexing [3]), it will be able to use the engine's global renaming
> infrastructure as well. I'm hoping the community will provide some input on
> what should happen to `clang-rename`, as I'm not sure whether it should
> stay or be deprecated in favour of new tools.
>
> Other refactoring operations will primarily work with source
> modifications. I think that the refactoring engine should provide a set of
> shared utility components that can be reused by different operations. For
> example, things like return type deduction for extracted declarations
> should be accessible to all operations. This kind of design should also
> make testing easier.
>
> Certain refactoring actions (e.g. "Fill-in switch cases") work well
> together with diagnostic fix-its, as they can be presented in an editor and
> a user can initiate a refactoring operation using a UI that's already
> familiar to them. I would like to ensure that this kind of mechanism is
> accessible to refactoring operations that need it. Our current code
> contains two actions that are implemented in the `Edit` library. They're
> usable from `Sema`, and we take advantage of that to generate the fix-its
> during semantic analysis.
>
> ## Cross-TU Operations & Indexer Queries
>
> The majority of refactoring actions that we've implemented perform source
> transformations in a single translation unit (TU). However, there are some
> actions that need to work across TU boundaries. Xcode 9 includes one such
> cross-TU action - “Generate Missing Function Definitions”. A naïve cross-TU
> solution that I've looked at initially was based on an API that had a
> series of callbacks between Xcode's indexer and libclang. However, I
> quickly discovered that this model didn't work that well, as it interfered
> with the code that loaded and handled different TUs in the indexer.
>
> Note: while "Rename" is a global operation, it doesn't actually interact
> with other TUs because source replacements can be derived from the data
> stored in the project's index. Thus, our implementation doesn't treat it as
> a cross-TU operation.
>
> The current implementation of the “Generate Missing Function Definitions"
> action is based on the "refactoring continuations" API. A refactoring
> continuation is just a suspended refactoring operation that can be executed
> in a TU that's different to the one in which the operation was created.
> Each continuation includes TU-specific state that's automatically managed
> by Clang, as well as a set of questions about that state. Each individual
> question is called an "indexer query".
>
> Indexer queries are the core structures that enable cross-TU actions. An
> external component (like Xcode's indexer) has to evaluate them before
> running the continuation in order for the operation to work correctly. Each
> query can be described using four different attributes:
>
> - Question: The kind of question that should be answered. Our
> implementation exposes the following two kinds at the libclang level: `
> Decl_FileThatShouldImplement` and `Decl_IsDefined`.
> - Subject: The TU-specific state from the parent continuation that acts
> as the question's subject.
> - Result: The value that has to be set by the external indexer component
> before running the refactoring continuation.
> - Action: A flag that represents an additional action that has to be
> performed by the indexer before running the continuation. We use the `
> RunContinuationInTUThatHasThisFile` action in Xcode's indexer to
> determine which TU should be loaded and used when running the refactoring
> continuation.
>
> Please note that while I mention "Xcode's indexer" quite a lot in this
> section, our refactoring continuation API is indexer-agnostic, and is
> designed to work with any indexer. The indexer just has to implement
> support for the indexer queries that are required for a particular
> refactoring operation, and the refactoring engine handles everything else.
> The code sample below shows how our code constructs the refactoring
> continuation for the "Generate Missing Function Definitions" action:
>
> ```
> return continueInExternalASTUnit(
> /*Indexer query that gets annotated with the
> RunContinuationInTUThatHasThisFile action */
> fileThatShouldContainImplementationOf(Container),
> /*Pointer to the continuation function*/ runInImplementationAST,
> /*A state value*/ Container,
> /*An additional indexer query over some state*/
> filter(llvm::makeArrayRef(SelectedMethods), [](const DeclEntity &D) {
> return !D.isDefined(); }));
> ```
>
I think the harder design space will be how to run something on "all files
containing references to <X>". What we definitely need is an interface that
allows the ops to be sharded out across a fleet of machines (we use C++ ;)
based on the expected amount of work. This has some restrictions on the
design, specifically also is why we use SourceManager independent classes
to handle replacements etc.
Generally, I fully agree that we can get an interface here that is indexer
independent.
>
> In practical terms, the following series of events occur in Xcode when it
> performs the “Generate Missing Function Definitions” operation:
>
> - After the user chooses the "Generate Missing Function Definitions”
> action, Xcode notifies its indexer that it has to perform that action. The
> indexer then uses libclang to get the refactoring results after running the
> action's operation in the initiation TU.
> - Xcode's indexer sees that the results provide a refactoring
> continuation instead of source replacements, so it starts looking at the
> attached indexer queries.
> - Xcode's indexer evaluates each query by looking at the question's kind
> and the TU-specific state (A set of `CXCursor`s). It sends back the
> result to libclang using a set of query-agnostic routines that consume
> different data types.
> - After evaluating the queries, Xcode's indexer loads the TU in which the
> refactoring continuation should run. It knows which TU it has to load by
> looking at the result of the query that included the `
> RunContinuationInTUThatHasThisFile` action flag.
> - Xcode's indexer then invokes the refactoring continuation in the
> freshly loaded TU. Clang converts the continuation's state from previously
> serialized TU-independent state to a state that's specific to the new TU,
> and continues running the refactoring operation with this state.
> - Xcode's indexer then receives the source replacements produced by the
> refactoring operation and passes them back to Xcode so that it can apply
> the replacements to the source.
>
> I would like to propose an API model that's based on our current
> "refactoring continuations" API for the cross-TU operations in the new
> refactoring engine. Clang will provide a C++ API that can be used in a way
> that's similar to the code presented above. It will also manage all of the
> interactions with libclang and Clangd, so individual refactoring operations
> won't have to worry about the fine details of a particular indexer.
> Furthermore, the new refactoring engine will limit the set of available
> actions based on the set of queries that are supported by an indexer, so
> the indexer that chooses to support refactoring continuations won't have to
> support all of the queries.
>
> ## Testing & Tooling
>
> The `clang-refactor` tool will be used to test the refactoring engine. I
> would like to propose the following action-agnostic command line interface:
>
> ```
> clang-refactor <action> [options] <initiation file> [additional build
> options]
> ```
>
> I think that the tool should use comment-based annotations in the test
> source files to control things like initiation. For example, given a
> selection annotation `// selection: +1:1 -> +2:12` on line 12, `
> clang-refactor` will map it to a selection range of 13:1 -> 14:12 and
> will initiate a certain refactoring action using that selection.
>
Ok, already bikeshedding: can we call it a range instead of a selection?
> In addition to `clang-refactor`, I would also like to provide an
> additional stress-test tool that will work with entire projects (using
> compilation databases), and will be capable of:
>
> - Initiating/performing refactoring actions at each token in all of the
> source files in a project.
> - Verifying the consistency between the indexer and the refactoring
> engine to ensure that they have the same model of the source.
>
> This tool will also come with a script that generates compilation
> databases for Clang's tests.
>
> I also have a great little reviewing tool that helps others to visualize
> the refactoring changes for a particular action, and I'd be more than happy
> to share it if you're interested. The tool generates a set of HTML pages
> that contain the sources of the refactoring tests. The pages get annotated
> to show the changes that were made during refactoring. We found that
> looking at this visualization instead of the raw `CHECK` lines makes
> reviewers' life much easier.
>
Nice! Additionally, I'd also suggest to use unit tests, especially for
small dedicated regression tests where the AST is complex (might be more of
a problem for C++).
>
> # Action Plan
>
> Initially I would like to adopt the following high-level plan for the
> creation of the new refactoring engine:
>
> - Upstream any non-refactoring specific changes from our code.
> - Move the core of `clang-rename` over to clang (`clang-rename` should
> still work as before).
> - Start working on the `clang-refactor` tool so that it can use the
> previously moved code. Move the tests from `clang-rename` over to Clang
> and convert them to `clang-refactor` tests.
> - Submit our changes to the core of the `clang-rename` engine.
> - Create and test the new global rename component.
> - Create an AST source selection component. Initiation via source
> selection will be based on this component.
> - Create a refactoring initiation API that will be easy to use.
> - Start submitting our local refactoring actions.
> - Start working on the libclang API.
> - Create and submit everything else that remains, e.g. Cross-TU
> Operations & Indexer Queries.
>
Sounds generally good. Thanks!
>
> Cheers,
> Alex
>
> [1]: http://lists.llvm.org/pipermail/cfe-dev/2017-June/054108.html
> [2]:
> https://github.com/apple/swift-clang/commit/9890adfbee8f854732d0093bc8b2a32be1be8844
> [3]: http://lists.llvm.org/pipermail/cfe-dev/2017-May/053869.html
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170620/c18926bd/attachment.html>
More information about the cfe-dev
mailing list