[llvm-dev] [cfe-dev] [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM

Chris Tetreault via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 2 10:04:03 PDT 2020


Seems reasonable to me. Much better than putting a bunch of Fortran stuff in clang.

I wonder if, instead of just putting it in llvm, it makes sense to have some sort of "llvm-frontend" subproject? It could contain all the language-agnostic frontend stuff being pulled out of clang, as well as the clang and flang frontends. We already kind of have this divide, with clang being the only in-tree frontend, having its own subproject outside of llvm and having all of this stuff.

Thanks,
   Christopher Tetreault

-----Original Message-----
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Andrzej Warzynski via cfe-dev
Sent: Tuesday, June 2, 2020 5:08 AM
To: llvm-dev at lists.llvm.org; cfe-dev at lists.llvm.org; flang-dev at lists.llvm.org
Subject: [EXT] [cfe-dev] [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM

*TL;DR*

We propose some non-trivial refactoring in Clang and LLVM to enable further work on the Flang driver.

*SUMMARY*
We would like to start extracting the driver/frontend code from Clang (alongside the code that the driver/frontend depends on, e.g. Diagnostics) and move the components that could be re-used by non-C-based languages to LLVM. From our initial investigation we see that these changes will impact many projects (upstream and downstream) and will require big mechanical patches (our first attempt is implemented in [8]). This is not ideal, but seems unavoidable in the long term. We would like to do this refactoring _before_ we start implementing the Flang driver upstream (OPTION 1 below). This way we avoid:

* contaminating Clang with Fortran-specific code (and vice versa)
* introducing a dependency on Clang in Flang

The downside is that the refactoring is likely to be disruptive for many projects that use Clang. We will try our best to minimise this.

Does this approach make sense? Are there any preferred alternatives? At this stage we'd like to discuss the overall direction. If folks are in favour, we'll send a separate RFC with a finer breakdown and more technical details for the refactoring.

Below you will find more context for our use-case (the Flang driver) and possible alternatives. We hope that this will help the discussion. We would really appreciate your feedback!


*BACKGROUND*
Flang (formerly known as F18) has recently been merged into LLVM [1]. Our ambition, as a community, is to make it as flexible, robust and nice to work with as Clang. One of the major items to address is the implementation of a driver that would provide flexibility and a user experience similar to those available in Clang. The F18/Flang driver was already discussed on cfe-dev last year [2], but back then F18 (now llvm-project/flang) was a separate project. In the original proposal it was assumed that initially Flang would depend on (and extend where necessary) Clang's driver/frontend code. Since F18/Flang was an independent project, the refactoring of Clang/LLVM wasn't really considered. That design has been challenged since ([3], [10]), and not much progress has been made. We would like to revisit that RFC from a slightly different angle. Since Flang is now part of LLVM's monorepo, we feel that refactoring Clang/LLVM _before_ we upstream the driver makes a lot of sense and is the natural first step.

*ASSUMPTIONS & DESIGN GOALS*
1. We will re-use as much of Clang's driver/frontend code as possible (this was previously proposed in [2]).

2. We want to avoid dependencies from Flang to Clang, both long-term (strong requirement) and short-term (might be difficult to achieve). This has recently come up in a discussion on one of our early patches [3] (tl;dr Steve Scalpone, the code owner of Flang, would prefer us to avoid this dependency), and was also suggested before by Eric Christopher [10].

3. We will move the code that can be shared between Flang and Clang (and other projects) to LLVM. This idea has already come up on llvm-dev before [7] (in a slightly different context, and to a slightly different extent). The methods that are not language specific would be shared in an LLVM library.

4. The classes/types/methods that need specific changes for Fortran will be "copied" to Flang and adapted as needed. We should minimize (or even eliminate) any Fortran-specific code in Clang and make sure that it lives in llvm-project/flang instead.

*FLANG'S DEPENDENCIES ON CLANG*
These are the dependencies on Clang that we have identified so far while prototyping the Flang driver.

1. All the machinery related to Diagnostics & SourceLocation.

This is currently part of libclangBasic [4] and is used in _many_ places in Clang. The official documentation [5] suggests that this could be re-used for non-C-based languages. In particular, we feel that it would make a lot of sense for Flang to use it. Also, separating Clang's driver/frontend code from the diagnostics would require a lot of refactoring for no real benefit (and we feel that Flang should re-use Clang's driver/frontend code, see below). This dependency is used in many places, so moving it to LLVM will require a lot of (mostly) mechanical changes. We can't see an obvious way to split it into smaller chunks (see also below where we discuss the impact).

2. libclangFrontend & libclangDriver

The Flang driver will use many methods from libClangDriver, libClangFrontend and libClangFrontendTool. Driver.h and Compilation.h from libClangDriver are responsible for calling the driver, passing it the correct arguments and executing it. TextDiagnosticPrinter.h takes care of printing the driver diagnostics in case of errors.

The Flang frontend will use CompilerInstance, CompilerInvocation, FrontendOptions, FrontendActions and Utils from libClangFrontend and libClangFrontendTool. These are responsible for translating the command line arguments to frontend Options and later to Actions to be executed by ExecuteCompilerInvocation. The translation from arguments to Actions happens with FrontendOptions and FrontendActions, but it is the CompilerInvocation that has the pointers for the sequence of Actions that are required in a CompilerInstance. These are needed to implement the Flang driver/frontend and contain actions/methods/functions that seem to be language agnostic.
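
To make the flow described above a bit more concrete, here is a minimal C++ sketch of it: arguments are translated to options, an option selects an action, and the action is executed. This is a toy model under stated assumptions, not Clang's API. The type names echo the classes mentioned above, but every member, signature and the `sketch` namespace are hypothetical and chosen only for illustration.

```cpp
// Toy model only: none of these types are Clang's. The names echo the
// classes mentioned above, but the members and signatures are hypothetical.
#include <cassert>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

enum class ActionKind { ParseSyntaxOnly, EmitObj };

// Stand-in for FrontendOptions: the parsed form of the command line.
struct FrontendOptions {
  ActionKind ProgramAction = ActionKind::EmitObj;
  std::vector<std::string> Inputs;
};

// Stand-in for CompilerInvocation: owns the options built from the arguments.
struct CompilerInvocation {
  FrontendOptions FrontendOpts;

  // Returns false on an unknown flag; a real driver would emit a diagnostic.
  static bool createFromArgs(CompilerInvocation &Res,
                             const std::vector<std::string> &Args) {
    for (const std::string &Arg : Args) {
      if (Arg == "-fsyntax-only")
        Res.FrontendOpts.ProgramAction = ActionKind::ParseSyntaxOnly;
      else if (Arg == "-c")
        Res.FrontendOpts.ProgramAction = ActionKind::EmitObj;
      else if (!Arg.empty() && Arg[0] == '-')
        return false;
      else
        Res.FrontendOpts.Inputs.push_back(Arg);
    }
    return true;
  }
};

// Language-agnostic action interface; each frontend supplies its own subclasses.
struct FrontendAction {
  virtual ~FrontendAction() = default;
  virtual std::string run(const std::string &Input) = 0;
};

struct SyntaxOnlyAction : FrontendAction {
  std::string run(const std::string &Input) override {
    return "checked " + Input;
  }
};

struct EmitObjAction : FrontendAction {
  std::string run(const std::string &Input) override {
    return "compiled " + Input;
  }
};

// Stand-in for CompilerInstance: holds the invocation and a log of what ran.
struct CompilerInstance {
  CompilerInvocation Invocation;
  std::vector<std::string> Log;
};

// Maps the selected option to an action and runs it over every input.
inline bool executeCompilerInvocation(CompilerInstance &CI) {
  std::unique_ptr<FrontendAction> Act;
  switch (CI.Invocation.FrontendOpts.ProgramAction) {
  case ActionKind::ParseSyntaxOnly:
    Act = std::make_unique<SyntaxOnlyAction>();
    break;
  case ActionKind::EmitObj:
    Act = std::make_unique<EmitObjAction>();
    break;
  }
  for (const std::string &Input : CI.Invocation.FrontendOpts.Inputs)
    CI.Log.push_back(Act->run(Input));
  return !CI.Log.empty();
}

} // namespace sketch
```

In this sketch only the two concrete action subclasses would be frontend specific; the argument translation and the execution loop do not mention any source language, which is the kind of code we would expect to be shareable.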

*ALTERNATIVES*
This is a summary of the alternative ways of implementing the Flang driver. We propose OPTION 1. If there are no major objections, we will draft a separate RFC with more technical details (we will also break it down into smaller pieces). Otherwise, what would be your preferred alternative and why?

OPTION 1
We avoid a dependency on Clang from day 1.

This is the ideal scenario that would guarantee that Clang and Flang are completely separate and that the common bits stay in LLVM instead. It would mean slower progress for us initially, but then other projects could benefit from the refactoring sooner rather than later.

OPTION 2
We avoid dependency on clangBasic from day 1, but initially allow dependency on libClangFrontend & libClangDriver (or other libs specific to the driver/frontend).

The dependency on libclang{Driver|Frontend} would gradually be removed/refactored out as the driver for Flang gains momentum. As mentioned earlier, there is plenty of code in libClangFrontend and libClangDriver that we'd like to re-use, but the separation between code that's specific to C-based languages and generic driver/frontend code is not always obvious. We think that refactoring the common bits in libClangFrontend and libClangDriver might simply be easier once:

  * we have a Flang driver that leverages these libraries, and, as a result,
  * we understand better what we could re-use and what's not that relevant to non-C-based languages.

OPTION 3
We initially keep the dependency on Clang and re-visit this RFC later.

This would be the least disruptive approach (at least for the time being) and would allow us to make the most rapid progress (i.e. we would be focusing on implementing the features rather than refactoring). It would also inform the future refactoring better. But it was already pointed out that we should avoid dependencies on clang [3] and this would be a step in the opposite direction. Also, the build requirements for Flang would increase, and we feel that we should strive to reduce them instead [6].

If we missed any alternatives, please bring them up.

*IMPACT ON OTHER PROJECTS*
The refactoring will have non-trivial impact on other projects:

* OPTION 1 and OPTION 2 - huge impact initially.
* OPTION 3 - no impact initially, but most likely an impact similar to that of OPTION 1 and OPTION 2 in the long term.

From our initial investigation, extracting Diagnostics/SourceLocation from clangBasic and moving it to LLVM will be the most impactful change. Within llvm-project it is used in clang, clang-tools-extra, lldb and polly. Most of the changes will be mechanical, but will require touching many files. In order to get to a state where we could build libclang using the newly defined LLVM library, we had to touch ~850 files and make ~30k insertions/deletions. The result of this exercise is available in our development fork of llvm-project [8].

Please note: our patches on GitHub [8] are just experiments to illustrate the idea. It's work-in-progress that requires a lot of polishing. When/if up-streaming this, we would need to do some low-impact refactoring first. For example, currently ASTReader & ASTWriter are `friends` with DiagnosticsEngine [9]. That won't be possible when DiagnosticsEngine is moved to LLVM.
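
As an illustration of the kind of low-impact refactoring meant here, the following C++ sketch shows one possible way to drop a `friend` relationship: the shared class exposes a narrow read-only accessor and the frontend-side class uses only that. This is an assumption about how it could be done, not a description of the actual patches. The classes borrow the names DiagnosticsEngine and ASTWriter for readability only; the `sketch` namespace, the `Severity` enum, the private state and all member functions are hypothetical and do not reflect the real members in Diagnostic.h.

```cpp
// Hypothetical sketch: these classes only borrow names for illustration and
// do not reflect the real members of clang::DiagnosticsEngine.
#include <cassert>
#include <map>
#include <string>

namespace sketch {

enum class Severity { Ignored, Warning, Error };

// Would live in the shared (LLVM-side) library, so it cannot name
// frontend-specific classes such as an AST writer as friends.
class DiagnosticsEngine {
  std::map<unsigned, Severity> Mappings; // private state

public:
  void setSeverity(unsigned DiagID, Severity S) { Mappings[DiagID] = S; }

  // Narrow read-only accessor used instead of friend access.
  const std::map<unsigned, Severity> &getSeverityMappings() const {
    return Mappings;
  }
};

// Would stay in the frontend; serializes state via the public accessor only.
class ASTWriter {
public:
  std::string writeDiagState(const DiagnosticsEngine &Diags) const {
    std::string Out;
    for (const auto &Entry : Diags.getSeverityMappings())
      Out += std::to_string(Entry.first) + ":" +
             std::to_string(static_cast<int>(Entry.second)) + ";";
    return Out;
  }
};

} // namespace sketch
```

The trade-off in this approach is that the accessor becomes part of the shared library's public interface, so it is worth keeping it read-only and as small as the frontend actually needs.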


On behalf of the Arm Fortran Team,
Andrzej Warzynski

REFERENCES

[1] https://github.com/llvm/llvm-project/commit/b98ad941a40c96c841bceb171725c925500fce6c
[2] http://lists.llvm.org/pipermail/cfe-dev/2019-June/062669.html
[3] https://reviews.llvm.org/D79092
[4] https://github.com/llvm/llvm-project/blob/ad5d319ee85d31ee2b1ca5c29b3a10b340513fec/clang/lib/Basic/CMakeLists.txt#L45-L47
[5] https://clang.llvm.org/docs/InternalsManual.html#the-clang-basic-library
[6] http://lists.llvm.org/pipermail/flang-dev/2019-November/000061.html
[7] http://lists.llvm.org/pipermail/llvm-dev/2019-November/136743.html
[8] https://github.com/banach-space/llvm-project/commits/andrzej/refactor_clangBasic
[9] https://github.com/llvm/llvm-project/blob/b11ecd196540d87cb7db190d405056984740d2ce/clang/include/clang/Basic/Diagnostic.h#L985-L986
[10] https://reviews.llvm.org/D63607