[llvm-dev] [cfe-dev] [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM
Mehdi AMINI via llvm-dev
llvm-dev at lists.llvm.org
Tue Jun 2 11:22:48 PDT 2020
On Tue, Jun 2, 2020 at 10:45 AM Chris Tetreault via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Seems reasonable to me. Much better than putting a bunch of Fortran stuff
> in clang.
>
> I wonder if, instead of just putting it in llvm, it makes sense to have
> some sort of "llvm-frontend" subproject? It could contain all the language
> agnostic frontend stuff being pulled out of clang, as well as the clang and
> flang frontends. We currently already kind of have this divide with clang
> being the only in tree frontend, having its own subproject outside of llvm
> and having all of this stuff.
>
+1: it seems like a large part of the code here wouldn't depend on LLVM IR
and could just live on the side at the top-level?
--
Mehdi
>
> Thanks,
> Christopher Tetreault
>
> -----Original Message-----
> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Andrzej
> Warzynski via cfe-dev
> Sent: Tuesday, June 2, 2020 5:08 AM
> To: llvm-dev at lists.llvm.org; cfe-dev at lists.llvm.org;
> flang-dev at lists.llvm.org
> Subject: [EXT] [cfe-dev] [RFC] Refactor Clang: move
> frontend/driver/diagnostics code to LLVM
>
> *TL;DR*
>
> We propose some non-trivial refactoring in Clang and LLVM to enable
> further work on the Flang driver.
>
> *SUMMARY*
> We would like to start extracting the driver/frontend code from Clang
> (alongside the code that the driver/frontend depends on, e.g.
> Diagnostics) and move the components that could be re-used by non-C-based
> languages to LLVM. From our initial investigation we see that these changes
> will impact many projects (upstream and downstream) and will require big
> mechanical patches (our first attempt is implemented in [8]). This is not
> ideal, but seems unavoidable in the long-term. We would like to do this
> refactoring _before_ we start implementing the Flang driver upstream
> (OPTION 1 below). This way we avoid:
>
> * contaminating Clang with Fortran specific code (and vice versa)
> * introducing dependency on Clang in Flang
>
> The downside is that the refactoring is likely to be disruptive for many
> projects that use Clang. We will try our best to minimise this.
>
> Does this approach make sense? Are there any preferred alternatives? At
> this stage we'd like to discuss the overall direction. If folks are in
> favour, we'll send a separate RFC with a finer breakdown and more technical
> details for the refactoring.
>
> Below you will find more context for our use-case (the Flang driver) and
> possible alternatives. We hope that this will help the discussion. We would
> really appreciate your feedback!
>
>
> *BACKGROUND*
> Flang (formerly known as F18) has recently been merged into LLVM [1].
> Our ambition, as a community, is to make it as flexible, robust and nice
> to work with as Clang. One of the major items to address is the
> implementation of a driver that would provide the flexibility and user
> experience similar to that available in Clang. The F18/Flang driver was
> already discussed on cfe-dev last year [2], but back then F18 (now
> llvm-project/flang) was a separate project. In the original proposal it was
> assumed that initially Flang would depend on (and extend where necessary)
> Clang's driver/frontend code. Since F18/Flang was an independent project,
> the refactoring of Clang/LLVM wasn't really considered. That design has
> been challenged since ([3], [10]), and also not much progress has been
> made. We would like to revisit that RFC from a slightly different angle.
> Since Flang is now part of LLVM's monorepo, we feel that refactoring
> Clang/LLVM _before_ we upstream the driver makes a lot of sense and is the
> natural first step.
>
> *ASSUMPTIONS & DESIGN GOALS*
> 1. We will re-use as much of Clang's driver/frontend code as possible
> (this was previously proposed in [2]).
>
> 2. We want to avoid dependencies from Flang to Clang, both long-term
> (strong requirement) and short-term (might be difficult to achieve).
> This has recently come up in a discussion on one of our early patches [3]
> (tl;dr Steve Scalpone, the code owner of Flang, would prefer us to avoid
> this dependency), and was also suggested before by Eric Christopher [10].
>
> 3. We will move the code that can be shared between Flang and Clang (and
> other projects) to LLVM. This idea has already come up on llvm-dev before
> [7] (in a slightly different context, and to a slightly different extent).
> The methods that are not language specific would be shared in an LLVM
> library.
>
> 4. The classes/types/methods that need specific changes for Fortran will
> be "copied" to Flang and adapted as needed. We should minimize (or even
> eliminate) any Fortran-specific code in Clang and make sure that it
> lives in llvm-project/flang.
>
> *FLANG'S DEPENDENCIES ON CLANG*
> These are the dependencies on Clang that we have identified so far while
> prototyping the Flang driver.
>
> 1. All the machinery related to Diagnostics & SourceLocation.
>
> This is currently part of libclangBasic [4] and is used in _many_ places
> in Clang. The official documentation [5] suggests that this could be
> re-used for non-C-based languages. In particular, we feel that it would
> make a lot of sense for Flang to use it. Also, separating Clang's
> driver/frontend code and the diagnostics would require a lot of refactoring
> for no real benefit (and we feel that Flang should re-use Clang's
> driver/frontend code, see below). This dependency is used in many places,
> so moving it to LLVM will require a lot of (mostly) mechanical changes. We
> can't see an obvious way to split it into smaller chunks (see also below
> where we discuss the impact).
>
> 2. libclangFrontend & libclangDriver
>
> The Flang driver will use many methods from libClangDriver,
> libClangFrontend and libClangFrontendTool. Driver.h and Compilation.h from
> libClangDriver are responsible for calling the driver, passing it the
> correct arguments and executing it. TextDiagnosticPrinter.h takes care of
> printing the driver diagnostics in case of errors.
>
> The Flang frontend will use CompilerInstance, CompilerInvocation,
> FrontendOptions, FrontendActions and Utils from libClangFrontend and
> libClangFrontendTool. These methods are responsible for translating the
> command line arguments to frontend Options and later to Actions to be
> executed by ExecuteCompilerInvocation. The translation from arguments to
> Actions happens with FrontendOptions and FrontendActions. But it is the
> CompilerInvocation that has the pointers for the sequence of Actions that
> are required in a CompilerInstance. These methods are needed to implement
> the Flang driver/frontend and contain actions/methods/functions that seem
> to be language agnostic.
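
To make the flow described above concrete (command-line arguments are
translated to frontend options, the options select an action, and the action
is executed), here is a minimal, self-contained sketch. Every name in it
(the `sketch` namespace, `parseArgs`, `createAction`, `executeInvocation` and
the action classes) is a hypothetical stand-in invented for illustration; none
of it is the actual Clang API, and the real classes carry far more state.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins only; these are NOT the real Clang classes.
namespace sketch {

enum class ActionKind { EmitObject, SyntaxOnly, PrintVersion };

// Language-agnostic options filled in from the command line.
struct FrontendOptions {
  ActionKind Action = ActionKind::EmitObject;
  std::vector<std::string> Inputs;
};

// Step 1: translate raw arguments into options.
// Flags this sketch does not know about are silently ignored.
inline FrontendOptions parseArgs(const std::vector<std::string> &Args) {
  FrontendOptions Opts;
  for (const std::string &Arg : Args) {
    if (Arg == "-fsyntax-only")
      Opts.Action = ActionKind::SyntaxOnly;
    else if (Arg == "--version")
      Opts.Action = ActionKind::PrintVersion;
    else if (!Arg.empty() && Arg[0] != '-')
      Opts.Inputs.push_back(Arg);
  }
  return Opts;
}

// Step 2: an action is the unit of work selected by the options.
struct FrontendAction {
  virtual ~FrontendAction() = default;
  virtual std::string run(const FrontendOptions &Opts) = 0;
};

struct EmitObjectAction : FrontendAction {
  std::string run(const FrontendOptions &Opts) override {
    return "compiled " + std::to_string(Opts.Inputs.size()) + " input(s)";
  }
};

struct SyntaxOnlyAction : FrontendAction {
  std::string run(const FrontendOptions &Opts) override {
    return "checked " + std::to_string(Opts.Inputs.size()) + " input(s)";
  }
};

struct PrintVersionAction : FrontendAction {
  std::string run(const FrontendOptions &) override {
    return "sketch driver 0.0";
  }
};

inline std::unique_ptr<FrontendAction> createAction(ActionKind Kind) {
  switch (Kind) {
  case ActionKind::SyntaxOnly:
    return std::make_unique<SyntaxOnlyAction>();
  case ActionKind::PrintVersion:
    return std::make_unique<PrintVersionAction>();
  case ActionKind::EmitObject:
    break;
  }
  return std::make_unique<EmitObjectAction>();
}

// Step 3: tie the pieces together (arguments -> options -> action -> run).
inline std::string executeInvocation(const std::vector<std::string> &Args) {
  FrontendOptions Opts = parseArgs(Args);
  std::unique_ptr<FrontendAction> Act = createAction(Opts.Action);
  return Act->run(Opts);
}

} // namespace sketch
```

Nothing in this sketch mentions C or Fortran, which is the property that makes
this layer a candidate for sharing; a language-specific frontend would only
need to supply its own action subclasses.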
>
> *ALTERNATIVES*
> This is a summary of the alternative ways of implementing the Flang
> driver. We propose OPTION 1. If there are no major objections, we will
> draft a separate RFC with more technical details (we will also break it
> down into smaller pieces). Otherwise, what would be your preferred
> alternative and why?
>
> OPTION 1
> We avoid dependency on Clang from Day 1.
>
> This is the ideal scenario that would guarantee that Clang and Flang are
> completely separate and that the common bits stay in LLVM instead. It would
> mean slower progress for us initially, but then other projects could
> benefit from the refactoring sooner rather than later.
>
> OPTION 2
> We avoid dependency on clangBasic from day 1, but initially allow
> dependency on libClangFrontend & libClangDriver (or other libs specific to
> the driver/frontend).
>
> The dependency on libclang{Driver|Frontend} would gradually be
> removed/refactored out as the driver for Flang gains momentum. As mentioned
> earlier, there is plenty of code in libClangFrontend and libClangDriver
> that we'd like to re-use, but the separation between code that's specific
> to C-based languages and generic driver/frontend code is not always
> obvious. We think that refactoring the common bits in libClangFrontend and
> libClangDriver might simply be easier once:
>
> * we have a Flang driver that leverages these libraries, and, as a
> result,
> * we understand better what we could re-use and what's not that relevant
> to non-C-based languages.
>
> OPTION 3
> We initially keep the dependency on Clang and re-visit this RFC later.
>
> This would be the least disruptive approach (at least for the time
> being) and would allow us to make the most rapid progress (i.e. we
> would be focusing on implementing the features rather than refactoring).
> It would also inform the future refactoring better. But it was already
> pointed out that we should avoid dependencies on clang [3] and this would
> be a step in the opposite direction. Also, the build requirements for Flang
> would increase, and we feel that we should strive to reduce them instead
> [6].
>
> If we missed any alternatives, please bring them up.
>
> *IMPACT ON OTHER PROJECTS*
> The refactoring will have non-trivial impact on other projects:
>
> * OPTION 1 and OPTION 2 - huge impact initially.
> * OPTION 3 - no impact initially, but most likely similar impact as OPTION
> 1 and OPTION 2 in the long term.
>
> From our initial investigation, extracting Diagnostics/SourceLocation
> from clangBasic and moving it to LLVM will be the most impactful change.
> Within llvm-project it is used in clang, clang-tools-extra, lldb and
> polly. Most of the changes will be mechanical, but will require touching
> many files. In order to get to a state where we could build libclang using
> the newly defined LLVM library, we had to touch ~850 files and make ~30k
> insertions/deletions. The result of this exercise is available in our
> development fork of llvm-project [8].
>
> Please note: our patches on GitHub [8] are just experiments to illustrate
> the idea. It's work-in-progress that requires a lot of polishing. When/if
> up-streaming this, we would need to do some low-impact refactoring first.
> For example, currently ASTReader & ASTWriter are `friends` with
> DiagnosticsEngine [9]. That won't be possible when DiagnosticsEngine is
> moved to LLVM.
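
One possible shape for such a low-impact refactoring, sketched under
assumptions: instead of granting `friend` access, the engine exposes a narrow
public snapshot interface that serialization code in another library can use.
The names below (`shared::DiagEngine`, `State`, `getState`, `restoreState`,
`client::Writer`, `client::Reader`) are hypothetical, and the state shown is
invented; this does not reflect what ASTReader and ASTWriter actually access
in DiagnosticsEngine.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a diagnostics engine living in a shared library.
namespace shared {

class DiagEngine {
public:
  // Plain-data snapshot of the state a serializer needs (invented fields).
  struct State {
    bool WarningsAsErrors = false;
    std::vector<std::string> IgnoredGroups;
  };

  void setWarningsAsErrors(bool V) { Current.WarningsAsErrors = V; }
  void ignoreGroup(std::string Group) {
    Current.IgnoredGroups.push_back(std::move(Group));
  }

  // Narrow public interface used in place of friend declarations.
  const State &getState() const { return Current; }
  void restoreState(State S) { Current = std::move(S); }

private:
  State Current;
};

} // namespace shared

// Hypothetical clients in a language-specific library; no friendship needed.
namespace client {

struct Writer {
  static shared::DiagEngine::State save(const shared::DiagEngine &D) {
    return D.getState();
  }
};

struct Reader {
  static void load(shared::DiagEngine &D, shared::DiagEngine::State S) {
    D.restoreState(std::move(S));
  }
};

} // namespace client
```

The trade-off in this sketch is that the snapshot type becomes part of the
shared library's public interface, so it has to stay free of anything
language-specific.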
>
>
> On behalf of the Arm Fortran Team,
> Andrzej Warzynski
>
> REFERENCES
>
> [1] https://github.com/llvm/llvm-project/commit/b98ad941a40c96c841bceb171725c925500fce6c
> [2] http://lists.llvm.org/pipermail/cfe-dev/2019-June/062669.html
> [3] https://reviews.llvm.org/D79092
> [4] https://github.com/llvm/llvm-project/blob/ad5d319ee85d31ee2b1ca5c29b3a10b340513fec/clang/lib/Basic/CMakeLists.txt#L45-L47
> [5] https://clang.llvm.org/docs/InternalsManual.html#the-clang-basic-library
> [6] http://lists.llvm.org/pipermail/flang-dev/2019-November/000061.html
> [7] http://lists.llvm.org/pipermail/llvm-dev/2019-November/136743.html
> [8] https://github.com/banach-space/llvm-project/commits/andrzej/refactor_clangBasic
> [9] https://github.com/llvm/llvm-project/blob/b11ecd196540d87cb7db190d405056984740d2ce/clang/include/clang/Basic/Diagnostic.h#L985-L986
> [10] https://reviews.llvm.org/D63607