[llvm-dev] [RFC] A vision for building the runtimes
Louis Dionne via llvm-dev
llvm-dev at lists.llvm.org
Thu Oct 22 15:31:15 PDT 2020
The topic of how to build the runtimes has been brought up several times in the past, sometimes by me, sometimes by others. All in all, building the runtimes is fairly complex: there are several ways of doing it, and they don't always work. This complexity leads to several problems, the first one being that I, as a maintainer, can't currently guarantee a good experience for people trying to build the library. What follows is a concrete proposal to make things better within a reasonable time frame.
The current state of things
The runtimes (libc++, libc++abi and libunwind) currently support the following ways of being built:
- The Monorepo build
This is the "easy" and most common way to build the runtimes. It builds the runtimes as subprojects of LLVM (with LLVM_ENABLE_PROJECTS), with the same compiler that's used to build LLVM.
However, this is mostly wrong, since it encourages users to build libc++ using the system compiler, not the just-built compiler. Since a recent compiler is necessary for correctness (e.g. the RTTI for fundamental types generated in libc++abi), this is a real issue.
This also requires the whole LLVM CMake machinery to work on the platform where one wants to build the runtimes. This doesn't really work on most embedded platforms, so they just can't use the monorepo build.
This also suffers from issues like the fact that LLVM sets several global variables/flags that subprojects inherit. While it may make sense for some subprojects to use these defaults (e.g. the default C++ Standard), it is actively harmful for the runtimes, which need to have tight control over these things.
- The Standalone build
Each runtime project also supports a Standalone build. This is a build where the root CMakeLists.txt is the one from the project itself. This is nice because it's lightweight, and it doesn't require all the LLVM CMake setup to work, which solves problems for embedded platforms.
Before the monorepo era, this type of build also made sense because you could build one runtime without checking out the other ones. That is no longer true: the runtimes all share code that requires them to be co-located in the monorepo, even if you're just building one of them.
This type of build has the significant downside that we need to tie together the various runtime projects using CMake variables. For example, we have to tell libc++abi where to find the libc++ headers, and we have to tell libc++ where to find the library for libc++abi. This leads to a plethora of CMake options that are brittle and add a lot of complexity (LIBCXX_CXX_ABI_INTREE, LIBCXXABI_LIBCXX_INCLUDES, etc.).
- The llvm/runtimes build
I like to call this the Toolchain build instead, because that's really what it does. It builds the runtimes using the just-built toolchain, and with the goal of including those runtimes in the toolchain. It's more of a driver for the individual builds than a build configuration itself. It's currently built on top of the Standalone builds -- it builds the toolchain and then builds the various runtimes individually, stringing them together as required.
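To make the three modes above concrete, the following sketch prints roughly what each configure step could look like. It is an illustration under assumptions, not part of the original message: the checkout location `MONOREPO_ROOT`, the build directories, and the `run` dry-run helper are hypothetical, and the `LIBCXX_CXX_ABI*` and `LLVM_ENABLE_RUNTIMES` option names are believed to match the tree at the time of writing and should be checked against your checkout.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: paths and the dry-run helper are hypothetical.
MONOREPO_ROOT="${MONOREPO_ROOT:-$HOME/llvm-project}"
BUILD_ROOT="${BUILD_ROOT:-$MONOREPO_ROOT/build}"
RUNTIMES="libcxx;libcxxabi;libunwind"

# Print a command; execute it only when DRY_RUN=0 (the default is to print).
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Monorepo build: the runtimes are LLVM subprojects, configured with the
#    same compiler that builds LLVM.
run cmake -S "$MONOREPO_ROOT/llvm" -B "$BUILD_ROOT/monorepo" \
    -DLLVM_ENABLE_PROJECTS="$RUNTIMES"

# 2. Standalone builds: one configure per runtime, wired together by hand.
#    libc++abi is told where the libc++ headers are ...
run cmake -S "$MONOREPO_ROOT/libcxxabi" -B "$BUILD_ROOT/libcxxabi" \
    -DLIBCXXABI_LIBCXX_INCLUDES="$MONOREPO_ROOT/libcxx/include"
#    ... and libc++ is told which ABI library to use and where to find it.
run cmake -S "$MONOREPO_ROOT/libcxx" -B "$BUILD_ROOT/libcxx" \
    -DLIBCXX_CXX_ABI=libcxxabi \
    -DLIBCXX_CXX_ABI_INCLUDE_PATHS="$MONOREPO_ROOT/libcxxabi/include" \
    -DLIBCXX_CXX_ABI_LIBRARY_PATH="$BUILD_ROOT/libcxxabi/lib"

# 3. llvm/runtimes (Toolchain) build: clang is built first, then the
#    runtimes are built with that just-built clang.
run cmake -S "$MONOREPO_ROOT/llvm" -B "$BUILD_ROOT/toolchain" \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_ENABLE_RUNTIMES="$RUNTIMES"
```

By default the script only prints the commands; running it with `DRY_RUN=0` executes them. The second mode is the one that needs per-project path options, which is the brittleness described above.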
My goal with this proposal is to achieve:
1. Decoupling from the top-level LLVM CMake setup (which doesn't work, see above)
2. A simple build that works everywhere, including embedded platforms
3. No more need to manually tie together the various runtimes (as in the Standalone builds)
My proposal is basically to have a "Unified Standalone" build for all the runtimes. It would look similar to a Monorepo build in essence (i.e. you'd have a single CMake invocation where you would specify the flags for all runtime projects at once), but it wouldn't be using the top-level LLVM CMake setup. Specifically:
1. Add a `runtimes/CMakeLists.txt` file that includes the runtimes subprojects that are requested through -DLLVM_ENABLE_PROJECTS (open to bikeshed), and sets up minimal stuff like the `llvm-lit` wrapper and Python, but none of the harmful stuff that's done by the top-level LLVM CMake.
2. Deprecate the old individual Standalone builds in favor of this new "Unified Standalone" build.
3. Users migrate to the new Unified Standalone build. Users include the current "Runtimes" build, some places in compiler-rt, and various organizations.
4. Remove support for the old individual Standalone builds.
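As an illustration of step 1, the snippet below writes a hypothetical, much-reduced `runtimes/CMakeLists.txt` into a scratch directory. It is a sketch under assumptions, not the file described in this proposal: the `KNOWN_RUNTIMES` variable, the error message, the use of `find_package(Python3 ...)`, and the scratch location are choices made for the example, and the `llvm-lit` wrapper setup is left out.

```shell
#!/usr/bin/env bash
# Write a hypothetical, minimal runtimes/CMakeLists.txt into a scratch
# directory so it can be inspected without touching a real checkout.
SKETCH_DIR="$(mktemp -d)"
mkdir -p "$SKETCH_DIR/runtimes"

# The quoted 'EOF' keeps the shell from expanding the CMake ${...} variables.
cat > "$SKETCH_DIR/runtimes/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.13.4)
project(Runtimes C CXX)

# Runtimes this sketch knows how to include (an assumption of the example).
set(KNOWN_RUNTIMES libcxx libcxxabi libunwind)

set(LLVM_ENABLE_PROJECTS "" CACHE STRING
    "Semicolon-separated list of runtimes to build")

# Minimal shared setup; the llvm-lit wrapper is omitted from this sketch.
find_package(Python3 REQUIRED COMPONENTS Interpreter)

foreach(runtime IN LISTS LLVM_ENABLE_PROJECTS)
  if(NOT runtime IN_LIST KNOWN_RUNTIMES)
    message(FATAL_ERROR "Unknown runtime requested: ${runtime}")
  endif()
  # Sibling directories are outside this source tree, so add_subdirectory
  # needs an explicit binary directory as its second argument.
  add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/../${runtime}" "${runtime}")
endforeach()
EOF

echo "wrote $SKETCH_DIR/runtimes/CMakeLists.txt"
```

Since every requested runtime is added from one top-level file, all of them are configured in a single CMake invocation, which is what should make the per-project path options unnecessary in principle.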
As a second step, we should also:
5. Deprecate the current Monorepo build in favor of either the Unified Standalone build (for those who just wish to build the runtimes), or the current Runtimes (aka Toolchain) build, for those who wish to build a toolchain.
6. Let users migrate to either of those builds.
7. Remove support for the Monorepo build (e.g. make it an error to try and build one of the runtimes through the current Monorepo build).
At the end of this transition, we'd hence have a single way to build all the runtimes, and a "driver" to build them with the just-built toolchain.
Moving towards a single CMake invocation for the Standalone builds is the key element of this proposal that will make everything simpler, and remove the need to set up a bunch of things manually. It will also make it easier to start sharing more code across the various runtimes.
I have already written the CMakeLists.txt for the "Unified Standalone" build, and I've been using it to build libc++ and libc++abi at Apple. It is incredibly simple, and it works well so far.
I'm looking forward to your comments,
If you're wondering what that would look like:
$ mkdir <monorepo-root>/build
$ cd <monorepo-root>/build
$ cmake ../runtimes -DLLVM_ENABLE_PROJECTS="libcxx;libcxxabi;libunwind" \
        -C <path-to-your-cache-if-desired>
$ ninja install-cxx install-cxxabi
If you're wondering, I'm not proposing to remove the ability to build libc++ against other ABI libraries, or any such thing. The Unified Standalone build would retain the same amount of flexibility as today.