[libcxx-dev] RFC: A top level monorepo CMake file

Petr Hosek via libcxx-dev libcxx-dev at lists.llvm.org
Thu Jun 18 14:26:33 PDT 2020


+1 for this change.

On Thu, Jun 18, 2020 at 10:57 AM Louis Dionne <ldionne at apple.com> wrote:

> Hi folks,
>
> Building any LLVM project currently requires invoking CMake inside
> <monorepo-root>/llvm, while setting the projects to enable in the
> LLVM_ENABLE_PROJECTS variable. This has the downside that CMake processing
> for the LLVM subproject happens even when one doesn't really need or want
> it. It's also not great from a build hygiene perspective, as LLVM globally
> sets some flags and subprojects pick them up, when they don't really mean
> to. For example, see this workaround:
> https://github.com/llvm/llvm-project/blob/master/libcxx/CMakeLists.txt#L503-L507,
> where we need to account for some flags that might have been set globally
> by LLVM.
>
> I'm not sure about other projects, however this is quite problematic for
> projects part of the C++ runtime (libc++/libc++abi/libunwind). Indeed, we
> often try to build those projects targeting not widely supported
> platforms, where the overall LLVM build doesn't work. For example, trying
> to use the LLVM_ENABLE_PROJECTS approach for building libc++ for Apple's
> DriverKit environment doesn't work, since it has a few unusual things that
> the LLVM build chokes on. However, building libc++ standalone works just
> fine because it has far fewer requirements. It's also not just an issue of
> working vs not working: because of global flag pollution, building libc++
> standalone and as part of the rest of LLVM can result in slightly different
> flags being used, which could cause important and hard-to-diagnose issues.
>
> Hence, I think we should introduce a way to build LLVM projects (or at
> least the runtimes) without going through
> <monorepo-root>/llvm/CMakeLists.txt. What I suggest is to have a top-level
> <monorepo-root>/CMakeLists.txt whose sole job is to include subprojects. We
> could also place basic LLVM-wide things like the check for the minimum
> CMake version there. More specifically, I would like to be able to do:
>
>     $ cd <monorepo-root>
>     $ mkdir build
>     $ (cd build && cmake <monorepo-root> -DLLVM_ENABLE_PROJECTS="<projects-to-enable>")
>
> Pretty much the only difference with today is that you'd use `cmake
> <monorepo-root>` instead of `cmake <monorepo-root>/llvm`.
>
> Like I said, this is a problem for the runtime projects, but I'm not sure
> about other projects. For the runtime projects, another option would be to
> only allow standalone builds. However, the runtime projects are often built
> in lockstep, and so running three CMake commands when one would suffice is
> both annoying and also an easy way to screw things up. Furthermore, the
> current standalone builds add complexity to the projects, because they
> require the ability to point to arbitrary headers/libraries from the other
> projects, when we really always want to point to the just-built ones.
>
> Relationship with Petr Hosek's "Runtimes" build
> ---------------------------------------------------------------
> What I'm proposing isn't a replacement for it. The "Runtimes" build can
> be seen as a driver that sets up the individual libc++/libc++abi/libunwind
> builds with the just-built toolchain, and for the provided targets. That's
> really great, however it is built *on top of* the basic
> libc++/libc++abi/libunwind builds. So basically, after my proposal, the
> "Runtimes" build could simply build all elements from the runtime with a
> single CMake invocation, as opposed to multiple invocations.
>

I think there may be a misunderstanding of how the "runtimes" build works.
It already uses an equivalent of:

cmake -DCMAKE_C_COMPILER=<path to just built clang> \
  -DCMAKE_CXX_COMPILER=<path to just built clang++> <options> \
  -DLLVM_ENABLE_PROJECTS="libcxx;libcxxabi;libunwind" <llvm-project-root>

It doesn't do exactly that because LLVM's root CMakeLists.txt does too
many things, and doesn't do some of the things we need.

Instead we use a trick where llvm/runtimes/CMakeLists.txt re-invokes itself
for different targets. When invoked as the root file it drives the build
for all runtimes, resembling the CMake invocation above, but it also
exposes a "build API" to the parent build, so as the user of the "runtimes"
build, you use the parent build and it drives the child builds through this
API.

When using the runtimes build, you have to make one CMake invocation to
build tools, and then one CMake invocation per target to build runtimes (but
*not* one CMake invocation per project). I don't think there's a way to get down to a
single CMake invocation unless CMake gains support for "scoped toolchains"
(today there's only one global host toolchain), which is something that GN
has and why in GN this is possible.
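
To make that structure concrete, here is a rough sketch of a parent build
that drives one child configure/build per target. It uses CMake's stock
ExternalProject module and is not the actual code in
llvm/runtimes/CMakeLists.txt; every SKETCH_* name, the project name, and
the directory layout are made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.13)
project(RuntimesDriverSketch NONE)

include(ExternalProject)

# Assumed inputs; these names are illustrative, not real LLVM options.
set(SKETCH_TARGETS "x86_64-unknown-linux-gnu;aarch64-unknown-linux-gnu"
    CACHE STRING "Target triples to build the runtimes for")
set(SKETCH_TOOLCHAIN_BIN "${CMAKE_BINARY_DIR}/bin"
    CACHE PATH "Directory containing the just-built clang and clang++")
set(SKETCH_RUNTIMES_SRC "${CMAKE_CURRENT_SOURCE_DIR}/runtimes"
    CACHE PATH "Directory whose CMakeLists.txt drives all runtimes")

# One child CMake invocation per target, not one per project. Each child
# gets its own build tree and therefore its own toolchain settings.
foreach(target IN LISTS SKETCH_TARGETS)
  ExternalProject_Add(runtimes-${target}
    SOURCE_DIR "${SKETCH_RUNTIMES_SRC}"
    BINARY_DIR "${CMAKE_BINARY_DIR}/runtimes-${target}-build"
    CMAKE_ARGS
      -DCMAKE_C_COMPILER=${SKETCH_TOOLCHAIN_BIN}/clang
      -DCMAKE_CXX_COMPILER=${SKETCH_TOOLCHAIN_BIN}/clang++
      -DCMAKE_C_COMPILER_TARGET=${target}
      -DCMAKE_CXX_COMPILER_TARGET=${target}
    # The sketch only configures and builds; installation is skipped.
    INSTALL_COMMAND ""
  )
endforeach()
```

A real driver would also need each child to depend on the targets that
produce the toolchain, which the sketch leaves out. The separate build
tree per target is what works around the single global host toolchain.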

I don't think that having a top-level CMake file changes anything for the
"runtimes" build. We could consider merging llvm/runtimes/CMakeLists.txt
into the top-level CMake file, but I don't see any immediate gains aside
from clearer file structure.

I think a bigger win, and not just for the runtimes build, would be to have
a global CMake modules directory shared by all subprojects. That would avoid
the duplication we currently have and allow sharing cached variables between
runtimes, which should significantly reduce the number of CMake checks we
have to run. For example, today every runtime does the same set of checks to
ensure your system has libc, libm, pthreads, etc. We really should only ever
have to run those once per CMake invocation.
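
As a sketch of what such a shared module might look like (the file name,
its location, and the SKETCH_* variable names are all hypothetical, and
the probed functions are just plausible choices), something along these
lines would run each check once and let every runtime read the same
cached result:

```cmake
# Hypothetical shared module, e.g. SketchRuntimeChecks.cmake in a common
# modules directory that every runtime adds to CMAKE_MODULE_PATH.
# Requires CMake >= 3.10 (include_guard) and the C language to be enabled.

# Process this file at most once per CMake invocation, no matter how many
# subprojects include it.
include_guard(GLOBAL)

include(CheckLibraryExists)

# check_library_exists(<library> <function> <location> <variable>) stores
# its result in the CMake cache under <variable>, so using one shared
# variable name means the probe is not repeated per runtime.
check_library_exists(c fopen "" SKETCH_HAS_C_LIB)
check_library_exists(m ccos "" SKETCH_HAS_M_LIB)
check_library_exists(pthread pthread_create "" SKETCH_HAS_PTHREAD_LIB)
```

Each runtime would then call include(SketchRuntimeChecks) and test
SKETCH_HAS_PTHREAD_LIB and friends instead of running its own copy of
the check under a project-specific variable name.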

Thoughts?
> Louis
>
>