[llvm] [docs] Rewrite HowToCrossCompileLLVM (PR #129451)
Alex Bradbury via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 11 08:07:19 PDT 2025
================
@@ -1,215 +1,236 @@
===================================================================
-How To Cross-Compile Clang/LLVM using Clang/LLVM
+How to cross-compile Clang/LLVM using Clang/LLVM
===================================================================
Introduction
-============
+------------
This document contains information about building LLVM and
-Clang on host machine, targeting another platform.
+Clang on a host machine, targeting another platform.
For more information on how to use Clang as a cross-compiler,
please check https://clang.llvm.org/docs/CrossCompilation.html.
-TODO: Add MIPS and other platforms to this document.
+This document describes cross-building a compiler in a single stage, using an
+existing ``clang`` install as the host compiler.
-Cross-Compiling from x86_64 to ARM
-==================================
+.. note::
+ These instructions have been tested for targeting 32-bit ARM, AArch64, or
+ 64-bit RISC-V from an x86_64 host. But should be equally applicable to any
+ other target.
-In this use case, we'll be using CMake and Ninja, on a Debian-based Linux
-system, cross-compiling from an x86_64 host (most Intel and AMD chips
-nowadays) to a hard-float ARM target (most ARM targets nowadays).
-
-The packages you'll need are:
-
- * ``cmake``
- * ``ninja-build`` (from backports in Ubuntu)
- * ``gcc-4.7-arm-linux-gnueabihf``
- * ``gcc-4.7-multilib-arm-linux-gnueabihf``
- * ``binutils-arm-linux-gnueabihf``
- * ``libgcc1-armhf-cross``
- * ``libsfgcc1-armhf-cross``
- * ``libstdc++6-armhf-cross``
- * ``libstdc++6-4.7-dev-armhf-cross``
-
-Configuring CMake
------------------
-
-For more information on how to configure CMake for LLVM/Clang,
-see :doc:`CMake`.
-
-The CMake options you need to add are:
-
- * ``-DCMAKE_SYSTEM_NAME=<target-system>``
- * ``-DCMAKE_INSTALL_PREFIX=<install-dir>``
- * ``-DLLVM_HOST_TRIPLE=arm-linux-gnueabihf``
- * ``-DLLVM_TARGETS_TO_BUILD=ARM``
-
-Note: ``CMAKE_CROSSCOMPILING`` is always set automatically when ``CMAKE_SYSTEM_NAME`` is set. Don't put ``-DCMAKE_CROSSCOMPILING=TRUE`` in your options.
-
-Also note that ``LLVM_HOST_TRIPLE`` specifies the triple of the system
-that the cross built LLVM is going to run on - the flag is named based
-on the autoconf build/host/target nomenclature. (This flag implicitly sets
-other defaults, such as ``LLVM_DEFAULT_TARGET_TRIPLE``.)
+Setting up a sysroot
+--------------------
-If you're compiling with GCC, you can use architecture options for your target,
-and the compiler driver will detect everything that it needs:
+You will need a sysroot that contains essential build dependencies compiled
+for the target architecture. In this case, we will be using CMake and Ninja on
+a Linux host and compiling against a Debian sysroot. Detailed instructions on
+producing sysroots are outside of the scope of this documentation, but the
+following instructions should work on any Linux distribution with these
+pre-requisites:
- * ``-DCMAKE_CXX_FLAGS='-march=armv7-a -mcpu=cortex-a9 -mfloat-abi=hard'``
+ * ``binfmt_misc`` configured to execute ``qemu-user`` for binaries of the
+ target architecture. This is done by installing the ``qemu-user-static``
+ and ``binfmt-support`` packages on Debian-derived distributions.
+ * Root access (setups involving ``proot`` or other tools to avoid this
+ requirement may be possible, but aren't described here).
+ * The ``debootstrap`` tool. This is available in most distributions.
-However, if you're using Clang, the driver might not be up-to-date with your
-specific Linux distribution, version or GCC layout, so you'll need to fudge.
+The following snippet will initialise sysroots for 32-bit Arm, AArch64, and
+64-bit RISC-V (just pick the target(s) you are interested in):
-In addition to the ones above, you'll also need:
+ .. code-block:: bash
- * ``--target=arm-linux-gnueabihf`` or whatever is the triple of your cross GCC.
- * ``'--sysroot=/usr/arm-linux-gnueabihf'``, ``'--sysroot=/opt/gcc/arm-linux-gnueabihf'``
- or whatever is the location of your GCC's sysroot (where /lib, /bin etc are).
- * Appropriate use of ``-I`` and ``-L``, depending on how the cross GCC is installed,
- and where are the libraries and headers.
+ sudo debootstrap --arch=armhf --variant=minbase --include=build-essential,symlinks stable sysroot-deb-armhf-stable
+ sudo debootstrap --arch=arm64 --variant=minbase --include=build-essential,symlinks stable sysroot-deb-arm64-stable
+ sudo debootstrap --arch=riscv64 --variant=minbase --include=build-essential,symlinks unstable sysroot-deb-riscv64-unstable
-You may also want to set the ``LLVM_NATIVE_TOOL_DIR`` option - pointing
-at a directory with prebuilt LLVM tools (``llvm-tblgen``, ``clang-tblgen``
-etc) for the build host, allowing you to them reuse them if available.
-E.g. ``-DLLVM_NATIVE_TOOL_DIR=<path-to-native-llvm-build>/bin``.
-If the option isn't set (or the directory doesn't contain all needed tools),
-the LLVM cross build will automatically launch a nested build to build the
-tools that are required.
+The created sysroot may contain absolute symlinks, which will resolve to a
+location within the host when accessed during compilation, so we must convert
+any absolute symlinks to relative ones:
-The CXX flags define the target, cpu (which in this case
-defaults to ``fpu=VFP3`` with NEON), and forcing the hard-float ABI. If you're
-using Clang as a cross-compiler, you will *also* have to set ``--sysroot``
-to make sure it picks the correct linker.
+ .. code-block:: bash
-When using Clang, it's important that you choose the triple to be *identical*
-to the GCC triple and the sysroot. This will make it easier for Clang to
-find the correct tools and include headers. But that won't mean all headers and
-libraries will be found. You'll still need to use ``-I`` and ``-L`` to locate
-those extra ones, depending on your distribution.
+ sudo chroot sysroot-of-your-choice symlinks -cr .
-Most of the time, what you want is to have a native compiler to the
-platform itself, but not others. So there's rarely a point in compiling
-all back-ends. For that reason, you should also set the
-``TARGETS_TO_BUILD`` to only build the back-end you're targeting to.
-You must set the ``CMAKE_INSTALL_PREFIX``, otherwise a ``ninja install``
-will copy ARM binaries to your root filesystem, which is not what you
-want.
+Configuring CMake and building
+------------------------------
-Hacks
------
+For more information on how to configure CMake for LLVM/Clang,
+see :doc:`CMake`. Following CMake's recommended practice, we will create a
+`toolchain file
+<https://cmake.org/cmake/help/book/mastering-cmake/chapter/Cross%20Compiling%20With%20CMake.html#toolchain-files>`_.
-There are some bugs in current LLVM, which require some fiddling before
-running CMake:
+The following assumes you have a system install of ``clang`` and ``lld`` that
+will be used for cross compiling and that the listed commands are executed
+from within the root of a checkout of the ``llvm-project`` git repository.
-#. If you're using Clang as the cross-compiler, there is a problem in
- the LLVM ARM back-end that is producing absolute relocations on
- position-independent code (``R_ARM_THM_MOVW_ABS_NC``), so for now, you
- should disable PIC:
+First, set variables in your shell session that will be used throughout the
+build instructions:
.. code-block:: bash
- -DLLVM_ENABLE_PIC=False
+ SYSROOT=$HOME/sysroot-deb-arm64-stable
+ TARGET=aarch64-linux-gnu
+ CFLAGS=""
- This is not a problem, since Clang/LLVM libraries are statically
- linked anyway, it shouldn't affect much.
+To customise details of the compilation target or choose a different
+architecture altogether, change the ``SYSROOT``,
+``TARGET``, and ``CFLAGS`` variables to something matching your target. For
+example, for 64-bit RISC-V you might set
+``SYSROOT=$HOME/sysroot-deb-riscv64-unstable``, ``TARGET=riscv64-linux-gnu``
+and ``CFLAGS="-march=rva20u64"``. Refer to documentation such as your target's
+compiler documentation or processor manual for guidance on which ``CFLAGS``
+settings may be appropriate. The specified ``TARGET`` should match the triple
+used within the sysroot (i.e. ``$SYSROOT/usr/lib/$TARGET`` should exist).
-#. The ARM libraries won't be installed in your system.
- But the CMake prepare step, which checks for
- dependencies, will check the *host* libraries, not the *target*
- ones. Below there's a list of some dependencies, but your project could
- have more, or this document could be outdated. You'll see the errors
- while linking as an indication of that.
+Then execute the following snippet to create a toolchain file:
- Debian based distros have a way to add ``multiarch``, which adds
- a new architecture and allows you to install packages for those
- systems. See https://wiki.debian.org/Multiarch/HOWTO for more info.
+ .. code-block:: bash
- But not all distros will have that, and possibly not an easy way to
- install them in any anyway, so you'll have to build/download
- them separately.
+ cat - <<EOF > $TARGET-clang.cmake
+ set(CMAKE_SYSTEM_NAME Linux)
+ set(CMAKE_SYSROOT "$SYSROOT")
+ set(CMAKE_C_COMPILER_TARGET $TARGET)
+ set(CMAKE_CXX_COMPILER_TARGET $TARGET)
+ set(CMAKE_C_FLAGS_INIT "$CFLAGS")
+ set(CMAKE_CXX_FLAGS_INIT "$CFLAGS")
+ set(CMAKE_LINKER_TYPE LLD)
+ set(CMAKE_C_COMPILER clang)
+ set(CMAKE_CXX_COMPILER clang++)
+ set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
+ set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
+ set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
+ set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)
+ EOF
+
+
+Then configure and build by invoking ``cmake``:
- A quick way of getting the libraries is to download them from
- a distribution repository, like Debian (http://packages.debian.org/jessie/),
- and download the missing libraries. Note that the ``libXXX``
- will have the shared objects (``.so``) and the ``libXXX-dev`` will
- give you the headers and the static (``.a``) library. Just in
- case, download both.
+ .. code-block:: bash
- The ones you need for ARM are: ``libtinfo``, ``zlib1g``,
- ``libxml2`` and ``liblzma``. In the Debian repository you'll
- find downloads for all architectures.
+ cmake -G Ninja \
+ -DCMAKE_BUILD_TYPE=Release \
+ -DLLVM_ENABLE_PROJECTS="lld;clang" \
+ -DCMAKE_TOOLCHAIN_FILE=$(pwd)/$TARGET-clang.cmake \
+ -DLLVM_HOST_TRIPLE=$TARGET \
+ -DCMAKE_INSTALL_PREFIX=$HOME/clang-$TARGET \
+ -S llvm \
+ -B build/$TARGET
+ cmake --build build/$TARGET
+
+These options from the toolchain file and ``cmake`` invocation above are
+important:
+
+ * ``CMAKE_SYSTEM_NAME``: Perhaps surprisingly, explicitly setting this
+ variable `causes CMake to set
+ CMAKE_CROSSCOMPIILING <https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING.html#variable:CMAKE_CROSSCOMPILING>`_.
+ * ``CMAKE_{C,CXX}_COMPILER_TARGET``: This will be used to set the
+ ``--target`` argument to ``clang``. The triple should match the triple used
+ within the sysroot (i.e. ``$SYSROOT/usr/lib/$TARGET`` should exist).
+ * ``CMAKE_FIND_ROOT_PATH_MODE_*``: These `control the search behaviour for
+ finding libraries, includes or binaries
+ <https://cmake.org/cmake/help/book/mastering-cmake/chapter/Cross%20Compiling%20With%20CMake.html#finding-external-libraries-programs-and-other-files>`_.
+ Setting these prevents files for the host being used in the build.
+ * ``LLVM_HOST_TRIPLE``: Specifies the target triple of the system the built
+ LLVM will run on, which also implicitly sets other defaults such as
+ ``LLVM_DEFAULT_TARGET_TRIPLE``. For example, if you are using an x86_64
+ host to compile for RISC-V, this will be a RISC-V triple.
+ * ``CMAKE_SYSROOT``: The path to the sysroot containing libraries and headers
+ for the target.
+ * ``CMAKE_INSTALL_PREFIX``: Setting this avoids installing binaries compiled
+ for the target system into system directories for the host system. It is
+ not required unless you are going to use the ``install`` target.
+
+See `LLVM's build documentation
+<https://llvm.org/docs/CMake.html#frequently-used-cmake-variables>`_ for more
+guidance on CMake variables (e.g. ``LLVM_TARGETS_TO_BUILD`` may be useful if
+your cross-compiled binaries only need to support compiling for one target).
+
+Working around a ninja dependency issue
+---------------------------------------
+
+If you followed the instructions above to create a sysroot, you may run into a
+`longstanding problem related to path canonicalization in ninja
+<https://github.com/ninja-build/ninja/issues/1330>_`. GCC canonicalizes system
+headers in dependency files, so when ninja reads them it does not need to do
+so. Clang does not do this, and unfortunately ninja does not implement the
+canonicalization logic at all, meaning for some system headers with symlinks
+in the paths, it can incorrectly compute a non-existing path and consider it
+as always modified.
+
+If you are suffering from this issue, you will find any attempt at an
+incremental build (including the suggested command to build the ``install``
+target in the next section) results in recompiling everything. ``ninja -C
+build/$TARGET -t deps`` shows files in ``$SYSROOT/include/*`` that
+do not exist (as the ``$SYSROOT/include`` folder does not exist) and you can
+further confirm these files are causing ``ninja`` to determine a rebuild is
+necessary with ``ninja -C build/$TARGET -d deps``.
+
+A workaround is to create a symlink so that the incorrect
+``$SYSROOT/include/*`` dependencies resolve to files within
+``$SYSROOT/usr/include/*``. This works in practice for the simple
+cross-compilation use case described here, but is not a general solution.
- After you download and unpack all ``.deb`` packages, copy all
- ``.so`` and ``.a`` to a directory, make the appropriate
- symbolic links (if necessary), and add the relevant ``-L``
- and ``-I`` paths to ``-DCMAKE_CXX_FLAGS`` above.
+ .. code-block:: bash
+ sudo ln -s usr/include $SYSROOT/include
-Running CMake and Building
---------------------------
+Testing the just-built compiler
+-------------------------------
-Finally, if you're using your platform compiler, run:
+Confirm the ``clang`` binary was built for the expected target architecture:
.. code-block:: bash
- $ cmake -G Ninja <source-dir> -DCMAKE_BUILD_TYPE=<type> <options above>
+ $ file -L ./build/aarch64-linux-gnu/bin/clang
+ ./build/aarch64-linux-gnu/bin/clang: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=516b8b366a790fcd3563bee4aec0cdfcb90bb1c7, not stripped
-If you're using Clang as the cross-compiler, run:
+If you have ``qemu-user`` installed you can test the produced target binary
+either by invoking ``qemu-{target}-static`` directly:
.. code-block:: bash
- $ CC='clang' CXX='clang++' cmake -G Ninja <source-dir> -DCMAKE_BUILD_TYPE=<type> <options above>
-
-If you have ``clang``/``clang++`` on the path, it should just work, and special
-Ninja files will be created in the build directory. I strongly suggest
-you to run ``cmake`` on a separate build directory, *not* inside the
-source tree.
+ $ qemu-aarch64-static -L $SYSROOT ./build/aarch64-linux-gnu/bin/clang --version
+ clang version 21.0.0git (https://github.com/llvm/llvm-project cedfdc6e889c5c614a953ed1f44bcb45a405f8da)
+ Target: aarch64-unknown-linux-gnu
+ Thread model: posix
+ InstalledDir: /home/asb/llvm-project/build/aarch64-linux-gnu/bin
-To build, simply type:
+Or, if binfmt_misc is configured (as was necessary for debootstrap):
.. code-block:: bash
- $ ninja
+ $ export QEMU_LD_PREFIX=$SYSROOT; ./build/aarch64-linux-gnu/bin/clang --version
+ clang version 21.0.0git (https://github.com/llvm/llvm-project cedfdc6e889c5c614a953ed1f44bcb45a405f8da)
+ Target: aarch64-unknown-linux-gnu
+ Thread model: posix
+ InstalledDir: /home/asb/llvm-project/build/aarch64-linux-gnu/bin
-It should automatically find out how many cores you have, what are
-the rules that needs building and will build the whole thing.
-
-You can't run ``ninja check-all`` on this tree because the created
-binaries are targeted to ARM, not x86_64.
-
-Installing and Using
+Installing and using
--------------------
-After the LLVM/Clang has built successfully, you should install it
-via:
-
- .. code-block:: bash
-
- $ ninja install
+.. note::
+ Use of the ``install`` target requires that you have set
+ ``CMAKE_INSTALL_PREFIX`` otherwise it will attempt to install in
+ directories under `/` on your host.
-which will create a sysroot on the install-dir. You can then tar
-that directory into a binary with the full triple name (for easy
-identification), like:
+If you want to transfer a copy of the built compiler to another machine, you
+can first install it to a location on the host via:
.. code-block:: bash
- $ ln -sf <install-dir> arm-linux-gnueabihf-clang
- $ tar zchf arm-linux-gnueabihf-clang.tar.gz arm-linux-gnueabihf-clang
+ cmake --build build/$TARGET --target=install
-If you copy that tarball to your target board, you'll be able to use
-it for running the test-suite, for example. Follow the guidelines at
-https://llvm.org/docs/lnt/quickstart.html, unpack the tarball in the
-test directory, and use options:
+This will install the LLVM/Clang headers, binaries, libraries, and other files
+to paths within ``CMAKE_INSTALL_PREFIX``. Then tar that directory for transfer
+to a device that runs the target architecture natively:
.. code-block:: bash
- $ ./sandbox/bin/python sandbox/bin/lnt runtest nt \
- --sandbox sandbox \
- --test-suite `pwd`/test-suite \
- --cc `pwd`/arm-linux-gnueabihf-clang/bin/clang \
- --cxx `pwd`/arm-linux-gnueabihf-clang/bin/clang++
+ tar -czvf clang-$TARGET.tar.gz -C $HOME clang-$TARGET
-Remember to add the ``-jN`` options to ``lnt`` to the number of CPUs
-on your board. Also, the path to your clang has to be absolute, so
-you'll need the `pwd` trick above.
+The generated toolchain is portable, but requires compatible versions of any
+shared libraries it links against. This means using a sysroot that is as
+similar to your target operating system as possible is desirable.
----------------
asb wrote:
Thanks for clarifying - I've added a mention of that and pointed to the CMake variable docs. Further suggestions very welcome - I'm wary of giving very specific advice about tradeoffs as it will depend a lot on the specifics.
https://github.com/llvm/llvm-project/pull/129451
More information about the llvm-commits
mailing list