[cfe-dev] [llvm-dev] [RFC] Rearchitect Gnu toolchain driver to simplify multilib support
Reid Kleckner via cfe-dev
cfe-dev at lists.llvm.org
Wed Oct 3 15:02:12 PDT 2018
I think this is more of a cfe-dev discussion, so sending my reply there.
I agree, the current situation is a mess. We've basically attempted to
codify gnu multilib rules in giant piles of C++ code again and again, but
we can never keep up with the changes that GCC and then distros make. The
only way to win is to quit the game and pass the buck to the vendor or
user. That way, whenever someone complains about clang's inability to find
a header or lib, we can say with a sigh, "sorry we couldn't find it, as a
workaround, patch the config file next to clang," and not, "sorry we missed
it, hack in some more C++ workarounds and build your own compiler."
This seems like a two part project:
1. Define a config file format morally equivalent to spec files that we can
2. Write some scripts that interrogate a GCC installation to generate those
I think we would want to document explicitly that the config file format is
not intended to be forwards or backwards compatible. Its purpose is to
allow vendors to customize header and library search logic without hacking
clang's C++ logic. The idea is that clang will attempt to make one for you,
get it right 90% of the time, and let you pick up the pieces when it fails.
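To make the second part a bit more concrete, here is a rough Python sketch of
the first step such a script might take: splitting "gcc -dumpspecs" output into
named specs and keeping the multilib-related ones. It assumes the usual layout
where each spec starts with a "*name:" line followed by its body. The function
names and the SAMPLE text are made up for illustration and are not taken from
any real installation.

```python
import re

# Assumption: each spec starts with a line of the form "*name:" and its body
# runs until the next such line. Older or patched GCCs may differ.
SPEC_HEADER = re.compile(r"^\*([A-Za-z0-9_]+):\s*$")


def parse_dumpspecs(text):
    """Split `gcc -dumpspecs` style text into a {spec_name: body} dict."""
    specs = {}
    name = None
    body = []
    for line in text.splitlines():
        match = SPEC_HEADER.match(line)
        if match:
            if name is not None:
                specs[name] = "\n".join(body).strip()
            name = match.group(1)
            body = []
        elif name is not None:
            body.append(line)
    if name is not None:
        specs[name] = "\n".join(body).strip()
    return specs


def multilib_specs(specs):
    """Keep only the specs whose name starts with 'multilib'."""
    return {k: v for k, v in specs.items() if k.startswith("multilib")}


# Hypothetical, hand-written sample; not copied from a real GCC.
SAMPLE = """*cpp:
%{posix:-D_POSIX_SOURCE}

*multilib:
. !m32;32:../lib32 m32;

*multilib_defaults:
m64

*multilib_options:
m64/m32

"""

if __name__ == "__main__":
    for spec_name, spec_body in multilib_specs(parse_dumpspecs(SAMPLE)).items():
        print(spec_name, "=>", spec_body)
```

Against a real installation, the input text could come from something like
subprocess.run(["gcc", "-dumpspecs"], capture_output=True, text=True).stdout,
and the resulting dict could then be written out as the generated config file.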
On Wed, Oct 3, 2018 at 10:14 AM Frank Schaefer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi all,
> I've been poking around with llvm+clang+compiler-rt, trying to get it
> working on Linux ARM soft-float (yes, ARM soft-float support is pretty
> broken). Along the way I tried writing a multilib toolchain driver
> for ARM soft/hard float, with only partial success. For reference see
> One thing I noticed while doing this (and a few other people seem to
> agree on) is that the entire Gnu toolchain driver set could be greatly
> simplified. So far, it seems like every time someone has encountered
> a new multilib case (either a new arch or a new distro arrangement),
> the response has been to pile on another custom multilib driver, or
> add a bunch of corner-case codepaths to an existing driver. That's
> been done so many times that the existing driver set is honestly
> starting to collapse under its own weight. :-(
> I'm now contemplating what it would take to reduce the entire driver
> set to something that simply figures out all the multilib/multiarch
> distinctions by querying the existing gcc installation. This could
> theoretically cover all Gnu multilib cases in a single codepath.
> Some background:
> Current GNU toolchains (gcc+glibc+binutils) tend to encapsulate all
> multilib knowledge in gcc, including:
> * What flags trigger a specific multilib selection
> * What directories are associated with a particular multilib selection
> (what we know as osSuffix()/gccSuffix())
> * What run-time linker (/lib/ld-<arch>.so.<ver>) to use for a
> particular multilib selection
> This is highly customizable at gcc build time via a bunch of
> arch+OS+ABI configuration fragments in the "gcc/config" directory of
> the gcc source tree, and a lot of Linux distros have taken their own
> liberties with this configuration. That's part of why clang's Gnu
> toolchain driver is in the state it's in.
> The rough outline of what I would propose:
> 1. clang's CMakeLists can scan the spec tokens for a selected gcc
> installation (available via "gcc -dumpspecs") and pick out the
> important tokens (so far I know this includes "*multilib",
> "*multilib_matches", "*multilib_defaults", "*multilib_options", and
> 2. clang's Gnu driver can be re-coded to parse the relevant spec tokens.
> 3. clang's Gnu driver can build up a complete unified MultilibSet
> based on these tokens.
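As a rough illustration of steps 2 and 3, the Python sketch below parses a
"*multilib" spec body into entries and picks one for a set of flags. It assumes
the body is a list of ";"-separated entries of the form "dir[:osdir] flag ...",
where a leading "!" means the flag must be absent. That matches my reading of
current GCC output, but as noted further down, older releases may not agree.
MultilibEntry, parse_multilib, and select are hypothetical names and have
nothing to do with clang's actual MultilibSet API.

```python
from typing import List, NamedTuple, Optional, Set


class MultilibEntry(NamedTuple):
    gcc_dir: str          # directory suffix inside the GCC install
    os_dir: str           # OS library directory suffix, "" if not given
    required: List[str]   # flags that must be present
    forbidden: List[str]  # flags that must be absent ("!flag" in the spec)


def parse_multilib(body: str) -> List[MultilibEntry]:
    """Parse an assumed '*multilib' body like '. !m32;32:../lib32 m32;'."""
    entries = []
    for chunk in body.split(";"):
        fields = chunk.split()
        if not fields:
            continue  # skip the empty piece after the trailing ';'
        gcc_dir, _, os_dir = fields[0].partition(":")
        required = [f for f in fields[1:] if not f.startswith("!")]
        forbidden = [f[1:] for f in fields[1:] if f.startswith("!")]
        entries.append(MultilibEntry(gcc_dir, os_dir, required, forbidden))
    return entries


def select(entries: List[MultilibEntry], flags: Set[str]) -> Optional[MultilibEntry]:
    """Return the first entry whose flag constraints are satisfied."""
    for entry in entries:
        has_required = all(f in flags for f in entry.required)
        has_forbidden = any(f in flags for f in entry.forbidden)
        if has_required and not has_forbidden:
            return entry
    return None


if __name__ == "__main__":
    # Hypothetical spec body, written by hand for this example.
    body = ". !m32 !mx32;32:../lib32 m32 !mx32;x32:../libx32 !m32 mx32;"
    entries = parse_multilib(body)
    print(select(entries, {"m32"}))
    print(select(entries, set()))
```

This ignores "*multilib_matches" and "*multilib_defaults", which a real
implementation would presumably need to apply to the flag set before matching.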
> Some potential complications I anticipate:
> 1. I don't know how consistently gcc has used these spec tokens, or
> how the formatting has evolved over time. Mimicking the current (gcc
> 8.2.0) format seems sensible, but what we pull from older gcc
> installations may not comport with what we expect.
> 2. I don't see anything in the spec tokens that describes system
> header arrangement. Vanilla multilib-enabled gcc seems to honor
> /usr/include/<os-suffix> (where <os-suffix> seems to conform to the
> output of "gcc <flags> -print-multiarch"). Note that this doesn't
> necessarily match the osSuffix; I've produced functional GNU
> toolchains that honor a standard-triple osSuffix, but don't honor
> _anything_ like it under /usr/include.
> 3. g++, OTOH, expects all C++ headers to be under
> /usr/include/c++/<version>. Vanilla g++ keeps some headers further
> subbed under <os-suffix>, with some of those further subbed again
> under <gcc-suffix> for non-default multilib cases. Just to complicate
> things, Debian/Ubuntu g++ has apparently been adapted to employ the
> /usr/include/<os-suffix> for multilib-specific C++ headers. If other
> distros do their own thing with this, then I see no straightforward
> way to autodetect anything but a few obvious cases.
> To address the above complications, I would suggest adding CMake
> options for users to supply their own multilib descriptor tokens, in
> case whatever's in gcc specs doesn't work for them. We might even
> allow for an extra token or two to better describe C/C++ header
> This would all require a LOT of planning and testing, especially
> across the multiple targets/distros the Gnu toolchain driver currently
> supports. I'm not sure how to access suitable testbeds for a lot of
> it (I count myself lucky just to have a reasonably-powerful ARM
> build-box). At least initially, I think we would have to keep the old
> hodgepodge driver code around alongside the new unified driver code.
> "If a server dies in a server farm and no one pings it, does it still
> cost four figures to fix?"