[cfe-dev] architecture endianness and preprocessor defines
clattner at apple.com
Wed May 7 10:44:58 PDT 2008
On May 7, 2008, at 8:58 AM, james woodyatt wrote:
> One of the things I've long disliked about how GCC works is that its
> developers have still not really sorted out how to handle
> architectures that can operate in either big- or little-endian mode.
> I'd like to know if the LLVM CFE developers have any thoughts on how
> to improve matters here.
You bring up a lot of interesting issues. Some meta answers :)
> Here's what GCC does today, and how that situation produces
> consequences downstream:
> + The various architecture configurations define built-in preprocessor
> definitions like __BIG_ENDIAN__ and __LITTLE_ENDIAN__.
We aim to be GCC compatible in the preprocessor macros we predefine.
This is important for compatibility with existing code.
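
To make the compatibility point concrete, here is a rough sketch of the
kind of existing code that depends on these built-ins. TARGET_BIG_ENDIAN
and runtime_is_big_endian are names made up for this illustration, and
the __BYTE_ORDER__ fallback is an assumption about what newer compilers
provide, not something discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile-time endianness from the GCC-style built-ins.
   TARGET_BIG_ENDIAN is a hypothetical name used only in this sketch. */
#if defined(__BIG_ENDIAN__)
#  define TARGET_BIG_ENDIAN 1
#elif defined(__LITTLE_ENDIAN__)
#  define TARGET_BIG_ENDIAN 0
/* Assumed fallback: __BYTE_ORDER__ as provided by newer compilers. */
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define TARGET_BIG_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define TARGET_BIG_ENDIAN 0
#else
#  error "cannot determine target endianness from preprocessor built-ins"
#endif

/* Runtime probe: inspect the first byte of a known 32-bit pattern. */
static int runtime_is_big_endian(void) {
    uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Sanity check that the macro answer matches what the hardware does. */
static void check_endianness_consistent(void) {
    assert(runtime_is_big_endian() == TARGET_BIG_ENDIAN);
}
```

Code like this stops compiling (or silently picks the wrong branch) if
the compiler does not define the macros it expects, which is why
matching GCC here matters.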
> One of the additional hassles with GCC is that its "multilib" feature
> doesn't consistently build the C runtime environment, i.e. crtstuff.c,
> for both big- and little-endian modes. This is why there are all
> those GCC target triples that look like "armeb-netbsd-elf" and
> "mipsel-wrs-vxworks" and "armle-linux-gnu" in the configure script. Notice
> that the suffixes aren't used consistently across operating system
I agree that this is irritating. Two issues: 1) we will support the
GCC target triples, at least when/if people contribute support for
them. 2) clang is explicitly designed to support building a single
tool chain in place that supports multiple targets. The ultimate
goal is that you should be able to configure clang with
"--targets='armeb-netbsd-elf mipsel-wrs-vxworks armle-linux-gnu'" and get
support in the toolchain for all of them. We already have support for
handling this (-arch option and friends). When we bring up the
"libgcc" runtime library stuff, we'll make sure it can be built for
> The suffix on the architecture name ends up getting translated into
> the endianness of the C runtime environment modules used by the linker
> (except when -nostdlib is used... sigh). If it weren't for this,
> you'd be able to build GCC for ARM or MIPS or whatever, without adding
> that suffix to the architecture part of the triple, and the
> -mbig-endian and -mlittle-endian switches would select the proper C
> runtime environment. Sadly, that doesn't happen like it should.
Just because we will support the existing GCC target triples (again,
when/if people contribute support for them) doesn't mean we can't
support simplified triples as well.
> That still leaves the C preprocessor built-ins, which are clearly in
> Clang's domain to manage. Here's what I propose: Clang should define
> a small set of general preprocessor built-ins that identify the CPU
> architecture family specified in the target triple, e.g. __ia32__,
> __x86_64__, __arm__, __powerpc__, __mips__, etc; it should also define
> __LITTLE_ENDIAN__ and __BIG_ENDIAN__ as appropriate, and it should
> offer the -mbig-endian and -mlittle-endian switches for explicitly
> specifying the endianness on architectures that can execute in either
> mode. The command driver can then do the right thing (or the wrong
> thing) as necessary.
We have to support the existing ones. Requiring people to 'port'
their code to clang from GCC is not desirable.
That said, we *can* support nicer and cleaner interfaces as well for
feature queries. Over time, we can encourage people (who don't care
about writing portable code (?)) to use these and/or try to get the
GCC folks to adopt similar features.
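
As a rough illustration of layering a cleaner query on top of the
existing macros, a project could do something like the following today.
This is only a sketch: the MYPROJ_ARCH_FAMILY macro and the myproj_*
functions are hypothetical names, not anything clang or GCC provides,
and the list of architecture macros is just the handful mentioned in
this thread (using __i386__, the spelling GCC already defines for 32-bit
x86).

```c
#include <assert.h>
#include <string.h>

/* Map the existing GCC-style architecture built-ins onto one
   project-level name. MYPROJ_ARCH_FAMILY is hypothetical. */
#if defined(__x86_64__)
#  define MYPROJ_ARCH_FAMILY "x86_64"
#elif defined(__i386__)
#  define MYPROJ_ARCH_FAMILY "ia32"
#elif defined(__arm__)
#  define MYPROJ_ARCH_FAMILY "arm"
#elif defined(__powerpc__)
#  define MYPROJ_ARCH_FAMILY "powerpc"
#elif defined(__mips__)
#  define MYPROJ_ARCH_FAMILY "mips"
#else
#  define MYPROJ_ARCH_FAMILY "unknown"
#endif

/* Single query point for the rest of the code base. */
static const char *myproj_arch_family(void) {
    return MYPROJ_ARCH_FAMILY;
}

/* Returns nonzero when one of the macros above matched. */
static int myproj_arch_family_is_known(void) {
    return strcmp(myproj_arch_family(), "unknown") != 0;
}

/* The family string is never empty, matched or not. */
static void myproj_selfcheck(void) {
    assert(myproj_arch_family()[0] != '\0');
}
```

The compatibility macros stay untouched underneath, so code written
against GCC keeps working while new code uses the one query.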