[LLVMdev] endian independence

Tue Oct 21 02:27:15 PDT 2008

Hi,

I'd like to use LLVM to compile and optimise code when I don't know
whether the target CPU is big- or little-endian. This would allow me
to create a single optimised LLVM bitcode binary of an application,
and then run it through a JIT compiler on systems of differing
endianness.

I realise that in general the LLVM IR depends on various
characteristics of the target; I'd just like to be able to remove this
dependency for the specific case of unknown target endianness.

Here's a sketch of how it would work:

1. Extend TargetData::isBigEndian() and LLVM bitcode's "target data
layout string" so that endianness is represented as either big, little
or unknown. (I see there's already support for something like this in
Module::getEndianness().)

2. For optimisations (like parts of scalar replacement of aggregates)
that depend on knowing the target endianness, restrict or disable them
as necessary if the target endianness is unknown. I think this will
affect only a handful of optimisations.

3. In llvm-gcc, if the LLVM backend reports unknown endianness, make
sure that the conversion from GCC trees to LLVM IR doesn't depend on
endianness. This seems to be fairly straightforward, *except* for
access to bitfields, which is a bit convoluted.

4. In llvm-gcc, if the LLVM backend reports unknown endianness, make
sure that GCC's optimisations on trees don't depend on endianness.

5. Have the linker refuse to link a big-endian module with a
little-endian one, but allow linking a module of unknown endianness
with a module of any endianness at all. (I think this might work
already.)
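
To make points 1, 2 and 5 a bit more concrete, here is a minimal
standalone C++ sketch. It is only an illustration under stated
assumptions, not the actual patch: the names Endianness,
parseEndianness, mergeEndianness and foldByteLoad are hypothetical and
are not existing LLVM APIs. The only thing taken from LLVM is the
convention that "E" or "e" in the target data layout string marks big
or little endian; treating a missing specifier as "unknown" is the
proposed extension.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical tri-state endianness (point 1).
enum class Endianness { Unknown, Little, Big };

// Scan a '-'-separated data layout string for an "E" (big) or "e"
// (little) specifier. Assumption: no specifier means Unknown.
Endianness parseEndianness(const std::string &layout) {
  std::size_t start = 0;
  while (start <= layout.size()) {
    std::size_t end = layout.find('-', start);
    if (end == std::string::npos)
      end = layout.size();
    const std::string spec = layout.substr(start, end - start);
    if (spec == "E")
      return Endianness::Big;
    if (spec == "e")
      return Endianness::Little;
    start = end + 1;
  }
  return Endianness::Unknown;
}

// Linker rule (point 5): Unknown is compatible with anything; Big and
// Little conflict. Returns false if the two modules must not be linked.
bool mergeEndianness(Endianness a, Endianness b, Endianness &result) {
  if (a == Endianness::Unknown) { result = b; return true; }
  if (b == Endianness::Unknown || a == b) { result = a; return true; }
  return false;
}

// Restricted optimisation (point 2): try to fold a one-byte load at
// byte `offset` of a stored 32-bit constant. With unknown endianness
// the fold is only performed when both byte orders give the same byte.
bool foldByteLoad(std::uint32_t value, unsigned offset, Endianness e,
                  std::uint8_t &result) {
  if (offset >= 4)
    return false;
  const std::uint8_t little =
      static_cast<std::uint8_t>((value >> (8 * offset)) & 0xFF);
  const std::uint8_t big =
      static_cast<std::uint8_t>((value >> (8 * (3 - offset))) & 0xFF);
  switch (e) {
  case Endianness::Little:
    result = little;
    return true;
  case Endianness::Big:
    result = big;
    return true;
  case Endianness::Unknown:
    if (little != big)
      return false; // answer depends on the target: leave the load alone
    result = little;
    return true;
  }
  return false;
}
```

For example, folding a byte load at offset 0 of the stored constant
0x11223344 gives 0x44 on a little-endian target and 0x11 on a
big-endian one, so with unknown endianness the sketch declines to fold
it; a constant such as 0xABABABAB can still be folded either way.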

I'm already working on this myself. Would you be interested in having
this work contributed back to LLVM?

Thanks,
Jay.