[LLVMdev] endian independence

Sun Oct 26 18:33:32 PDT 2008

On Oct 21, 2008, at 2:27 AM, Jay Foad wrote:

> Hi,
>
> I'd like to use LLVM to compile and optimise code when I don't know
> whether the target CPU is big- or little-endian. This would allow me
> to create a single optimised LLVM bitcode binary of an application,
> and then run it through a JIT compiler on systems of differing
> endianness.

Ok.

> I realise that in general the LLVM IR depends on various
> characteristics of the target; I'd just like to be able to remove this
> dependency for the specific case of unknown target endianness.

Sure.  In practice, it should be possible to produce target-independent
LLVM IR if you have a target-independent input language.  The trick is
making sure that the optimizers preserve this property.  Endianness is
only one piece of this puzzle.
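
As a rough illustration of what "target-independent input" means for
byte order, here is a minimal C sketch.  It is not from the original
thread, and the function names are made up for the example.  The first
load spells out the byte order in arithmetic, so it means the same
thing on any host; the second copies raw memory, so its result depends
on the host's endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Endian-independent: the byte order is explicit in the shifts. */
uint32_t load_le32_portable(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Endian-dependent: reinterprets the bytes in host order. */
uint32_t load_u32_native(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Holds on every host, whatever its byte order. */
void check_portable_load(void) {
    const unsigned char bytes[4] = {0x78, 0x56, 0x34, 0x12};
    assert(load_le32_portable(bytes) == 0x12345678u);
}
```

Code written in the first style carries no endianness assumption of its
own; code in the second style bakes one in as soon as anything
downstream relies on the value.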

> 3. In llvm-gcc, if the LLVM backend reports unknown endianness, make
> sure that the conversion from GCC trees to LLVM IR doesn't depend on
> endianness. This seems to be fairly straightforward, *except* for
> access to bitfields, which is a bit convoluted.

This will never work for llvm-gcc.  Too much target-specific stuff is
already folded before the LLVM backend is even involved.

> I'm already working on this myself. Would you be interested in having
> this work contributed back to LLVM?

If this were to better support target-independent languages, it would
be very useful.  If you're just trying to *reduce* the endianness
assumptions that leak through, I don't think it's a good approach.
There is just no way to solve this problem with C.  By the time the
preprocessor has run, your C code has already had #ifdef
__LITTLE_ENDIAN__ etc. evaluated, for example.

How do you propose to handle things like:

struct foo {
#ifdef __LITTLE_ENDIAN__
   int x, y;
#else
   int y, x;
#endif
};
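
To make the consequence concrete, here is a small sketch with the two
possible layouts written out side by side.  The struct tags and helper
names are hypothetical, chosen only for this example.  Whichever branch
the preprocessor picks, the offset of x becomes a compile-time constant
long before any IR is generated, and the two branches disagree on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags: the two layouts the #ifdef could select. */
struct foo_le { int x, y; };   /* __LITTLE_ENDIAN__ branch */
struct foo_be { int y, x; };   /* #else branch */

/* offsetof is folded to a constant by the front end. */
size_t offset_of_x_le(void) { return offsetof(struct foo_le, x); }
size_t offset_of_x_be(void) { return offsetof(struct foo_be, x); }

/* The two branches cannot share one field offset for x. */
void check_layouts_differ(void) {
    assert(offset_of_x_le() == 0);
    assert(offset_of_x_be() > offset_of_x_le());
}
```

Every access to foo.x compiles down to one of these offsets, so IR
generated from one branch cannot simply be reused on a target that
would have taken the other.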

-Chris