[LLVMdev] endian independence

Thu Nov 6 02:26:19 PST 2008

Hi,

Chris Lattner wrote:
> On Oct 21, 2008, at 2:27 AM, Jay Foad wrote:
>
>   
>> Hi,
>>
>> I'd like to use LLVM to compile and optimise code when I don't know
>> whether the target CPU is big- or little-endian. This would allow me
>> to create a single optimised LLVM bitcode binary of an application,
>> and then run it through a JIT compiler on systems of differing
>> endianness.
>>     
>
> Ok.
>
>   
>> I realise that in general the LLVM IR depends on various
>> characteristics of the target; I'd just like to be able to remove this
>> dependency for the specific case of unknown target endianness.
>>     
>
> Sure.  In practice, it should be possible to produce target- 
> independent LLVM IR if you have a target-independent input language.   
> The trick is making it so that the optimizers preserve this property.   
> Endianness is only one piece of this puzzle.
>
>   
>> 3. In llvm-gcc, if the LLVM backend reports unknown endianness, make
>> sure that the conversion from GCC trees to LLVM IR doesn't depend on
>> endianness. This seems to be fairly straightforward, *except* for
>> access to bitfields, which is a bit convoluted.
>>     
>
> This will never work for llvm-gcc.  Too much target-specific stuff is  
> already folded before the llvm backend is even involved.
>
>   
>> I'm already working on this myself. Would you be interested in having
>> this work contributed back to LLVM?
>>     
>
> If this were to better support target independent languages, it would  
> be very useful.  If you're just trying to *reduce* the endianness  
> assumptions that leak through, I don't think it's a good approach.   
> There is just no way to solve this problem with C.  By the time the  
> preprocessor has run, your C code has already had #ifdef  
> __LITTLE_ENDIAN__ etc evaluated, for example.
>
> How do you propose to handle things like:
>
> struct foo {
> #ifdef __LITTLE_ENDIAN__
>    int x, y;
> #else
>    int y, x;
> #endif
> };
>   
Define a fixed endianness as the in-memory representation and let the
optimizer remove the byte-shuffling access patterns at JIT time, where possible.

E.g. either all big-endian, as it's the network order, or all little-endian,
because more "not so well written" software would continue to just work.
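
A minimal sketch of what I mean, in C. The names load_le32, store_le32 and
struct foo_mem are made up for illustration and are not part of LLVM or
llvm-gcc; the choice of little-endian is just one of the two options above.
The struct is kept in memory as raw bytes in one fixed order, and every
access goes through helpers that assemble the value with shifts, so the
source itself never depends on the host byte order:

```c
#include <stdint.h>

/* Hypothetical fixed-layout version of struct foo: each field is stored
 * as four bytes in little-endian order, whatever the host CPU is. */
struct foo_mem {
    unsigned char x[4];
    unsigned char y[4];
};

/* Store a 32-bit value as little-endian bytes (illustrative helper). */
static inline void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Load a 32-bit value from little-endian bytes (illustrative helper). */
static inline uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Accessors for the fields of the fixed-layout struct. */
static inline uint32_t foo_get_x(const struct foo_mem *f) { return load_le32(f->x); }
static inline void foo_set_x(struct foo_mem *f, uint32_t v) { store_le32(f->x, v); }
static inline uint32_t foo_get_y(const struct foo_mem *f) { return load_le32(f->y); }
static inline void foo_set_y(struct foo_mem *f, uint32_t v) { store_le32(f->y, v); }
```

On a little-endian target an optimizer can often reduce load_le32 to a plain
32-bit load, and on a big-endian target to a load plus a byte swap. Whether a
particular JIT actually does so depends on its optimizer, hence the
"if possible".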

-- 
  René Rebe - ExactCODE GmbH - Europe, Germany, Berlin
  http://exactcode.de | http://t2-project.org | http://rene.rebe.name