[LLVMdev] question about alignment of structures on the stack (arm 32)
Alexey Perevalov
alexey.perevalov at hotmail.com
Tue Apr 21 08:54:03 PDT 2015
Hello Tim, thanks for the response.
----------------------------------------
> Date: Mon, 20 Apr 2015 11:45:03 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu
>
> On 20 April 2015 at 11:09, Alexey Perevalov
> <alexey.perevalov at hotmail.com> wrote:
>> And before printf call I see an argument preparation, and one of the most interesting instruction
>>
>> orr r3, r2, #4 ;for address of range.length
>
> This is certainly odd, and I can't reproduce the behaviour here. Even
> if the stack itself is 8-byte aligned (it's not on iOS), that struct
> would usually only be 4-byte aligned. LLVM shouldn't be using "orr"
> there.
Yes, you're right, it's odd ).
Sorry, I didn't describe my environment clearly.
I'm using a MachO loader (https://github.com/LubosD/darling/) and trying to make it work on ARM.
The scenario is to load a MachO binary (e.g. compiled in Xcode); that binary invokes functions from
an ELF library which implements libobjc2 and CoreFoundation.
In MachO on ARM the stack is 4-byte aligned, while code produced for ELF expects 8-byte alignment.
So in 50% of cases, when a call is made from MachO to ELF, the stack pointer register holds an address that is not 8-byte aligned.
Even in the case of a trivial call,
NSLog(@"Test string") from MachO,
it leads to -[NSString getCharacters:]
------
-(void)getCharacters:(unichar *)unicode {
    NSRange range={0,[self length]};
    [self getCharacters:unicode range:range];
}
------
Here "range" is passed by value, and the address of its second field is computed incorrectly:
because of orr r3, r2, #4, it comes out equal to the address of the structure itself.
I think the minimal example is:
#include <stdio.h>

typedef struct
{
    int a;
    char b;
} MyStruct;

int main(void) {
    MyStruct mStruct = {11, 100};
    printf("%p, %p\n", &mStruct.a, &mStruct.b);
    return 0;
}
Compile it with clang:
----
clang version 3.3 (tags/RELEASE_33/final)
Target: armv7l-unknown-linux-gnueabi
Thread model: posix
-----
And we get the following assembly:
main:
    push {r11, lr}
    mov r11, sp
    sub sp, sp, #24
    mov r0, #0
    str r0, [r11, #-4]
    add r1, sp, #8
    movw r2, :lower16:.Lmain.mStruct
    movt r2, :upper16:.Lmain.mStruct
    vldr d16, [r2]
    vstr d16, [sp, #8]
    orr r2, r1, #4
    movw r3, :lower16:.L.str
    movt r3, :upper16:.L.str
    str r0, [sp, #4]
    mov r0, r3
    bl printf
    ldr r1, [sp, #4]
    str r0, [sp]
    mov r0, r1
    mov sp, r11
    pop {r11, pc}
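r2 is set to r1 plus 4 (but the addition is optimized into an orr here). I think you know it better than me ;)
And if the address of mStruct is 4-byte aligned but not 8-byte aligned (address mod 8 == 4), bit 2 is already set, so the orr changes nothing and I get r2 the same as r1.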
Since I can't modify the MachO binaries, I'm looking for a way to avoid orr and use an add instruction here.
Maybe it will not solve all of my problems, due to the differences in ABI, but I think it's the easiest way.
I found the -mstack-alignment= option and tried the value 4 there for the ELF build, but orr is still used. BTW for x86_64 it worked, both on Linux and Mac.
Another way, I think, is to realign the stack inside every ELF function, though there could be a performance penalty. I tried -mstackrealign, but it didn't lead to an 8-byte aligned stack; I mean sp wasn't aligned to 0x.....0/8,
and neither was the address of the structure on the stack. I also tried -mstrict-align.
So I assume there should be patches for LLVM somewhere which could do it )
I haven't yet tested __attribute__((pcs("aapcs")))/-target-abi; maybe there is a magic pcs attribute that I could apply to the dangerous functions, but I would prefer to solve the problem in general.
>
> Do you have a self-contained example (code, compiler version & command
> line flags)?
>
> Cheers.
>
> Tim.
Best regards,
Alexey