[LLVMdev] ARM struct byval size > 64 triggers failure

Manman Ren mren at apple.com
Thu Jun 20 10:52:52 PDT 2013


> - "since ABI says the stack pointer needs to be 8 byte aligned at function entry point" (taken from Manman's reply)
>     What will be considered as entry point here?
>     Is it place of SP Adjustments "sub   sp, sp, #16"
>     (Or) Is it place of first user instruction(end of prologue) "ldr     r2, .LCPI0_0"

Eight byte stack alignment is a requirement of the Procedure Call Standard for the ARM Architecture [AAPCS]. This specifies that functions must maintain an eight-byte aligned stack address (for example: 0x00, 0x08, 0x10, 0x18, 0x20) on all external interfaces. In practice this requirement is met if:
At each external interface, the current stack pointer (SP) is a multiple of eight bytes.
Your OS maintains eight-byte stack alignment on its external interfaces, for example, on task switches

<-- from web

Manman

On Jun 20, 2013, at 5:35 AM, Rajesh Viswabramana <rajesh.vis at samsung.com> wrote:

>  
> Hi All,
>  
> I tested earlier with llvm svn 3.3 release source.
> Today I built and tested again with svn trunk, I also got "ldr r1, [r11, #80]", which solves the failure.
> Thanks for all comments.
> It seems similar issue http://llvm.org/bugs/show_bug.cgi?id=15868, fixed already. 
>  
> I have few queries, Just to get my understandings better. Could you please comment on these,
> - "since ABI says the stack pointer needs to be 8 byte aligned at function entry point" (taken from Manman's reply)
>     What will be considered as entry point here?
>     Is it place of SP Adjustments "sub   sp, sp, #16"
>     (Or) Is it place of first user instruction(end of prologue) "ldr     r2, .LCPI0_0"
> -  What considerations will be made from llvm side to make abi compatibile with gcc/other compilers for arm - struct byval size >64 bytes.
>  
>  
> Regards,
> Rajesh
>  
> ------- Original Message -------
> Sender : Stepan Dyatkovskiy<stpworld at narod.ru>
> Date : Jun 20, 2013 03:08 (GMT+09:00)
> Title : Re: [LLVMdev] ARM struct byval size > 64 triggers failure
>  
> Hi Rajesh,
> 
> I'm in some stage of looking what exactly happens. As Manman mentioned 
> r0 is reserved for function return. Though, perhaps somewhere we didn't 
> catch that...
> 
> Would you tell me, please, your llvm revision number?
> 
> -Stepan
> 
> Manman Ren wrote:
> >
> > I missed that the testing case is returning a struct.
> > You are right in VARegSaveSize.
> >
> > For callee:
> > subsp, sp, #16
> > push{r11, lr}
> > movr11, sp
> > subsp, sp, #8
> > strr3, [r11, #20]
> > strr2, [r11, #16]
> > strr1, [r11, #12]
> > ldrr1, [r11, #76]
> >
> > The beginning of the input struct @ sp_at_entry - 16 - 8 + 12 =
> > sp_at_entry -12
> > # of leftover bytes 67-12 = 55
> > r11+76 is @ sp_at_entry - 24 + 76 = sp_at_entry + 52, this is incorrect,
> > it should be at align(55, 4) = 56.
> >
> > For caller:
> > movr0, sp
> > ldrr1, .LCPI1_0
> > strr1, [r0, #56]
> >
> > the 2nd argument is at sp_at_entry + 56, which is correct.
> >
> > On my setup (built from TOT), I got "ldr r1, [r11, #80]" instead of 76.
> >
> > Thanks,
> > Manman
> >
> > On Jun 18, 2013, at 11:31 PM, Rajesh Viswabramana wrote:
> >
> >> Hi all,
> >>
> >>
> >> Thanks for all comments,
> >>
> >> Filed bug report with details,
> >>
> >> http://llvm.org/bugs/show_bug.cgi?id=16368
> >>
> >> @Manman, please find comments, queries inline.
> >>
> >>
> >> Regards,
> >> Rajesh
> >>
> >>
> >> -------*Original Message*-------
> >>
> >> *Sender*: Stepan Dyatkovskiy>
> >>
> >> *Date*: Jun 19, 2013 05:15 (GMT+09:00)
> >>
> >> *Title*: Re: [LLVMdev] ARM struct byval size > 64 triggers failure
> >>
> >>
> >> Hi all,
> >> One more interesting job :-)
> >> I'll look too at this case tomorrow. Today my brain is about to be
> >> exploded..
> >> -Stepan.
> >> 18.06.2013, 22:56, "Manman Ren" >:
> >>> Hi Rajesh,
> >>> The callee code looks okay to me
> >>>>
> >>>> Assembly for check114
> >>>> ---------------------------------------------------------------
> >>>>         sub     sp, sp, #16
> >>>>         push    {r11, lr}
> >>>>         mov     r11, sp
> >>>>         sub     sp, sp, #8
> >>>>         str     r3, [r11, #20]
> >>>>         str     r2, [r11, #16]
> >>>>         str     r1, [r11, #12]
> >>>>         ldr     r1, [r11, #76]
> >>>>
> >>> VARegSaveSize is 16 because we store the first 16 bytes of struct
> >>> byval in r0 to r3.
> >>> For the test code/assembly I mentioned, only r1-r3 are used for
> >>> struct byval. r0 used for return val.
> >>>
> >>> So, NumGPRs = 3, VARegSize = 12, VARegSaveSize =16(4 byte offset),
> >>> access of arg1 is going wrong.
> >>>
> >>> For updated test code:
> >>>
> >>> struct S114 check114 (*int a*, struct S114 arg0, struct S114* arg1) {
> >>>
> >>> .....
> >>> }
> >>>
> >>> Only r1, r2 used for struct byval, NumGPRs = 2, VARegSize = 8,
> >>> VARegSaveSize =8, this case works, able to access arg1.
> >>>
> >>>
> >>> Please correct me, if my above understanding is wrong about NumGPRs,
> >>> VARegSaveSize calculation.
> >>>
> >>>
> >>> Align in computeRegArea is 8 since ABI says the stack pointer needs
> >>> to be 8 byte aligned at function entry point.
> >>> But the second argument does not have to be 8 byte aligned, in fact
> >>> it is 4 byte aligned for i32.
> >>> Ok.
> >>> r11, #76 is equivalent to sp_at_entry + 52 since r11 = spat_entry -
> >>> 16 - 8, which is 4-byte aligned after
> >>> storing the leftover (67-16=51) bytes of struct byval.
> >>> For test code, 3 reg used for struct byval, left over will be (67- 12
> >>> = 55) copied to stack by caller, 55 sets of ldrb,strb in assembly.
> >>> Pasting dump again for locating arg1 from sp_at_entry,
> >>> @entry of check114
> >>>   sp             0xbefff808 0xbefff808 --> sp_at_entry
> >>>
> >>>   At arg1 accessing instruction [0xbefff808 + 52] ->
> >>> [0xbefff83c] ->*0x4071706f*
> >>>   0xbefff7e4: 0x4001ed08 0x40024f90 0x4071706f 0xbefff8a0
> >>>   0xbefff7f4: 0x0000869c 0x00000000 0x3231302f 0x36353433
> >>>   0xbefff804: 0x3a393837 0x3e3d3c3b 0x4241403f 0x46454443
> >>>   0xbefff814: 0x4a494847 0x4e4d4c4b 0x5251504f 0x56555453
> >>>   0xbefff824: 0x5a595857 0x5e5d5c5b 0x6261605f 0x66656463
> >>>   0xbefff834: 0x6a696867 0x6e6d6c6b* 0x4071706f* *0x00010861*
> >>>
> >>> Can you also paste the assembly for the caller side and check whether
> >>> the second argument is stored
> >>> at sp_at_entry+52?
> >>> Please find the attached assembly file.
> >>> As Renato suggested, please file a bug report.
> >>> Filed bug.
> >>> Thanks,
> >>> Manman
> >>> On Jun 18, 2013, at 4:26 AM, Rajesh Viswabramana
> >>> > wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Handling of pass by val of struct size >64 bytes case is seems wrong
> >>>> for arm targets.
> >>>>
> >>>>
> >>>> *Summary:*
> >>>>
> >>>> Incase of struct pass by value for size > 64 along with other
> >>>> function params, failure seen in some corner cases. Access to
> >>>> function params result in wrong stack location access.
> >>>>
> >>>> Stack pointer adjustment done by prologue emitter and offset used to
> >>>> access function params have different logics for calculaton.
> >>>>
> >>>>
> >>>> *Test code
> >>>> *---------------------------------------------------------------
> >>>> #include 
> >>>> struct S114 {
> >>>>   char a[67];
> >>>> }a114[5];
> >>>>
> >>>> struct S114 check114 (struct S114 arg0, struct S114* arg1) {
> >>>>   if(&a114[0] != arg1)                         // arg1 value is wrong
> >>>>     printf( "values %p, %p\n", &a114[0], arg1);
> >>>> }
> >>>> int main () {
> >>>>   int i= 0, j = 0;
> >>>>   for (;j<2; j++)                                    // just filling
> >>>> a114 with some values for identification
> >>>>     for(i=0; i>>>>       memset(&a114[j].a[i],(0x11+i+j*30), sizeof(int));
> >>>>
> >>>>   check114 (a114[1], &a114[0]);         //=> a114[0]  is accessed
> >>>> from wrong location inside check114 function
> >>>> }
> >>>> ---------------------------------------------------------------
> >>>> clang -v
> >>>> clang version 3.3 (tags/RELEASE_33/final)
> >>>> Target: i386-pc-linux-gnu
> >>>> Thread model: posix
> >>>>
> >>>> *Output on arm :
> >>>> *# ./check114.exe
> >>>> values 0x10861, 0x4071706f
> >>>> which is wrong.
> >>>>
> >>>> Assembly for check114
> >>>> ---------------------------------------------------------------
> >>>>         sub     sp, sp, #16
> >>>>         push    {r11, lr}
> >>>>         mov     r11, sp
> >>>>         sub     sp, sp, #8
> >>>>         str     r3, [r11, #20]
> >>>>         str     r2, [r11, #16]
> >>>>         str     r1, [r11, #12]
> >>>>         ldr     r1, [r11, #76]
> >>>>         str     r1, [sp, #4]
> >>>>         .loc    1 7 0 prologue_end
> >>>>         ldr     r2, .LCPI0_0
> >>>>         cmp     r2, r1
> >>>>         beq     .LBB0_2
> >>>>         b       .LBB0_1
> >>>> ---------------------------------------------------------------
> >>>>
> >>>> From reg, stack dump:
> >>>> ------------------------------------------------------------------------------------------------------------------------------
> >>>> @entry of check114
> >>>>   => 0x8398 : sub sp, sp, #16
> >>>>         0x839c : push {r11, lr}
> >>>>   sp             0xbefff808 0xbefff808
> >>>>
> >>>> @if condition
> >>>>        0x83b4 : ldr r1, [r11, #76] ; 0x4c         <---
> >>>> wrong value copied to r1, offset #76 should be #80
> >>>>        0x83b8 : str r1, [sp, #4]
> >>>>        0x83bc : ldr r2, [pc, #44] ; 0x83f0 
> >>>>  => 0x83c0 : cmp r2, r1
> >>>>
> >>>>   r11            0xbefff7f0 -1090521104
> >>>>   sp             0xbefff7e8 0xbefff7e8
> >>>>
> >>>>   Stack dump:
> >>>>   0xbefff7e4: 0x4001ed08 0x40024f90 0x4071706f 0xbefff8a0
> >>>>   0xbefff7f4: 0x0000869c 0x00000000 0x3231302f 0x36353433
> >>>>   0xbefff804: 0x3a393837 0x3e3d3c3b 0x4241403f 0x46454443
> >>>>   0xbefff814: 0x4a494847 0x4e4d4c4b 0x5251504f 0x56555453
> >>>>   0xbefff824: 0x5a595857 0x5e5d5c5b 0x6261605f 0x66656463
> >>>>   0xbefff834: 0x6a696867 0x6e6d6c6b* 0x4071706f*
> >>>> *0x00010861*                 //[R11+4c] -> [0xbefff7f0+4c] ->
> >>>> [0xbefff83c] -> 0x4071706f
> >>>>
> >>>> Correct value is at location {[R11+4c]*+4*} --> 0x00010861, 4 bytes
> >>>> offset going wrong.
> >>>> ------------------------------------------------------------------------------------------------------------------------------
> >>>>
> >>>> When i checked from the ARM Lowering part for generation of
> >>>>   sub sp, sp, #16
> >>>>
> >>>> Emitted by,
> >>>>
> >>>>   if (VARegSaveSize)
> >>>>     emitSPUpdate(isARM, MBB, MBBI, dl, TII, -VARegSaveSize,
> >>>> // --> VARegSaveSize is calculated in computeRegArea
> >>>>                  MachineInstr::FrameSetup)
> >>>>
> >>>> ARMTargetLowering::computeRegArea(..) {
> >>>>   ...
> >>>>   VARegSize = NumGPRs * 4;
> >>>>   VARegSaveSize = (VARegSize + Align - 1) & ~(Align -
> >>>> 1);                 // --> 8 byte alignment done here
> >>>> }
> >>>>
> >>>> Stack pointer decremented to NumGPRs*4 + alignment
> >>>>
> >>>> NumGPRs = 3 registers
> >>>>
> >>>> VARegSaveSize  = 16 (after considering 8 byte alignment )
> >>>>
> >>>>
> >>>> When the offset(#76) for the instruction, "ldr r1, [r11, #76] ;
> >>>> 0x4c"  is calculated, 4 bytes alignment is considered.
> >>>> In prologue stackpointer calculation 8 byte alignment is considered.
> >>>>
> >>>> Due to this mimatch of alignment, If try to access any parameter
> >>>> after byval which results wrong value.
> >>>>
> >>>> Issue(or offset of 4 bytes) wont occur if even number of register
> >>>> used for byval spilling.
> >>>> ex:
> >>>> struct S114 check114 (int a, struct S114 arg0, struct S114* arg1) {
> >>>> // accessing arg1 is fine in this case
> >>>> .....
> >>>> }
> >>>>
> >>>> Could someone comment on below queries about fixing the problem,
> >>>>
> >>>> 1) Is this 8 byte alignment mandatory ?  Is this due to " ARM AAPCS
> >>>> 5.2.1.2 Stack constraints at a public interface" ? Can this be removed?
> >>>>
> >>>> 2) We will leave alignment as it is but in prologue we will adjust
> >>>> SP once again, this is little meaningless.
> >>>>
> >>>> 3) While accessing arg1 we will consider alignment and add extra
> >>>> offset -> looks better.
> >>>>
> >>>>  Offset to access arg1 is calculated by selection DAG that will be
> >>>> target independent. But Alignment adjustment should be done by
> >>>> target lowering. Any suggestions on how to fix this ?
> >>>>
> >>>> Regards,
> >>>> Rajesh
> >>>>
> >>>> <201306181656803_BEI0XT4N.gif>
> >>>>
> >>>> _______________________________________________
> >>>> LLVM Developers mailing list
> >>>> LLVMdev at cs.uiuc.edu
> >>>> http://llvm.cs.uiuc.edu
> >>>> 
> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>> ,
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> LLVMdev at cs.uiuc.edu
> >>> http://llvm.cs.uiuc.edu
> >>> 
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>
> >>
> >>
> >>
> >>
> >>
> >>
> >> <201306191201558_BEI0XT4N.gif>
> >>
> >> 
> >
> 
>  
>  
>  
>  
>  
>  
> <201306201804240_BEI0XT4N.gif>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130620/cf3f17de/attachment.html>


More information about the llvm-dev mailing list