ARM, PR15868 Release-blocker fix. ByVal parameters padding fix.

Stepan Dyatkovskiy stpworld at narod.ru
Tue May 14 07:30:50 PDT 2013


Hello,

Evan wrote:
 > The comment doesn't parse. "in when"?
Sorry :-)

 > But I don't see check for stack alignment anywhere in the code.
In reworked patch I added it. Though, I moved main changes from 
StoreByValRegs to computeRegArea, as Manman proposed.

Manman wrote:
 > This will cause the callee to update SP by 20 bytes (ArgRegsSaveSize 
will be 4 + 4 + 4*3), which means SP will not be 8-byte aligned.
Yes, SP will be 4 byte aligned. Though inside the routing we use it as
very simplified heap, and..
Stack is double-word aligned (SP mod 8 = 0) for *public interfaces* 
only, while for other cases it is word aligned (SP mod 4 = 0). Then, 
perhaps, there is nothing criminal to use offsets like #12 or #4 inside 
the routine? Though, of course, params with types like i64, or "double" 
will be 8 bytes aligned everywhere.

 > It is a little weird to have a hard-coded ArgSize 4 here for a 
variable argument.
Hm.. This parameter is rudimentary for VA functions... So, hard-code 
here 0 then?

Just in case, a little more details...
"Byval" parameters are recovered with padding. This trick allows to 
concatenate GPRs part with stack part. But, even before byval patches 
(was posted earlier), there was an issue, that has been registered as 
PR15868.

Consider next example:
%struct_t = type { i32, i32 }

; Note p3 and p4 are MemLocs.
define void @check227(
   i32 %p0,             ; R0
   i32 %p1,             ; R1
   i32 %p2,             ; R2
   %struct_t* byval p3, ; R3, [SP+0, SP+4), LocMemOffset = 0
   %struct_t* p4        ; SP+4,             LocMemOffset = 8
)

Currently LowerFormalArguments does the next:

p3: ArgRegsSize = 4, ArgRegsSaveSize = 8, parameter is split between 
GPRs and stack; stack part is associated with FrameIndex with offset = 4.
p4: is pointer, its value is stored in stack, FrameIndex's offset = 8 
(Wrong, must be 12)

For p4 we should take into account: p3 is padded with 4 bytes, so proper 
offset is 12, not 8.

Relative to case, when byval parameter small enough to be in GPRs only.
We don't need to concatenate it with stack part. So, we don't need to 
pad it. Inside the routine we can store it only using constraint "SP mod 
4 = 0". I can't see any reason to extend ArgRegsSaveArea in this case.
I may be wrong in my last conclusion, so here it would be good to see 
your confirmation/disagree.

I have attached reworked patch + two examples (.c and .ll). Just try run 
.c file with -O2 -S, with clang and gcc, and compare outputs (O2 allows 
to strip out some unnecessary instructions here).

Thanks for reviews!

-Stepan.

Manman Ren wrote:
>
> On May 13, 2013, at 1:28 PM, Evan Cheng wrote:
>
>> Hi Stepan,
>>
>> Your explanation below seems to indicate this is only an issue when stack alignment is 8 (or smaller?). But I don't see check for stack alignment anywhere in the code.
>    unsigned Align = MF.getTarget().getFrameLowering()->getStackAlignment();
>    ArgRegsSize = NumGPRs * 4;
>    ArgRegsSaveSize = (ArgRegsSize + Align - 1) & ~(Align - 1);
> The ArgRegsSaveSize is different from ArgRegsSize when stack alignment is 8 or larger.
> And padding is added when they differ.
>
> Thanks,
> Manman
>>
>> Other nitpicks:
>> +    // If parameter was splitted between stack and GPRs...
>> Typo: s/splitted/split
>>
>> -    return MFI->CreateFixedObject(4, ArgOffset, !ForceMutable);
>> +    return MFI->CreateFixedObject(
>> +        4, AFI->getStoredByValParamsPadding() + ArgOffset, !ForceMutable);
>>
>> You should just use Padding in place of AFI->getStoredByValParamsPadding() here.
>>
>> +  /// StByValParamsPadding - in when we store params in memory
>> +  /// sometimes we need to insert gap before parameter start address.
>> +  ///
>>
>> The comment doesn't parse. "in when"?
>>
>> Evan
>>
>> On May 13, 2013, at 6:26 AM, Stepan Dyatkovskiy <STPWORLD at narod.ru> wrote:
>>
>>> Hi all,
>>>
>>> In case when stack alignment is 8 and GPRs parameter part size is not N*8: we add padding to GPRs part, so part's last byte must be recovered at address K*8-1.
>>> We need to do it, since remained (stack) part of parameter starts from address K*8, and we need to "attach" "GPRs head" without gaps to it:
>>> Stack:
>>> |---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
>>> [ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...
>>>
>>> Note, once we added padding we need to correct *all* Arg offsets that are going after padded one.
>>> That's why we need this fix: Arg offsets were never corrected before this patch. See new test-case included in patch.
>>>
>>> We also don't need to insert padding for byval parameters that are stored in GPRs only. We need pad only last byval parameter and only in case it outsides GPRs and stack alignment = 8.
>>>
>>> Please find the patch in attachment.
>>>
>>> This patch reduces stack usage for some cases:
>>> We can reduce ArgRegsSaveArea since inner 4 bytes sized byval params my be "packed" with alignment 4.
>>>
>>> This patch also fixes PR15868 release-blocking issue.
>>>
>>> -Stepan.
>>>
>>> <pr15868-2013-05-13-2.patch>_______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: attri_16_simple3.2.c
Type: text/x-csrc
Size: 574 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130514/7a0f4d92/attachment.c>
-------------- next part --------------
;RUN: llc -mtriple=arm-linux-gnueabihf < %s | FileCheck %s

%struct_t = type { i32, i32 }

; Note p3 and p4 are MemLocs.

define void @check227(
  i32 %p0,              ; R0
  i32 %p1,              ; R1
  i32 %p2,              ; R2
  %struct_t* byval %p3, ; R3, [SP+0, SP+4), LocMemOffset = 0
  %struct_t* %p4        ; SP+4,             LocMemOffset = 8
)
; Issue PR15868, fired at least from llvm 3.2:
; p3: ArgRegsSize = 4, ArgRegsSaveSize = 8, FrameIndex offset = 4
; p4: FrameIndex offset = 8 (Wrong)
;
; For p4 we should take into account that p3 was padded with 4 bytes:
; Proper:
; p4: FrameIndex offset = 12


{
entry:
  %0 = ptrtoint %struct_t* %p3 to i32

;CHECK:	str	r3, [sp, #12]
;CHECK:	bl	useInt
  call void @useInt(i32 %0)
  %1 = ptrtoint %struct_t* %p4 to i32

;CHECK:	ldr	r0, [sp, #20]
;CHECK:	bl	useInt
  call void @useInt(i32 %1)

  ret void
}

declare void @useInt(i32)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr15868-2013-05-14.patch
Type: text/x-diff
Size: 9362 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130514/7a0f4d92/attachment.patch>


More information about the llvm-commits mailing list