[llvm-commits] PATCH: Change X86 function prologue/epilogue in case the stack is realigned

Alexey Samsonov samsonov at google.com
Fri Jul 13 09:24:44 PDT 2012


On Fri, Jul 13, 2012 at 2:52 PM, Alexey Samsonov <samsonov at google.com>wrote:

> Hi, Chad!
>
> On Fri, Jul 13, 2012 at 12:03 AM, Chad Rosier <mcrosier at apple.com> wrote:
>
>>
>> On Jul 12, 2012, at 12:54 PM, Alexey Samsonov wrote:
>>
>> Thanks for your comments! I'll fix the patch and sent the updated version
>> tomorrow.
>>
>> Fixed. (updated patch attached and uploaded to
> http://codereview.appspot.com/6352105/). More comments below.
>

And one more update - we should reflect the change in order of stack
alignment / pushing registers,
while calculating the frame size, so that stack would indeed be
properly aligned.
http://codereview.appspot.com/6352105/


>
>
>> I also don't know a case where pushing the registers before the stack
>> alignment may cause any troubles (I just checked that gcc sometimes behaves
>> this way, but it is not Windows, and no xmm registers were involved there).
>> Do you think that we may get away with documenting
>> this concern in comments?
>>
>>
>> We should definitely leave a comment in the event that this change causes
>> subtle problems.
>>
>
> Okay, so I found the following comment in
> X86FrameLowering::spillCalleeSavedRegisters
>   // Make XMM regs spilled. X86 does not have ability of push/pop XMM.
>   // It can be done by spilling XMMs to stack frame.
>   // Note that only Win64 ABI might spill XMMs.
>
> So, the XMM regs are saved in stack slots, and aren't pushed/popped. I've
> added a reference to this code
> and commented our concern in emitPrologue() anyway.
>
>
>>
>> Jakob commented on a trick that ARM does in a similar situation.
>>  Basically, there is an unaligned spill area and an aligned spill area (for
>> performance reasons).  I haven't looked at the code, however.  This is
>> mostly just a side comment, but maybe you could glean some insight and
>> possible pitfalls by looking at the ARM implementation?
>>
>
> Thanks for your advice about the ARM code - the comments there say that
> it's a bad idea to emit consecutive instructions
> mov %rbp, %rsp
> sub $size_of_pushed_regs, %rsp
> as the interrupt may happen between these two instructions, which can
> spoil the callee-saved registers. I've substituted it with
> lea $size_of_pushed_regs(%rbp), %rsp
> (see the expected asm in tests) and simplified code in emitEpilogue() a
> bit.
>
>
>>
>>  Chad
>>
>>
>> On Thu, Jul 12, 2012 at 10:04 PM, Chad Rosier <mcrosier at apple.com> wrote:
>>
>>> Alexey,
>>> Is there a case where we would ever need the stack realigned before the
>>> pushes?  This came to mind prior to my comments below, but I didn't think
>>> it was possible.  Then Eric dropped by my office and made a similar
>>> comment.  That got me wondering, so I briefly chatted with Jakob.  He
>>> pointed out that in Windows there are vector callee-saved registers (xmm
>>> regs).  The only issue would be if these need to be saved and the alignment
>>> is below 16 bytes.  We weren't sure if this could ever happen, but just as
>>> a precaution I though I would comment.
>>>
>>
>>
>>>
>>>  Chad
>>>
>>>
>>> On Jul 12, 2012, at 9:53 AM, Chad Rosier wrote:
>>>
>>>
>>> On Jul 12, 2012, at 9:02 AM, Alexey Samsonov wrote:
>>>
>>> Hi!
>>>
>>> This patch is intended to fix the exception handling in case of
>>> realigned stack (see more details in recent discussion with mcrosier at on r160002).
>>>
>>>
>>> This looks to be a better solution then the prior one we discussed, so
>>> I'm all for it.
>>> In fact, it will improve the performance in the case of stack
>>> realignment + VLAs because we don't restore the SP twice.
>>>
>>> Before this change, the prologue and epilogue looks like this:
>>> push %rbp
>>> mov %rsp, %rbp
>>> and $alignment %rsp   <--- align stack
>>> push %reg1   <----- save regs on stack
>>> push %reg2
>>> sub $size %rsp
>>> <...>
>>> add $size %rsp
>>> pop %reg2  <------- restore regs
>>> pop %reg1
>>> mov %rbp, %rsp <--- restore rsp
>>> pop %rbp
>>>
>>> The problem is: the register values are incorrectly restored in
>>> exception handling - emiited call frame information gives offsets of
>>> registers from frame pointer (%rbp),
>>> which is wrong, as their actual locations depend on the alignment result.
>>>
>>> I suggest to modify the prologue in the following way:
>>> push %rbp
>>> mov %rsp, %rbp
>>> push %reg1   <----- save regs on stack
>>> push %reg2
>>> and $alignment %rsp   <--- align stack
>>> sub $size %rsp
>>> <...>
>>> mov %rbp, %rsp
>>> sub $total_size_of_saved_regs, %rsp
>>> pop %reg2  <------- restore regs
>>> pop %reg1
>>> pop %rbp
>>>
>>> In this way the offsets from %rbp will be valid, and the EH won't be
>>> broken. (you may also see the emiited asm in the tests).
>>>
>>>
>>> Nice!
>>>
>>> Just a few comments on the patch:
>>> 1. For the new test case, pr11468.ll, would you mind adding CHECKs for
>>> the epilogue?  It just gives a more complete picture of what's going on.
>>>
>>> 2. In dynamic-allocas-VLAs.ll
>>>
>>> Because this is being removed from t4:
>>> -; CHECK: addq $[[STACKADJ]], %rsp
>>>
>>> Would you mind simplifying the earlier check:
>>>  ; CHECK: subq $[[STACKADJ:[0-9]+]], %rsp
>>>
>>> to this:
>>>  ; CHECK: subq ${{[0-9]+}}, %rsp
>>>
>>> And for this CHECK:
>>> +; CHECK: addq ${{[0-9]*}}, %rsp ## imm = 0xFFFFFFF0
>>>
>>> Use a + instead of a *; I assume we should see at least one digit.
>>>  Also, remove the ## comment unless you feel it's informative.
>>>
>>> The above applies to the t7 test case as well.
>>>
>>> And for t9 please removed the ## comments unless you feel it's
>>> informative.
>>>
>>> Looks good otherwise.
>>>
>>>  Chad
>>>
>>> Code review: http://codereview.appspot.com/6352105/
>>>
>>> --
>>> Alexey Samsonov, MSK
>>>
>>> <zdiff.pr11468>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Alexey Samsonov, MSK
>>
>>
>>
>
>
> --
> Alexey Samsonov, MSK
>
>


-- 
Alexey Samsonov, MSK
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120713/0f1d4c87/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: issue6352105_5003.diff
Type: application/octet-stream
Size: 10387 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120713/0f1d4c87/attachment.obj>


More information about the llvm-commits mailing list