[llvm-commits] PATCH: Change X86 function prologue/epilogue in case the stack is realigned

Alexey Samsonov samsonov at google.com
Fri Jul 13 03:52:12 PDT 2012


Hi, Chad!

On Fri, Jul 13, 2012 at 12:03 AM, Chad Rosier <mcrosier at apple.com> wrote:

>
> On Jul 12, 2012, at 12:54 PM, Alexey Samsonov wrote:
>
> Thanks for your comments! I'll fix the patch and sent the updated version
> tomorrow.
>
> Fixed. (updated patch attached and uploaded to
http://codereview.appspot.com/6352105/). More comments below.


> I also don't know a case where pushing the registers before the stack
> alignment may cause any troubles (I just checked that gcc sometimes behaves
> this way, but it is not Windows, and no xmm registers were involved there).
> Do you think that we may get away with documenting
> this concern in comments?
>
>
> We should definitely leave a comment in the event that this change causes
> subtle problems.
>

Okay, so I found the following comment in
X86FrameLowering::spillCalleeSavedRegisters
  // Make XMM regs spilled. X86 does not have ability of push/pop XMM.
  // It can be done by spilling XMMs to stack frame.
  // Note that only Win64 ABI might spill XMMs.

So, the XMM regs are saved in stack slots, and aren't pushed/popped. I've
added a reference to this code
and commented our concern in emitPrologue() anyway.


>
> Jakob commented on a trick that ARM does in a similar situation.
>  Basically, there is an unaligned spill area and an aligned spill area (for
> performance reasons).  I haven't looked at the code, however.  This is
> mostly just a side comment, but maybe you could glean some insight and
> possible pitfalls by looking at the ARM implementation?
>

Thanks for your advice about the ARM code - the comments there say that
it's a bad idea to emit consecutive instructions
mov %rbp, %rsp
sub $size_of_pushed_regs, %rsp
as the interrupt may happen between these two instructions, which can spoil
the callee-saved registers. I've substituted it with
lea $size_of_pushed_regs(%rbp), %rsp
(see the expected asm in tests) and simplified code in emitEpilogue() a bit.


>
>  Chad
>
>
> On Thu, Jul 12, 2012 at 10:04 PM, Chad Rosier <mcrosier at apple.com> wrote:
>
>> Alexey,
>> Is there a case where we would ever need the stack realigned before the
>> pushes?  This came to mind prior to my comments below, but I didn't think
>> it was possible.  Then Eric dropped by my office and made a similar
>> comment.  That got me wondering, so I briefly chatted with Jakob.  He
>> pointed out that in Windows there are vector callee-saved registers (xmm
>> regs).  The only issue would be if these need to be saved and the alignment
>> is below 16 bytes.  We weren't sure if this could ever happen, but just as
>> a precaution I though I would comment.
>>
>
>
>>
>>  Chad
>>
>>
>> On Jul 12, 2012, at 9:53 AM, Chad Rosier wrote:
>>
>>
>> On Jul 12, 2012, at 9:02 AM, Alexey Samsonov wrote:
>>
>> Hi!
>>
>> This patch is intended to fix the exception handling in case of realigned
>> stack (see more details in recent discussion with mcrosier@ on r160002).
>>
>>
>> This looks to be a better solution then the prior one we discussed, so
>> I'm all for it.
>> In fact, it will improve the performance in the case of stack realignment
>> + VLAs because we don't restore the SP twice.
>>
>> Before this change, the prologue and epilogue looks like this:
>> push %rbp
>> mov %rsp, %rbp
>> and $alignment %rsp   <--- align stack
>> push %reg1   <----- save regs on stack
>> push %reg2
>> sub $size %rsp
>> <...>
>> add $size %rsp
>> pop %reg2  <------- restore regs
>> pop %reg1
>> mov %rbp, %rsp <--- restore rsp
>> pop %rbp
>>
>> The problem is: the register values are incorrectly restored in exception
>> handling - emiited call frame information gives offsets of registers from
>> frame pointer (%rbp),
>> which is wrong, as their actual locations depend on the alignment result.
>>
>> I suggest to modify the prologue in the following way:
>> push %rbp
>> mov %rsp, %rbp
>> push %reg1   <----- save regs on stack
>> push %reg2
>> and $alignment %rsp   <--- align stack
>> sub $size %rsp
>> <...>
>> mov %rbp, %rsp
>> sub $total_size_of_saved_regs, %rsp
>> pop %reg2  <------- restore regs
>> pop %reg1
>> pop %rbp
>>
>> In this way the offsets from %rbp will be valid, and the EH won't be
>> broken. (you may also see the emiited asm in the tests).
>>
>>
>> Nice!
>>
>> Just a few comments on the patch:
>> 1. For the new test case, pr11468.ll, would you mind adding CHECKs for
>> the epilogue?  It just gives a more complete picture of what's going on.
>>
>> 2. In dynamic-allocas-VLAs.ll
>>
>> Because this is being removed from t4:
>> -; CHECK: addq $[[STACKADJ]], %rsp
>>
>> Would you mind simplifying the earlier check:
>>  ; CHECK: subq $[[STACKADJ:[0-9]+]], %rsp
>>
>> to this:
>>  ; CHECK: subq ${{[0-9]+}}, %rsp
>>
>> And for this CHECK:
>> +; CHECK: addq ${{[0-9]*}}, %rsp ## imm = 0xFFFFFFF0
>>
>> Use a + instead of a *; I assume we should see at least one digit.  Also,
>> remove the ## comment unless you feel it's informative.
>>
>> The above applies to the t7 test case as well.
>>
>> And for t9 please removed the ## comments unless you feel it's
>> informative.
>>
>> Looks good otherwise.
>>
>>  Chad
>>
>> Code review: http://codereview.appspot.com/6352105/
>>
>> --
>> Alexey Samsonov, MSK
>>
>> <zdiff.pr11468>
>>
>>
>>
>>
>
>
> --
> Alexey Samsonov, MSK
>
>
>


-- 
Alexey Samsonov, MSK
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120713/2a875645/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: issue6352105_6002.diff
Type: application/octet-stream
Size: 8735 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120713/2a875645/attachment.obj>


More information about the llvm-commits mailing list