[llvm-commits] [llvm] r158087 - in /llvm/trunk: lib/Target/X86/X86FrameLowering.cpp lib/Target/X86/X86RegisterInfo.cpp lib/Target/X86/X86RegisterInfo.h test/CodeGen/X86/alloca-align-rounding-32.ll test/CodeGen/X86/alloca-align-rounding.ll test/Co

Chandler Carruth chandlerc at google.com
Mon Jun 18 02:17:06 PDT 2012


On Fri, Jun 15, 2012 at 6:21 PM, Chad Rosier <mcrosier at apple.com> wrote:

>
> On Jun 15, 2012, at 6:07 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
> On Fri, Jun 15, 2012 at 6:06 PM, Chad Rosier <mcrosier at apple.com> wrote:
>
>> Chandler, please revert.
>>
>
> Thanks, is there anything we can do to help fix the underlying issue?
>
>
> I understand the underlying issue, but I'm not familiar enough with the
> frame handling code to point you in the right direction this very moment.
>  Essentially, the pops should be relative to the BP (RBX I believe), not
> the SP.
>
> Matt and I stared at it a bit, and it seems like there may be completely
> missing handling of the particular combination of features that this patch
> enables in a few places. If there is somewhere we can contribute patches
> to, let us know?
>
>
> I saw no CodeGen changes across the test-suite, but of course that's not
> your use case (i.e. the test-suite doesn't force stack alignment).  Finding
> the bugs are very helpful.. And test cases! :). Beyond that I don't have
> any great suggestions.
>

Ok, I've checked in a test case in r158656 to try to help keep these bits
covered by tests.

I'm not terribly satisfied with the test case, as it's fairly sensitive to
changes to any part of this stack, but hopefully it'll be easy enough to
update for whatever the new logic is, and catch the issue when doing so.

The testcase looks like this:

define i64 @g(i32 %i) {
entry:
  %0 = alloca i8, i32 %i
  call void @llvm.memset.p0i8.i32(...)
  %call = call i32 @f(...)
  %conv = sext i32 %call to i64
  ret i64 %conv
}

Which when run through 'llc -force-stack-align -stack-alignment=32' (the
best way I could think of to ensure an absurd about of stack re-alignment
took place...) has asm like so after r158087:

g:                                      # @g
# BB#0:                                 # %entry        pushl   %ebp
        movl    %esp, %ebp
        andl    $-32, %esp
        pushl   %ebx
        pushl   %esi
        subl    $24, %esp
        movl    %esp, %ebx
        movl    8(%ebx), %eax
        leal    31(%eax), %ecx
        andl    $-32, %ecx
        movl    %esp, %esi
        subl    %ecx, %esi
        movl    %esi, %esp
        subl    $32, %esp
        movl    %eax, 8(%esp)
        movl    %esi, (%esp)
        movl    $0, 4(%esp)
        calll   memset
        addl    $32, %esp
        subl    $32, %esp
        movl    %esi, (%esp)
        calll   f
        addl    $32, %esp
        movl    %eax, %edx
        sarl    $31, %edx
        addl    $24, %esp
        popl    %esi
        popl    %ebx
        movl    %ebp, %esp
        popl    %ebp
        ret


The bug is the 'addl $24, %esp' before the 'popl's. It has to reload from
%ebp before it can pop any saved registers due to the alloca.

After reverting this patch, the same command produces:

g:                                      # @g
# BB#0:                                 # %entry        pushl   %ebp
        movl    %esp, %ebp
        pushl   %esi
        subl    $20, %esp
        movl    8(%ebp), %eax
        leal    31(%eax), %ecx
        andl    $-32, %ecx
        movl    %esp, %esi
        subl    %ecx, %esi
        movl    %esi, %esp
        subl    $32, %esp
        movl    %eax, 8(%esp)
        movl    %esi, (%esp)
        movl    $0, 4(%esp)
        calll   memset
        addl    $32, %esp
        subl    $32, %esp
        movl    %esi, (%esp)
        calll   f
        addl    $32, %esp
        movl    %eax, %edx
        sarl    $31, %edx
        leal    -4(%ebp), %esp
        popl    %esi
        popl    %ebp
        ret

Here the 'leal' off of %ebp restores the frame pointer correctly before
popping registers.

Hope this is enough to help track down what the fix should be... Let me
know if not.

Also, if you have any better ideas about how to write a test case for this,
by all means chime in on my commit... I'm not really thrilled with the
regression test strategy here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120618/520d3601/attachment.html>


More information about the llvm-commits mailing list