[llvm-commits] [Review request][Win64] Use pushq/popq GPRs in function prologue/epilogue

NAKAMURA Takumi geek4civic at gmail.com
Mon Feb 21 18:23:10 PST 2011


Ping.


On Fri, Feb 18, 2011 at 2:02 PM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
> Ping.
>
> It would also resolve a potential issue if other x86-64 targets gained a
> new calling convention that preserves XMMs.
>
> ...Takumi
>
>
> On Mon, Feb 7, 2011 at 8:49 PM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
>> Anton,
>>
>> I attached an updated patch, refined and with comments added. Thank you.
>> It is at: https://github.com/chapuni/LLVM/commit/8da7419f8170e5a11063eb8e992ecb1ad40f9f6a
>>
>>
>> On GitHub, Anton wrote:
>>> View Commit: https://github.com/chapuni/LLVM/commit/d719f1b1ba5f01823e00904c1599adf356089a1d
>>
>>> This will pessimize non-win64 targets. Basically, the problem is that you cannot use pushq/popq for high xmm regs, which are callee-saved on win64-only (but not on linux / darwin, for example).
>>
>> I thought the patch did not affect non-win64 targets, but the way it
>> checked whether regs are GPRs or not might have been dubious.
>> I rewrote the checking expressions.
>>
>> I am sorry if I missed your point.
>>
>>> Another problem is function prologue emission. Have you verified that you always have proper stack frame? Even in case when high xmm regs (xmm5, etc.) are spilled?
>>
>> As long as the spiller emits [pushq GPRs...] and then [mov xmm to fp],
>> and the restorer emits [mov xmm from fp] and then [popq GPRs],
>> emitPrologue() and emitEpilogue() will adjust %rsp to the proper place.
>> And I expect the spiller to place XMMs in 16-byte (i128) aligned slots.
>>
>> ; for example
>> define void @foo() nounwind {
>> entry:
>>  tail call void (...)* @bar() nounwind
>>  tail call void asm sideeffect "nop", "~{si},~{di},~{xmm13},~{xmm11},~{xmm15},~{dirflag},~{fpsr},~{flags}"() nounwind
>>  ret void
>> }
>> declare void @bar(...)
>>
>> #### -mtriple=x86_64-mingw32
>> foo:
>>        pushq   %rsi
>>        pushq   %rdi
>>        subq    $88, %rsp
>>        movaps  %xmm15, 32(%rsp)        # 16-byte Spill
>>        movaps  %xmm13, 48(%rsp)        # 16-byte Spill
>>        movaps  %xmm11, 64(%rsp)        # 16-byte Spill
>>        callq   bar
>>        #APP
>>        nop
>>        #NO_APP
>>        movaps  64(%rsp), %xmm11        # 16-byte Reload
>>        movaps  48(%rsp), %xmm13        # 16-byte Reload
>>        movaps  32(%rsp), %xmm15        # 16-byte Reload
>>        addq    $88, %rsp
>>        popq    %rdi
>>        popq    %rsi
>>        ret
>>
>> Also, I have verified a 3-stage build of clang.
>> (The x64 clang can build and test clang and llvm.)
>>
>>
>> ...Takumi
>>
>
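
For anyone checking the numbers in the mingw32 listing quoted above by hand:
the following is a minimal sketch of the frame arithmetic only. It is not
LLVM's frame lowering code. The function name layout_frame and the constants
are hypothetical, and it assumes a 32-byte Win64 home (shadow) area at the
bottom of the frame, 8-byte pushes for GPRs, 16-byte slots for XMMs, and a
16-byte aligned %rsp at the call site.

```python
# Sketch of the stack arithmetic behind "pushq GPRs, then subq, then movaps XMMs".
# All names and constants here are assumptions for illustration, not LLVM APIs.

SHADOW_SPACE = 32  # assumed Win64 home area kept at the bottom of the frame
RET_ADDR = 8       # return address pushed by the caller's callq
SLOT_GPR = 8       # one pushq
SLOT_XMM = 16      # one movaps spill slot


def layout_frame(callee_saved):
    """Return (pushed GPRs, subq amount, XMM offsets from %rsp)."""
    # GPRs are saved with pushq; XMMs cannot be pushed and get frame slots.
    gprs = [r for r in callee_saved if not r.startswith("xmm")]
    xmms = [r for r in callee_saved if r.startswith("xmm")]

    # Bytes already on the stack when subq runs: return address plus pushes.
    base = RET_ADDR + SLOT_GPR * len(gprs)

    # Space needed below that: home area plus one 16-byte slot per XMM.
    local = SHADOW_SPACE + SLOT_XMM * len(xmms)

    # Pad so that %rsp is 16-byte aligned after subq.
    pad = -(base + local) % 16
    frame = local + pad

    # XMM slots sit right above the home area, so they are 16-byte aligned
    # whenever %rsp is.
    offsets = {r: SHADOW_SPACE + SLOT_XMM * i for i, r in enumerate(xmms)}
    return gprs, frame, offsets


if __name__ == "__main__":
    gprs, frame, offsets = layout_frame(["rsi", "rdi", "xmm15", "xmm13", "xmm11"])
    print(gprs, frame, offsets)
```

With rsi, rdi, xmm15, xmm13 and xmm11 as input, this gives a subq of 88 and
XMM offsets 32, 48 and 64, which matches the listing. The 8 bytes of padding
come from the return address plus two pushes leaving %rsp 8 bytes off a
16-byte boundary.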
