[llvm-commits] [Review request][Win64] Use pushq/popq GPRs in function prologue/epilogue
Chris Lattner
clattner at apple.com
Sat Feb 26 18:36:04 PST 2011
Hi Takumi,
Evan said (verbally) that this looks OK to him. Go ahead and commit it, thanks!
-Chris
On Feb 21, 2011, at 6:23 PM, NAKAMURA Takumi wrote:
> Ping.
>
>
> On Fri, Feb 18, 2011 at 2:02 PM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
>> Ping.
>>
>> It would also resolve a potential issue if other x86-64 targets gained
>> new calling conventions that preserve XMM registers.
>>
>> ...Takumi
>>
>>
>> On Mon, Feb 7, 2011 at 8:49 PM, NAKAMURA Takumi <geek4civic at gmail.com> wrote:
>>> Anton,
>>>
>>> I have attached an updated patch, refined and with comments added. Thank you.
>>> It is at: https://github.com/chapuni/LLVM/commit/8da7419f8170e5a11063eb8e992ecb1ad40f9f6a
>>>
>>>
>>> On GitHub, Anton wrote:
>>>> View Commit: https://github.com/chapuni/LLVM/commit/d719f1b1ba5f01823e00904c1599adf356089a1d
>>>
>>>> This will pessimize non-win64 targets. Basically, the problem is that you cannot use pushq/popq for high xmm regs, which are callee-saved on win64 only (but not on linux / darwin, for example).
>>>
>>> I thought the patch did not affect non-win64 targets, but the check for
>>> whether regs are GPRs or not might have been dubious.
>>> I rewrote the checking expressions.
>>>
>>> I am sorry if I missed your point.
>>>
>>>> Another problem is function prologue emission. Have you verified that you always have a proper stack frame? Even in the case when high xmm regs (xmm5, etc.) are spilled?
>>>
>>> As long as the spiller emits [pushq GPRs...] and then [mov xmm to fp],
>>> and the restorer emits [mov xmm from fp] and then [popq GPRs],
>>> emitPrologue() and emitEpilogue() will adjust %rsp to the proper place.
>>> And I expect the spiller to place XMMs in i128-aligned (16-byte) slots.
>>>
>>> ; for example
>>> define void @foo() nounwind {
>>> entry:
>>>   tail call void (...)* @bar() nounwind
>>>   tail call void asm sideeffect "nop", "~{si},~{di},~{xmm13},~{xmm11},~{xmm15},~{dirflag},~{fpsr},~{flags}"() nounwind
>>>   ret void
>>> }
>>> declare void @bar(...)
>>>
>>> #### -mtriple=x86_64-mingw32
>>> foo:
>>>         pushq   %rsi
>>>         pushq   %rdi
>>>         subq    $88, %rsp
>>>         movaps  %xmm15, 32(%rsp)        # 16-byte Spill
>>>         movaps  %xmm13, 48(%rsp)        # 16-byte Spill
>>>         movaps  %xmm11, 64(%rsp)        # 16-byte Spill
>>>         callq   bar
>>>         #APP
>>>         nop
>>>         #NO_APP
>>>         movaps  64(%rsp), %xmm11        # 16-byte Reload
>>>         movaps  48(%rsp), %xmm13        # 16-byte Reload
>>>         movaps  32(%rsp), %xmm15        # 16-byte Reload
>>>         addq    $88, %rsp
>>>         popq    %rdi
>>>         popq    %rsi
>>>         ret
>>>
>>> Also, I have verified a 3-stage build of clang.
>>> (x64-clang can build and test clang and llvm)
>>>
>>>
>>> ...Takumi
>>>
>>
>
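
The frame arithmetic in the quoted x86_64-mingw32 listing can be checked by
hand; the following is only a rough sketch of that check, not part of the
patch. The function names are hypothetical. It assumes %rsp is 16-byte
aligned just before the call into foo (so the return address leaves it 8
bytes off), and that the lowest 32 bytes of the frame are the Win64 home
area reserved for the call to bar.

```python
# Hypothetical helpers (not LLVM code) that re-check the numbers in the
# quoted listing: 2 pushq, subq $88, spills at 32/48/64(%rsp).

RETURN_ADDRESS_BYTES = 8   # pushed by the caller's callq
GPR_PUSH_BYTES = 8         # each pushq of a 64-bit GPR
XMM_SLOT_BYTES = 16        # each movaps spill slot
HOME_AREA_BYTES = 32       # assumption: Win64 home area for the callee


def stack_depth(num_pushes, frame_size):
    """Bytes between the caller's aligned %rsp and %rsp after the prologue."""
    return RETURN_ADDRESS_BYTES + num_pushes * GPR_PUSH_BYTES + frame_size


def spill_slots_ok(num_pushes, frame_size, slot_offsets):
    """True if %rsp and every XMM slot are 16-byte aligned and inside the frame."""
    if stack_depth(num_pushes, frame_size) % 16 != 0:
        return False  # %rsp itself would be misaligned for movaps
    for offset in slot_offsets:
        if offset % 16 != 0:
            return False  # slot not i128-aligned relative to %rsp
        if offset < HOME_AREA_BYTES:
            return False  # slot would overlap the home area
        if offset + XMM_SLOT_BYTES > frame_size:
            return False  # slot would run past the subq allocation
    return True


if __name__ == "__main__":
    # Values taken from the listing above.
    print(stack_depth(2, 88))                    # 8 + 16 + 88
    print(spill_slots_ok(2, 88, [32, 48, 64]))
```

With the values from the listing, the depth is 8 + 16 + 88 = 112 bytes, a
multiple of 16, so the movaps spills at 32, 48 and 64(%rsp) are aligned.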
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits