[PATCH] D18046: [X86] Providing correct unwind info in function epilogue
David Kreitzer via llvm-commits
llvm-commits at lists.llvm.org
Mon May 2 15:41:21 PDT 2016
DavidKreitzer added a comment.
I think we want to move in a direction that makes it easier to perform optimizations that affect the CFI between X86FrameLowering and this late pass. For example, we cannot schedule the pushes generated by the X86CallFrameOptimization pass without moving the CFI along with each push. As a result, we generate very poor code in cases like this one, where the push operands get in the way of the outgoing inreg arguments:
  target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
  target triple = "i386-unknown-linux-gnu"

  declare i32 @f1(i32 inreg, i32 inreg, i32 inreg, i32, i32)

  define i32 @f2(i32 inreg %a, i32 inreg %b, i32 inreg %c, i32 %d, i32 %e) nounwind {
  entry:
    %call = tail call i32 @f1(i32 inreg 1, i32 inreg 2, i32 inreg 3, i32 %a, i32 %b)
    %add = add nsw i32 %call, 1
    ret i32 %add
  }
LLVM generates this:
  f2:
    pushl %edi
    pushl %esi
    pushl %eax
    movl %edx, %esi
    movl %eax, %edi
    subl $8, %esp
    movl $1, %eax
    movl $2, %edx
    movl $3, %ecx
    pushl %esi
    pushl %edi
    calll f1
    addl $16, %esp
    incl %eax
    addl $4, %esp
    popl %esi
    popl %edi
    retl
icc generates much cleaner code (gcc is similar):
  f2:
    subl $20, %esp
    movl $3, %ecx
    pushl %edx
    pushl %eax
    movl $1, %eax
    movl $2, %edx
    call f1
    incl %eax
    addl $28, %esp
    ret
We would also like the ability to accumulate the stack-cleanup "add %esp" instructions for a series of calls like this:
  target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
  target triple = "i386-unknown-linux-gnu"

  declare void @C(i32, i32, i32, i32)

  define void @F() nounwind {
  entry:
    tail call void @C(i32 1, i32 2, i32 3, i32 4)
    tail call void @C(i32 5, i32 6, i32 7, i32 8)
    tail call void @C(i32 9, i32 10, i32 11, i32 12)
    ret void
  }
Instead of what is currently generated:
  F:
    subl $12, %esp
    pushl $4
    pushl $3
    pushl $2
    pushl $1
    calll C
    addl $16, %esp
    pushl $8
    pushl $7
    pushl $6
    pushl $5
    calll C
    addl $16, %esp
    pushl $12
    pushl $11
    pushl $10
    pushl $9
    calll C
    addl $28, %esp
    retl
we can eliminate both "addl $16, %esp" instructions and fold them into the last %esp adjustment, which becomes "addl $60, %esp" (28 + 16 + 16). This is simpler to do without separate CFI and stack-adjust instructions.
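To make the arithmetic concrete, here is a minimal sketch in Python of the accumulation described above. It is a toy model over assembly text, not LLVM code: the helper names (accumulate_cleanups, net_esp_change) are hypothetical, and it assumes a single straight-line block with no CFI, no frame pointer, and a callee that leaves argument cleanup to the caller.

```python
import re

# Match fixed stack adjustments such as "addl $16, %esp" / "subl $12, %esp".
ADD_ESP = re.compile(r"^addl\s+\$(\d+),\s*%esp$")
SUB_ESP = re.compile(r"^subl\s+\$(\d+),\s*%esp$")

# Body of F as currently generated (copied from the listing above).
F_CURRENT = [
    "subl $12, %esp",
    "pushl $4", "pushl $3", "pushl $2", "pushl $1",
    "calll C",
    "addl $16, %esp",
    "pushl $8", "pushl $7", "pushl $6", "pushl $5",
    "calll C",
    "addl $16, %esp",
    "pushl $12", "pushl $11", "pushl $10", "pushl $9",
    "calll C",
    "addl $28, %esp",
    "retl",
]


def accumulate_cleanups(insns):
    """Fold every 'addl $N, %esp' in a straight-line block into the last one."""
    adds = [i for i, insn in enumerate(insns) if ADD_ESP.match(insn)]
    if len(adds) < 2:
        return list(insns)  # nothing to accumulate
    total = sum(int(ADD_ESP.match(insns[i]).group(1)) for i in adds)
    last = adds[-1]
    out = []
    for i, insn in enumerate(insns):
        if i == last:
            out.append(f"addl ${total}, %esp")  # single combined cleanup
        elif i not in adds:
            out.append(insn)  # earlier cleanups are dropped
    return out


def net_esp_change(insns):
    """Net change to %esp in bytes; a call is treated as net zero."""
    delta = 0
    for insn in insns:
        if insn.startswith("pushl"):
            delta -= 4
        elif SUB_ESP.match(insn):
            delta -= int(SUB_ESP.match(insn).group(1))
        elif ADD_ESP.match(insn):
            delta += int(ADD_ESP.match(insn).group(1))
    return delta


F_ACCUMULATED = accumulate_cleanups(F_CURRENT)
print("\n".join(F_ACCUMULATED))
```

The model only checks that %esp is balanced at the return. A real transform would also have to keep the unwind information consistent at every call site, which is the part that separate CFI instructions make harder.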
To put this into a concrete proposal, I would suggest making this new pass responsible not only for generating proper epilog CFI but also for generating the CFI for simple stack adjusts. That would help enable optimizations like the ones above, and it would also mean that transformations that generate fixed stack adjusts no longer need to worry about generating CFI themselves. There have recently been at least three patches that added CFI for transforms involving stack adjusts (see http://reviews.llvm.org/D13767, http://reviews.llvm.org/D14021, http://reviews.llvm.org/D18246), each with its own logic for adding the CFI and deciding whether or not it is necessary.
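As a rough illustration of what "generating the CFI for simple stack adjusts" in one late pass could look like, here is a toy Python model. It is not the proposed LLVM implementation: cfa_delta and annotate_cfi are hypothetical names, the input is plain assembly text rather than MachineInstrs, and it assumes the CFA is computed from %esp (no frame pointer) in a single straight-line block. The only real construct it relies on is the assembler directive .cfi_adjust_cfa_offset.

```python
import re

# Fixed stack adjustments of the form "addl $N, %esp" / "subl $N, %esp".
STACK_ADJ = re.compile(r"^(addl|subl)\s+\$(\d+),\s*%esp$")


def cfa_delta(insn):
    """Bytes by which the CFA offset from %esp grows after insn (0 if none)."""
    if insn.startswith("pushl"):
        return 4
    if insn.startswith("popl"):
        return -4
    m = STACK_ADJ.match(insn)
    if m:
        size = int(m.group(2))
        # subl moves %esp away from the CFA, addl moves it back.
        return size if m.group(1) == "subl" else -size
    return 0


def annotate_cfi(insns):
    """Insert a .cfi_adjust_cfa_offset after every fixed stack adjustment."""
    out = []
    for insn in insns:
        out.append(insn)
        delta = cfa_delta(insn)
        if delta:
            out.append(f".cfi_adjust_cfa_offset {delta}")
    return out


# Call sequence taken from the LLVM-generated f2 listing earlier in this comment.
CALL_SEQ = [
    "subl $8, %esp",
    "pushl %esi",
    "pushl %edi",
    "calll f1",
    "addl $16, %esp",
]

ANNOTATED = annotate_cfi(CALL_SEQ)
print("\n".join(ANNOTATED))
```

Because the directives are derived from the final instruction order, an earlier pass could reorder or merge the pushes and adds without having to carry CFI along with them.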
Repository:
rL LLVM
http://reviews.llvm.org/D18046