[PATCH] D18573: [X86] Enable call frame optimization ("mov to push") not only for optsize (PR26325)

Tue Mar 29 16:44:57 PDT 2016

hans added inline comments.

================
Comment at: test/CodeGen/X86/win32-seh-nested-finally.ll:46
@@ -45,3 +45,3 @@
 ; CHECK: movl $1, -[[state]](%ebp)
-; CHECK: movl $1, (%esp)
+; CHECK: pushl $1
 ; CHECK: calll _f
----------------
joerg wrote:
> The changes here look suspicious, doesn't the code need to restore %esp before the final popl?
This does look weird. Reid, is there something magic about these invokes, or is mov-to-push broken here?

================
Comment at: test/CodeGen/X86/xmulo.ll:28
@@ -29,1 +27,3 @@
+; CHECK:  pushl $0
+; CHECK:  pushl $0
 
----------------
joerg wrote:
> Each immediate push is two Bytes, materializing $0 as register is two Bytes as well, but a register push is one Byte. So for three pushes, a shorter sequence is actually:
> 
> ```
> xorl %eax, %eax
> pushl %eax
> pushl %eax
> pushl %eax
> ```
Yes, we can probably be more efficient here. See also PR26330, where we get this wrong the other way around.
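
For reference, a rough sketch of the size arithmetic behind the suggestion above. The byte counts are the standard 32-bit encodings (`pushl $imm8` is 6A ib, `xorl %eax, %eax` is 31 C0, `pushl %eax` is 50); the function names are made up for illustration, and the sketch assumes a scratch register is free and that clobbering EFLAGS with the xor is acceptable.

```python
# Encoded sizes in bytes for 32-bit x86, assuming the immediate fits in 8 bits.
PUSH_IMM8 = 2  # pushl $0          -> 6A 00
XOR_REG = 2    # xorl %eax, %eax   -> 31 C0
PUSH_REG = 1   # pushl %eax        -> 50


def imm_push_bytes(n):
    """Bytes needed to push the same small immediate n times."""
    return n * PUSH_IMM8


def reg_push_bytes(n):
    """Bytes needed to zero a register once and push it n times."""
    return XOR_REG + n * PUSH_REG


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, imm_push_bytes(n), reg_push_bytes(n))
```

Under these assumptions the two sequences tie at two pushes, and the register form is one byte shorter at three.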

================
Comment at: test/CodeGen/X86/zext-fold.ll:38
@@ -37,2 +37,3 @@
 ; CHECK: movzbl {{[0-9]+}}(%esp), [[REGISTER:%e[a-z]{2}]]
-; CHECK-NEXT: movl [[REGISTER]], 4(%esp)
+; CHECK: subl $8, %esp
+; CHECK-NEXT: pushl [[REGISTER]]
----------------
joerg wrote:
> Two register pushes are smaller than one subl?
The savings from the pushes offset the cost of the sub.

The sub seems unnecessary though. It would be nice to not emit it.
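
A rough sketch of the byte counts for just the lines in this hunk, as a sanity check rather than a full accounting. It assumes the register is %eax (the test only matches a pattern), uses the standard encodings for that case, and ignores any stack cleanup after the call; the constant names are made up for illustration.

```python
# Encoded sizes in bytes for 32-bit x86, assuming [[REGISTER]] is %eax.
MOV_REG_TO_ESP_DISP8 = 4  # movl %eax, 4(%esp) -> 89 44 24 04
SUB_ESP_IMM8 = 3          # subl $8, %esp      -> 83 EC 08
PUSH_REG = 1              # pushl %eax         -> 50

old_sequence = MOV_REG_TO_ESP_DISP8     # the single mov before the patch
new_sequence = SUB_ESP_IMM8 + PUSH_REG  # sub + push after the patch
without_sub = PUSH_REG                  # if the sub were not emitted

if __name__ == "__main__":
    print(old_sequence, new_sequence, without_sub)
```

Counting only these instructions, the sub plus push comes out the same size as the mov it replaces, and dropping the sub would leave a single one-byte push.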


http://reviews.llvm.org/D18573