[PATCH] [X86] A first stab at a heuristic to estimate the size impact for converting movs to pushes

Thu Feb 12 00:10:05 PST 2015

Thanks, Reid!

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:193-195
@@ +192,5 @@
+    } else {
+      // We can use pushes. First, account for the fixed costs.
+      // We'll need a add after the call.
+      Advantage -= 3;
+      // If we have to realign the stack, we'll also need and sub before
----------------
rnk wrote:
> Not if the calling convention is callee-pop. In fact, if the convention is callee-pop, using a reserved call frame requires a sub, which should give the 'mov' lowering a penalty.
> 
> Anyway, not a blocking issue, just a heuristic worth adding.
Right, will add a TODO here and get to that separately, thanks.
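
For reference, a rough sketch of how the fixed costs and a callee-pop adjustment could combine. This is not code from the patch: the `CallSite` struct, its field names, and `estimateAdvantage` are hypothetical, and the 3-byte bonus for callee-pop is an assumed magnitude chosen to mirror the other constants.

```cpp
// Hypothetical description of a call site; none of these names are from the patch.
struct CallSite {
  int NumPushes;      // number of argument stores that could become pushes
  bool CalleePop;     // callee cleans up the stack (e.g. stdcall-like conventions)
  bool NeedsRealign;  // stack must be realigned before the pushes
};

// Estimated size advantage, in bytes, of using pushes over movs.
// Positive means pushes are expected to be smaller.
constexpr int estimateAdvantage(CallSite C) {
  return 3 * C.NumPushes               // ~3 bytes saved per push
         - (C.NeedsRealign ? 3 : 0)    // extra stack adjustment before the pushes
         // Caller-cleanup: pushes need an add after the call (-3).
         // Callee-pop: no add is needed, and the mov lowering needs a sub,
         // so pushes get a bonus instead (assumed here to be +3).
         + (C.CalleePop ? 3 : -3);
}

static_assert(estimateAdvantage(CallSite{2, false, false}) == 3, "");
static_assert(estimateAdvantage(CallSite{1, true, false}) == 6, "");
```

With these assumed constants, a single-argument caller-cleanup call breaks even, while a callee-pop call favors pushes even with one argument.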

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:199
@@ +198,3 @@
+        Advantage -= 3;
+      // Now, for each push, we save ~3 bytes. For small constants, we actually,
+      // save more (up to 5 bytes), but 3 should be a good approximation.
----------------
rnk wrote:
> Can we motivate the 3 byte saving heuristic a bit more?
So, 3 is an average value that looks reasonable, although it may be a bit conservative.

It depends on two things:
a) The value being put on the stack (register, 8-bit integer, or >8-bit integer).
b) For a mov, the displacement k relative to %esp (0, 0 < k < 128, or k >= 128).
For pushes, the encoding sizes for the three options in (a) are 1/2/5 bytes.
For mov to (%esp), they are 3/7/7, for a difference of 2/5/2. But this can only happen once per call site.
For mov to k(%esp) with 0 < k < 128, which is probably the most common case, one additional byte is needed to encode k, so the sizes are 4/8/8, for a difference of 3/6/3.
For mov to k(%esp) with k >= 128, 4 bytes are used to encode k, so we have 7/11/11. This is probably fairly rare and can be ignored.

To me, looking at the numbers, 3 seems like a good bet, unless we want to special-case each of the above options. It won't be very precise (it also doesn't factor in the potential benefit of removing a mov by folding), but I'm not trying to be extremely precise here; I just want to avoid making "obviously wrong" decisions.
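
The byte counts above can be reproduced with a small model. This is only an illustrative sketch for 32-bit code: `ArgKind`, `pushSize`, `dispSize`, `movSize`, and `savings` are hypothetical names, not anything in the patch, and the model assumes every mov to the stack uses an opcode, ModRM, and SIB byte plus a 4-byte immediate for constants.

```cpp
// Kind of value being stored to the outgoing argument area (hypothetical names).
enum class ArgKind { Register, Imm8, Imm32 };

// Size in bytes of a 32-bit push of the given kind: 1/2/5.
constexpr int pushSize(ArgKind K) {
  return K == ArgKind::Register ? 1 : K == ArgKind::Imm8 ? 2 : 5;
}

// Bytes needed to encode the displacement from %esp:
// none for 0, one if it fits in a signed byte, four otherwise.
constexpr int dispSize(int Disp) {
  return Disp == 0 ? 0 : (Disp >= -128 && Disp <= 127) ? 1 : 4;
}

// Size in bytes of `movl <value>, Disp(%esp)`.
// Opcode + ModRM + SIB is 3 bytes; any immediate takes a full 4 bytes,
// since a 32-bit mov to memory has no short 8-bit immediate form.
constexpr int movSize(ArgKind K, int Disp) {
  return 3 + (K == ArgKind::Register ? 0 : 4) + dispSize(Disp);
}

// Bytes saved by replacing the mov with a push.
constexpr int savings(ArgKind K, int Disp) {
  return movSize(K, Disp) - pushSize(K);
}

static_assert(savings(ArgKind::Register, 0) == 2, "");
static_assert(savings(ArgKind::Imm8, 4) == 6, "");
static_assert(savings(ArgKind::Imm32, 4) == 3, "");
```

In this model, the per-push saving is 3 bytes for registers and wide constants at small nonzero displacements, which is the case the constant 3 is meant to approximate.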

================
Comment at: test/CodeGen/X86/movtopush.ll:298-299
@@ +297,4 @@
+; NORMAL-NEXT: movl    $1, (%esp)
+; NORMAL-NEXT: movl    $2, %eax
+; NORMAL-NEXT: calll _inreg
+; NORMAL-NEXT: movl    $8, 12(%esp)
----------------
rnk wrote:
> I'd really like to be able to convert to pushes for __thiscall methods, which effectively have one inreg parameter.
I agree.
The next two things I want to do here are remove the push <fi> restrictions, and support __thiscall.

http://reviews.llvm.org/D7561
