[PATCH] D14971: X86: Emit smaller code for moving 8-bit immediates

Hans Wennborg via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 24 15:30:21 PST 2015


hans created this revision.
hans added reviewers: majnemer, mkuper, DavidKreitzer.
hans added a subscriber: llvm-commits.

David and I took a look at how different compilers lower "move -1 into eax" when optimizing for size.

Clang will emit "movl $-1, %eax" (5 bytes), GCC and MSVC use "orl $-1, %eax" (3 bytes), ICC uses "pushl $-1; popl %eax" (3 bytes). A fourth alternative would be "xor %eax, %eax; decl %eax" (3 bytes).
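As a quick sanity check of the byte counts above, here is a small sketch that lists hand-assembled encodings for the four sequences. It assumes 32-bit mode and %eax as the destination; the table and the helper name `encoded_sizes` are illustrative and not part of the patch.

```python
# Hand-assembled encodings, assuming 32-bit mode and %eax as destination.
# The dictionary and helper below are illustrative, not taken from the patch.
ENCODINGS = {
    "movl $-1, %eax": bytes([0xB8, 0xFF, 0xFF, 0xFF, 0xFF]),    # B8+rd imm32
    "orl $-1, %eax": bytes([0x83, 0xC8, 0xFF]),                 # 83 /1 ib
    "pushl $-1; popl %eax": bytes([0x6A, 0xFF, 0x58]),          # 6A ib; 58+rd
    "xorl %eax, %eax; decl %eax": bytes([0x31, 0xC0, 0x48]),    # 31 /r; 48+rd
}


def encoded_sizes():
    """Return a mapping from assembly text to encoded size in bytes."""
    return {asm: len(code) for asm, code in ENCODINGS.items()}


if __name__ == "__main__":
    for asm, size in encoded_sizes().items():
        print(f"{size} bytes: {asm}")
```

Note that the one-byte `decl` form (opcode 48+rd) only exists in 32-bit mode. In 64-bit mode those opcodes are REX prefixes, so `decl %eax` needs the two-byte form FF C8 and the XOR/DEC sequence grows to 4 bytes.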

A problem with the OR approach is that there's a dependency on the previous value in %eax. The DEC approach avoids that, but maybe DEC is slow on some micro-architectures?
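To make the dependency point concrete, the following sketch models both sequences on a 32-bit value. The function names are made up for illustration. The OR form always produces the same result, but the instruction still has to read the old register contents; the XOR/DEC form starts from a zeroed register, and `xor reg, reg` is commonly treated as a dependency-breaking zeroing idiom (an assumption about the target core, not something this model can show).

```python
MASK32 = 0xFFFFFFFF


def or_minus_one(old_eax):
    # orl $-1, %eax: the old value is an input operand,
    # even though the result is always all-ones.
    return (old_eax | MASK32) & MASK32


def xor_then_dec(old_eax):
    # xorl %eax, %eax: the register is zeroed regardless of its old contents.
    zeroed = old_eax ^ old_eax
    # decl %eax: subtract one with 32-bit wraparound.
    return (zeroed - 1) & MASK32


if __name__ == "__main__":
    for old in (0, 1, 0x12345678, MASK32):
        print(hex(old), hex(or_minus_one(old)), hex(xor_then_dec(old)))
```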

ICC's PUSH/POP approach avoids the dependency problem and has the nice property that it works with all 8-bit immediates under sign extension. However, potentially touching memory seems scary, and IACA says it has a latency of 6 cycles. Is it really that slow, or is this because there's something about the stack that IACA doesn't model? I tried to micro-benchmark the difference between MOV and PUSH/POP on my machine, and the difference was in the noise.
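The "works with all 8-bit immediates under sign extension" property can be sketched as a size-driven choice between the two encodings. This is only an illustration under assumptions: 32-bit mode, register numbers as in the hardware encoding (0 for %eax), and hypothetical helper names. It is not the selection logic in the attached patch, and it ignores other special cases such as zero, where `xorl` is shorter still.

```python
import struct


def fits_in_int8(imm):
    """True if imm survives a round trip through an 8-bit sign extension."""
    return -128 <= imm <= 127


def encode_mov(imm, reg=0):
    # movl $imm, %reg: opcode B8+rd followed by a little-endian imm32.
    return bytes([0xB8 + reg]) + struct.pack("<I", imm & 0xFFFFFFFF)


def encode_push_pop(imm, reg=0):
    # pushl $imm8 (6A ib) sign-extends the byte; popl %reg is 58+rd.
    if not fits_in_int8(imm):
        raise ValueError("immediate does not fit in a sign-extended byte")
    return bytes([0x6A, imm & 0xFF, 0x58 + reg])


def smallest_move(imm, reg=0):
    """Pick the shorter of the two encodings for a signed 32-bit immediate."""
    if fits_in_int8(imm):
        return encode_push_pop(imm, reg)
    return encode_mov(imm, reg)


if __name__ == "__main__":
    for imm in (-1, 127, 128, -129):
        print(imm, smallest_move(imm).hex())
```

With these helpers, every immediate in the range -128 to 127 gets the 3-byte PUSH/POP form and everything else falls back to the 5-byte MOV.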

Since ICC emits this code, it would be great if someone from Intel could comment on the size/speed trade-off here.

I'm attaching my attempt at implementing this in LLVM. Please let me know what you think. Suggestions for more reviewers are also welcome.

http://reviews.llvm.org/D14971

Files:
  lib/Target/X86/X86InstrCompiler.td
  lib/Target/X86/X86InstrInfo.cpp
  test/CodeGen/X86/mov-32imm-sext-i8.ll
  test/CodeGen/X86/movtopush.ll
  test/CodeGen/X86/powi.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D14971.41090.patch
Type: text/x-patch
Size: 6405 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151124/e4838405/attachment.bin>

