[PATCH] D14971: X86: Emit smaller code for moving 8-bit immediates

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 30 19:23:57 PST 2015


silvas added a comment.

My guess is that ICC prefers the push+pop because it doesn't touch flags and so can be easily inserted anywhere.
Note that most x86 CPUs nowadays have sideband stack tracking in the instruction decoder to break the stack-pointer dependencies between push/pop instructions; e.g. in push $-1; pop %rax; push $-1; pop %rdx there won't be a dependency between the two pairs of instructions.

Still, unless Intel uarches are doing something funky to optimize this in the decoder, the store-forwarding latency will make this sequence quite poor compared to xor+dec (as IACA points out).
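
For reference, a rough side-by-side of the sequences being compared, for materializing -1 in a 64-bit register. This is only a sketch: it assumes 64-bit mode and AT&T syntax, the byte encodings are the standard ones rather than output from the patch, and the exact sequences the patch emits are not shown in this message.

```
# Option A: push/pop. Leaves EFLAGS untouched, but the value round-trips
# through memory, so the pop waits on store forwarding from the push.
push $-1            # 6a ff     (2 bytes): sign-extended imm8 stored to the stack
pop  %rax           # 58        (1 byte):  reloaded from the stack, rax = -1

# Option B: xor/dec. Stays in registers, but clobbers EFLAGS.
xor  %eax, %eax     # 31 c0     (2 bytes): zeroes all of rax
dec  %rax           # 48 ff c8  (3 bytes): rax = -1

# Baseline: plain move of a sign-extended 32-bit immediate.
mov  $-1, %rax      # 48 c7 c0 ff ff ff ff  (7 bytes)
```

So push+pop is the smallest at 3 bytes and is flag-neutral, while xor+dec is 5 bytes but avoids the store/load round trip. If only the low 32 bits matter, dec %eax (ff c8) brings Option B down to 4 bytes.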


http://reviews.llvm.org/D14971




