[PATCH] D35340: [x86] use more shift or LEA for select-of-constants

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 13 08:45:41 PDT 2017


spatel added inline comments.


================
Comment at: test/CodeGen/X86/memcmp.ll:31
 ; X86-NOSSE-NEXT:  .LBB0_1:
-; X86-NOSSE-NEXT:    movl $1, %eax
-; X86-NOSSE-NEXT:    jne .LBB0_4
-; X86-NOSSE-NEXT:  .LBB0_3:
-; X86-NOSSE-NEXT:    xorl %eax, %eax
+; X86-NOSSE-NEXT:    movb %cl, %al
+; X86-NOSSE-NEXT:    leal -1(%eax,%eax), %eax
----------------
spatel wrote:
> zvi wrote:
> > A write to AL followed by a read from EAX may cause a partial register stall, or a lesser penalty if the processor supports special 'merge register parts' micro-ops (which is also undesirable).
> > This seems to be a recurring pattern as the tests show.
> Good point. FWIW, I think the memcmp diffs will disappear if D34904 is accepted. But given that this is a general problem, the answer might be in adjusting the x86-fixup-setcc pass? That's where the 'movzwl' is replaced by 'xorl' IIUC.
On second thought, the whole point of that pass is to avoid partial register stalls, so if a CPU still has a problem even with the leading xor to clear the register, then should we just avoid this transform completely?
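
For reference, the arithmetic behind the new 'leal -1(%eax,%eax), %eax' in the diff above can be sketched in C. This is only an illustration, assuming the condition byte is already 0 or 1 and the select is between the constants 1 and -1; the function names are made up for this example and are not from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: a select between the constants 1 and -1.
   (Hypothetical helper; not part of the patch.) */
static int32_t select_ref(uint8_t cond) {
    return cond ? 1 : -1;
}

/* Branchless form: widen cond to 32 bits, then compute 2*cond - 1.
   That is the value 'leal -1(%eax,%eax), %eax' produces when eax
   holds 0 or 1. Assumes cond is 0 or 1. */
static int32_t select_lea(uint8_t cond) {
    uint32_t wide = cond;              /* full-width value, no stale upper bits */
    return (int32_t)(wide + wide) - 1; /* base + index + displacement */
}

/* Checks both possible inputs against the reference form. */
static void check_select_lea(void) {
    assert(select_lea(0) == select_ref(0));
    assert(select_lea(1) == select_ref(1));
}
```

The explicit widening in select_lea() stands in for clearing the full register before the byte write. Without it, the LEA reads all of EAX after only AL was written, which is the partial-register concern raised above.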


https://reviews.llvm.org/D35340




