[PATCH] D21774: [X86] Transform setcc + movzbl into xorl + setcc

David Kreitzer via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 28 15:09:01 PDT 2016

DavidKreitzer added a comment.

Hi Michael,

Thanks for working on this! Ultimately I am okay with this approach, but I have a few high level comments.

(1) The advantage of the pseudo-SETcc instruction approach over this approach is that the dependence of the SETcc on the XOR is explicit. In your transformed code, I think there is nothing to prevent this:

  v1 = MOV32r0
  <CC setter>
  v2 = SETcc
  v3 = INSERT_SUBREG(v1, v2)

from being transformed into this

  <CC setter>
  v2 = SETcc
  v1 = MOV32r0
  v3 = INSERT_SUBREG(v1, v2)

I'd expect this to be a reasonably common occurrence, since the RA might choose to regenerate the MOV32r0 at its use to save on register pressure. (Regenerating the MOV32r0 eliminates the register conflicts with the operands of the CC setter.) And this would, ironically, have the effect of using an extra register & an extra movb instruction since v1 and v2 could no longer be assigned the same register.

(2) Do you have any performance data on the patch?

(3) There is some advantage to generating the XOR/SETcc idiom even for SETcc instructions that don't get zero extended just for the purpose of eliminating the false dependence. I know that carries a code size cost and has some performance risk, but it is at least worth experimenting with. That's something for later, though. We should go after the "easy" cases first like you've done in this patch. We might also choose to handle these harder cases as part of the ExecutionDepsFix pass and not in this new pass.


Comment at: lib/Target/X86/X86FixupSetCC.cpp:1
@@ +1,2 @@
+//===---- X86FixupSetCC.cpp - optimize usage of LEA instructions ----------===//
Please fix the cut & paste error here.

Comment at: test/CodeGen/X86/fp128-compare.ll:11
@@ -10,3 +10,3 @@
 ; CHECK:       callq __gttf2
-; CHECK:       setg  %al
-; CHECK:       movzbl %al, %eax
+; CHECK:       xorl  %ecx, %ecx
+; CHECK:       setg  %cl
This looks like a problem ...


More information about the llvm-commits mailing list