[PATCH] D21774: [X86] Transform setcc + movzbl into xorl + setcc
David Kreitzer via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 28 15:09:01 PDT 2016
DavidKreitzer added a comment.
Hi Michael,
Thanks for working on this! Ultimately I am okay with this approach, but I have a few high level comments.
(1) The advantage of the pseudo-SETcc instruction approach over this approach is that the dependence of the SETcc on the XOR is explicit. In your transformed code, I think there is nothing to prevent this:
v1 = MOV32r0
<CC setter>
v2 = SETcc
v3 = INSERT_SUBREG(v1, v2)
from being transformed into this
<CC setter>
v2 = SETcc
v1 = MOV32r0
v3 = INSERT_SUBREG(v1, v2)
I'd expect this to be a reasonably common occurrence, since the RA might choose to regenerate the MOV32r0 at its use to save on register pressure. (Regenerating the MOV32r0 eliminates the register conflicts with the operands of the CC setter.) And this would, ironically, have the effect of using an extra register & an extra movb instruction since v1 and v2 could no longer be assigned the same register.
(2) Do you have any performance data on the patch?
(3) There is some advantage to generating the XOR/SETcc idiom even for SETcc instructions that don't get zero extended just for the purpose of eliminating the false dependence. I know that carries a code size cost and has some performance risk, but it is at least worth experimenting with. That's something for later, though. We should go after the "easy" cases first like you've done in this patch. We might also choose to handle these harder cases as part of the ExecutionDepsFix pass and not in this new pass.
Thanks,
Dave
================
Comment at: lib/Target/X86/X86FixupSetCC.cpp:1
@@ +1,2 @@
+//===---- X86FixupSetCC.cpp - optimize usage of LEA instructions ----------===//
+//
----------------
Please fix the cut & paste error here.
================
Comment at: test/CodeGen/X86/fp128-compare.ll:11
@@ -10,3 +10,3 @@
; CHECK: callq __gttf2
-; CHECK: setg %al
-; CHECK: movzbl %al, %eax
+; CHECK: xorl %ecx, %ecx
+; CHECK: setg %cl
----------------
This looks like a problem ...
http://reviews.llvm.org/D21774
More information about the llvm-commits
mailing list